[10:35:06] errand+lunch
[10:58:34] lunch
[14:12:25] dcausse Trey314159 gehel some results from my MLR experiments https://phabricator.wikimedia.org/T383048#10500714. Sadly, not what I was hoping for.
[14:47:52] o/
[14:47:55] gmodena: thanks! looking
[14:51:17] gmodena: I'm not sure I understand why you have separate metrics for easy & hard queries?
[14:53:51] it feels like you trained easy & hard queries separately
[14:54:59] mean ndcg is still very high on hard queries
[14:58:39] we can discuss all this tomorrow
[16:01:59] dcausse I'm writing things up in a notebook right now. I'll circulate a doc tomorrow
[16:02:18] all queries are trained together
[16:02:28] ack
[16:03:15] and the overall ndcg (cross folded) is stable
[16:03:25] i just kept track of perf on the two different query types for reporting purposes. I wanted to see if the weighting step would have an impact on either of them
[16:05:06] tbh I'm surprised by how well such an un-optimized model performed to begin with.
[16:07:19] very possible that user behavior is binary: if I find what I want in the top-3 I click, otherwise I abandon
[16:09:01] abandoned queries won't be in the training set, so you only get queries where users clicked on something at a decent position
[16:12:58] dcausse ah! true
[16:14:01] but the easy query segregation clearly shows that we're almost perfect on them (0.99 ndcg)
[16:14:45] yep. mjolnir does use title_match (and derived) features. They def help
[16:16:23] (and fwiw, exploring feature importance of the weekly trained model, I also see that popularity_score and incoming_links are the 1-2 most important features post tuning)
[16:40:51] yes I think these two are pretty important, Erik did some analysis on the features by looking at all trained models by mjolnir since 2020 maybe? was pretty insightful to understand what features are reliably interesting
[16:41:27] can't find a paste with this (that's usually where he uploads this stuff), might be somewhere in his notebooks perhaps
[16:42:11] heading out, back later tonight
[16:48:09] dcausse ack!
[16:49:14] dcausse let me know if I need to merge the SUP schema MR (not sure about permissions on that gitlab repo)
[18:31:52] lunch, back in ~40
[19:12:16] back
[20:29:06] gmodena: yes, just added a small on the dev schema but whenever you get some time, would love to have this patch merged
[20:29:31] (I think I can merge on this repo)
[20:30:20] yes I do, just merged it