[06:50:08] my electric meter is about to be changed, I'll have no power for about 1 hour
[07:17:29] back
[08:25:13] looking at the traffic on wdqs2021 I think it's still depooled
[08:25:29] cc ryankemper, inflatador ^
[09:17:50] ebernhardson: could you prepare an update for the overall Search work? Same link, slide 13? Off the top of my head: weighted tags, SUP on private wikis, Language work, mess around Elasticsearch vs OpenSearch, ...)
[09:18:06] Or is there anyone else willing?
[09:18:40] Slides at https://docs.google.com/presentation/d/1xL5LVXux33g_EdWsYDIERBpt2fqQ2u38VEEp5448c1I/edit#slide=id.g2e747bf5a7e_0_23
[09:53:24] lunch
[13:14:42] /moti wave
[13:14:42] dcausse 👀
[13:15:46] o/
[13:18:34] dcausse just pooled wdqs2021
[13:19:19] inflatador: do you if it's related to the data-reload ran a couple days ago and if the cookbook might be the culprit?
[13:19:24] *know
[13:20:15] dcausse let me check which host I ran the reload on, I thought it was a full graph host I used but could be wrong
[13:21:40] I see "15:58 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T364077, testing new flag; this should succeed) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs2021.codfw.wmnet, repooling both afterwards" from https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[13:21:40] T364077: Adapt the wdqs data-transfer cookbook to operate with federated subgraphs - https://phabricator.wikimedia.org/T364077
[13:22:09] dcausse ACK, that must be it...I ran the reload against `wdqs2020` fwiw
[13:22:16] ok
[13:23:35] I didn't get to the categories reload on wdqs-categories1001 yesterday, looking at it now
[13:23:43] thanks!
[13:57:18] OK, categories reload is started. We'll see how long it takes in Ganeti
[14:17:39] \o
[14:18:10] gehel: i'll get that done today, it's for monday right?
[14:21:37] .o/
[14:21:49] quick break, back in ~20
[14:23:56] o/
[14:42:49] hmm, will need to look how the host didn't get re-pooled
[15:14:57] sorry, been back awhile
[15:23:39] ebernhardson: yep, it's for Monday
[15:49:49] categories reload still going...
[16:01:43] dcausse does `reloadDCAT-AP.sh` normally use the default blazegraph instance for the host, or categories? Just wondering as I see `:9990` in the script and I think that's the main instance port?
[16:02:23] inflatador: lemme check but 9999 IIRC should be the main blazegraph instance
[16:03:03] 9990 should be categories
[16:03:29] OK, I must have gotten those mixed up
[16:04:08] categories reload is done...took almost exactly 2 hrs. I'll check the dashboards but I don't think memory went higher than ~10GB
[16:04:46] also reloadDCAT is not the categories reload script but the dcat namespace which is yet another graph...
[16:06:00] but for simplicity I think we should consider this as part of the categories endpoint (it's served from the same blazegraph instance as categories)
[16:06:16] OK, that was my next question
[16:06:26] inflatador: what's the size of the journal in the end?
[16:07:15] 26G ... same as last time I did the reload
[16:07:24] ok
[16:08:20] if blazegraph did not blow up during the reload I guess that's enough resources?
[16:09:28] Yeah, although I'm a little surprised it didn't get up into the higher ranges we've seen on the dashboard. I guess because it didn't have to service traffic at the same time it was loading?
[16:11:15] inflatador: yes most probably only querying the graph will actually start using some mem
[16:36:24] hmm, now that we have some metrics to look at should try and deploy a new ML model next week, hasn't been updated too recently. Would also be interesting to AB test old-model vs new-model vs no-model, but not sure we have the analysis portions needed. Maybe should test if at least interleaved testing still works
[16:38:34] good idea
[16:40:27] ouch, much older than i thought :S 20220421
[16:40:45] maybe we need a ticket-generator like used for index-too-old
[16:45:10] or there is the crazy idea to push the current model name into metastore from mjolnir-bulk-daemon and have the instances query/cache the current model name
[16:45:42] i suppose model aliases on the elasticsearch side would be a nice way to avoid that, but perhaps for another day :P
[16:46:09] lunch, back in ~90
[16:48:40] model aliases might be nice indeed but I bet it might require to make them available in the cluster state somehow, something that might be feasible from plugin
[16:48:55] *not* be feasible from a plugin
[16:50:22] elastic added a model service in elastic-core when building their ltr implementation, might be interesting to look at what they've done
[16:50:36] hmm, oh right. Index aliases are cluster state. Doing that from the plugin would require similar
[16:50:49] I wonder what the limitations are on reading their AGPL code (once released) and re-using the ideas
[16:50:52] in non-AGPL code
[16:51:09] yes I wonder that too :)
[16:51:12] My suspicion is we aren't supposed to, i've kept my local copy of elasticsearch code at 7.10 to avoid looking at it
[18:01:30] random notes from abandoned queries...user searches for 'mr melody man bjørn Skifs' then 'mr melodi man bjørn skifs', then abandons instead of searching for plain 'Björn Skifs' which has its own page (on no.wikipedia.org)
[18:01:44] users maybe not as smart as we assume :P
[18:02:07] or i suppose, they aren't expecting a perhaps historical style of search where all words are considered important
[18:47:49] back
[19:04:28] * ebernhardson for some reason thought there was a list of wikis sorted by pageviews somewhere but not finding it...can always calculate from wmf.pageview_hourly but thought it was already available
[22:11:34] posted a rough draft of what categories on k8s might look like ( https://drive.google.com/file/d/1FDGCeS9JC7hlwh--WsfiGKhJfhSXapEo/view ) . Feel free to add suggestions on the Google doc if you think there's a better way https://docs.google.com/document/d/1VfGuc8DniKk4xsD6Ki28OYBuWRe4v2bOoFLJ8K5Dk5o/edit
[22:20:33] how odd, google docs says it can't preview the image, but it's a pretty simple png (can still download and view locally)
[22:24:54] * ebernhardson looked, has no useful commentary that wasn't already included :P