[06:46:42] oh right mediasearch encodes its ui filters in the query, perhaps time to add a new param to the search APIs for context filtering to avoid mixing those with actual user input
[09:19:58] off to my daughter's end of school year ceremony
[09:20:43] zpapierski: have fun!
[09:52:34] Lunch
[10:30:29] lunch
[13:44:45] dcausse, zpapierski: I could use some help to organize phab tasks around click model. If you have a moment to jump in meet.google.com/uxn-rxqq-ivg
[14:38:34] Break before unmeeting
[15:26:57] \o
[15:27:11] * ebernhardson is surprised to find 90G's of thumbnails in hdfs instead of a failure message this morning
[16:09:16] going offline
[19:10:50] mildly annoying, most things accept hdfs:///path/to/thing as the default cluster. But pyarrow's hdfs connector says 'Expected authority at index ....' :(
[19:22:52] on the other hand surprisingly easy now to talk to hdfs/parquet in plain python. kinda nice :)
[19:58:14] hi! we in Growth are seeing that topic search seems to have stopped working on some wikis (e.g. rowiki), and we're not sure why
[19:58:27] T285577 is the relevant task
[19:58:28] T285577: Several wikis have 0 articles for all ORES topics - https://phabricator.wikimedia.org/T285577
[20:02:32] cc ebernhardson, if you're around
[20:04:55] kostajh: hmm, ok can check
[20:05:09] * kostajh re-reads https://wikitech.wikimedia.org/wiki/Search/articletopic
[20:08:43] kostajh: since this came up after train deploy, it's gotta be our removal of BC handling. We can unrevert that
[20:08:51] err, i mean we can revert and undeploy
[20:09:37] ebernhardson: ok. this one? https://gerrit.wikimedia.org/r/c/operations/puppet/+/697836/
[20:10:27] kostajh: i'm thinking https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CirrusSearch/+/661397
[20:11:22] ah
[20:11:30] we indexed all the data into the new fields, or so we thought, but apparently not based on what we are seeing on rowiki. Figuring out why exactly it didn't work will take some time, but i think reverting that patch will keep querying the old field until we figure it out
[20:11:55] ebernhardson: ok. Marshall made a list of which ones are broken and which aren't, in case that's helpful https://phabricator.wikimedia.org/T285577#7178267
[20:12:30] it will also mean undeploying the drafttopic: keyword for a bit, but probably fine
[20:12:38] kostajh: do you want to revert now, or monday?
[20:12:40] ebernhardson: if you think reverting will fix it & are able to do that today, that would be appreciated by us as we lean on this feature pretty heavily
[20:12:48] ok
[20:13:14] fyi i'm currently deploying a fix for a related issue (basically the Growth being more broken than necessary)
[20:18:16] hmm, no i don't think that revert is going to help. Reviewing an expected biology page (https://ro.wikipedia.org/wiki/P%C4%83ianjen?action=cirrusdump) has classification.ores.articletopic/STEM.Biology|999 in the appropriate field...hmm
[20:20:04] ok, actually it will help. rowiki (and others) must have failed reindexing, as they don't have a weighted_tags field defined in their mappings
[20:20:18] so the data is shipping, but the index configuration wasn't updated
[20:23:26] ebernhardson: I don't know if this is somehow related, but linking to T285538 in case that has some bearing on this
[20:23:26] T285538: All cronjobs using foreachwikiindblist broken in production: Fatal error: Uncaught Exception: MWWikiversions::readDbListFile: unable to read . in /srv/mediawiki/multiversion/MWWikiversions.php:94 - https://phabricator.wikimedia.org/T285538
[20:23:52] kostajh: it's related, in that foreachwiki* is about the worst possible way i can think of to reliably do something on 900 sites. And we use it too :)
[20:24:30] the script was only broken when used via `mw-cli-wrapper` :)
[20:24:54] (which is used by systemd timers to ensure they run in the active DC only)
[20:26:02] ebernhardson: i'm not sure if you intend to pursue deployment of the revert today, but if you do, I suggest asking in -operations (per https://wikitech.wikimedia.org/wiki/Deployments/Emergencies, SRE and releng approval needed)
[20:26:11] (rzl and dduval signed off deployment for me)
[20:30:58] * ebernhardson misses the days when ori just deployed on saturday because he had an itch to scratch :P
[20:32:58] hehe
[20:44:33] ebernhardson: thanks for the revert and deploy, much appreciated!
[20:45:19] kostajh: no worries
[20:45:53] it's kinda amusing how prescient it was though...two days ago we spent an hour or so talking about how to make this reindexing process more reliable, observable, etc. And now more proof the current process doesn't work :)
[21:04:19] ha
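
Editor's note on the 20:20 diagnosis (data shipping, but no weighted_tags field in the mappings of some wikis): a minimal sketch of how that condition could be checked across many indices at once. It assumes mapping responses shaped like the output of the Elasticsearch get-mapping API, with or without a type level. The function names, the sample index names, and the field types in the sample are hypothetical, for illustration only; this is not the tooling used in the log.

```python
from typing import Any, Dict, List


def index_properties(mapping_response: Dict[str, Any], index_name: str) -> Dict[str, Any]:
    """Return the top-level field definitions for one index.

    Handles both typeless mappings ({"mappings": {"properties": ...}})
    and typed mappings ({"mappings": {"<type>": {"properties": ...}}}).
    """
    mappings = mapping_response.get(index_name, {}).get("mappings", {})
    if "properties" in mappings:
        return mappings["properties"]
    merged: Dict[str, Any] = {}
    for type_mapping in mappings.values():
        if isinstance(type_mapping, dict):
            merged.update(type_mapping.get("properties", {}))
    return merged


def indices_missing_field(mapping_response: Dict[str, Any],
                          field: str = "weighted_tags") -> List[str]:
    """List indices whose mapping does not define `field`, sorted by name."""
    return sorted(
        name for name in mapping_response
        if field not in index_properties(mapping_response, name)
    )


# Hypothetical sample: one index was reindexed with the new field, one was not.
sample = {
    "enwiki_content": {"mappings": {"page": {"properties": {
        "title": {"type": "text"},
        "weighted_tags": {"type": "text"},  # field type is illustrative
    }}}},
    "rowiki_content": {"mappings": {"page": {"properties": {
        "title": {"type": "text"},
    }}}},
}

print(indices_missing_field(sample))
```

A check like this only inspects index configuration, which matches the distinction drawn in the log: documents can carry the new data while the mapping that makes it searchable was never updated.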