[11:26:01] FYI https://addshore.com/2023/08/wikidata-query-service-blazegraph-jnl-file-on-cloudflare-r2-and-internet-archive/ and https://phabricator.wikimedia.org/T344905 :)
[13:06:50] addshore: thanks! If you could add a bit more on T344905 about why that's useful, it would be great!
[13:06:50] T344905: Publish WDQS JNL files to dumps.wikimedia.org - https://phabricator.wikimedia.org/T344905
[13:19:42] gehel: forgot to ask yesterday, but where did you get that gerrit plugin for IntelliJ?
[13:21:13] inflatador: directly from settings->plugins
[13:21:23] inflatador: https://plugins.jetbrains.com/plugin/7272-gerrit
[14:07:44] ebernhardson: As it turns out, our primary stream of rev-based update events is keyed by wiki_id and page_id, and therefore those kafka messages should be distributed amongst the 5 partitions based on hashing that key. Now I wonder about the other primary (pre-aggregator) sources. Do you know where article-topic-stream, draft-topic-stream, and recommendation-create-stream are produced?
[14:08:46] https://wikimedia.slack.com/archives/C02291Z9YQY/p1692881354121089
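(A minimal sketch of the key-based partitioning described above, assuming kafka-python. The broker address, topic name, and key format are placeholders for illustration, not the actual rev-based update stream configuration.)

```python
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",             # placeholder broker
    key_serializer=lambda k: k.encode("utf-8"),
    value_serializer=lambda v: v.encode("utf-8"),
)

def send_rev_update(topic: str, wiki_id: str, page_id: int, payload: str) -> None:
    # The default partitioner hashes the serialized key, so every message with the
    # same (wiki_id, page_id) key lands on the same partition and stays ordered there.
    key = f"{wiki_id}:{page_id}"                    # hypothetical key format
    producer.send(topic, key=key, value=payload)

send_rev_update("rev-based-updates", "enwiki", 12345, '{"rev_id": 987654}')
producer.flush()
```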
[14:18:58] off, back in 40'
[14:57:29] Hey, just wondering if folks have seen https://phabricator.wikimedia.org/T344882
[14:57:33] \o
[14:58:02] WCQS has been broken for about a day, at least, so I was hoping to get some eyes on it.
[14:58:32] DominicBM: hadn't seen that, we can take a look. There are some reconciliation jobs that started failing recently that could be related, unsure
[14:59:48] Thanks so much!
[15:00:51] DominicBM not sure what happened there, but I'm going to start a data transfer from wcqs1001 to wcqs1002, should take about 2 h
[15:01:28] can copy the data to get them all in sync, but need to understand how they get out of sync :S
[15:02:23] for sure
[15:02:54] I can depool and leave wcqs1003 broken if that helps the troubleshooting process
[15:03:13] * ebernhardson wants fancy toys like Merkle Trees to monitor consistency, but i can't figure out how :P
[15:03:43] most likely explanation is I have forgotten to run the data transfer after the reimages a few weeks back, but will need to verify
[15:09:54] I keep moving whitespace around to trigger a results refresh when WCQS is giving me 0. Usually takes a few tries. '=D
[15:12:33] DominicBM I'm depooling wcqs in eqiad for the moment
[15:45:09] i'm wondering while looking over the rdf-streaming-updater ... where exactly does it throw out events that are not about entities?
[15:45:55] the bit to be more strict about what constitutes an entity id is ready, but if we ship that without fixing the underlying issue it only moves the failure earlier without fixing anything :)
[15:57:24] Ohhh...what happened is we limit by namespace id. In wikidata Q items are in the main namespace, which also holds Main_Page. A revision of Main_Page was undeleted and went into the pipeline
[15:57:41] so the problem is the entity namespace has non-entity pages
[16:11:45] ebernhardson is that a problem limited to WCQS? sounds like it could happen with WDQS too?
[16:17:43] inflatador: it's actually limited to wdqs, probably because wcqs has a dedicated entity namespace instead of using namespace 0 like wikidata
[16:18:29] ebernhardson ah OK, I thought this was in reference to the WCQS triples issue
[16:22:39] inflatador: i thought they could be related, but on looking closer this part is only wdqs
[16:28:59] eating lunch, back in ~1h
[17:18:47] back
[17:28:30] ebernhardson how's the reindex going? I tried to peep on your mwmaint user but I'm afraid to break your tmux windows
[18:03:39] :) i barely know how this works
[18:04:10] inflatador: you can get a rough idea from the ~ebernhardson/cirrus_log dir. I suppose i don't always clear it out, but it should only have logs for the current run
[18:05:05] it's still going slow :( We are up to elwiki/enwiki/enwikisource on eqiad/codfw/cloudelastic
[18:05:35] i suppose `ps auxf` also gives that info
[18:07:08] ebernhardson Got it, thanks for the hint
[18:07:46] on the one hand i feel like i should understand what happened to make this so much slower. On the other hand, it sounds like a deep hole to dig into :P
[18:08:35] Could the new mapping things be more expensive perhaps? Once it's done might have to look at how the latency of shipping updates changes
[18:58:05] hmm, lexemes dump never showed up in the dumps, failed the ttl import into hdfs, and failed the downstream tasks...going to copy the previous week's import to this week in the hive table so everything can continue on against an older dump
[19:20:45] damn! If you wanna file an SRE ticket for that, I'm happy to run it down. I need to learn more about that process anyway
[19:29:33] Ben has been learning more about the dumps stuff for the ceph deploy, so it would be a good excuse to get into that as well
[19:30:20] i was going to ignore the question of why that dump didn't show up :)
[19:31:44] As a SWE, absolutely! But that does sound like SRE/sysad work ;)
[19:32:59] anyway, only if you feel like it's worth the effort. No offense either way
[19:36:04] Is it normal for the rdf repo jenkins run to take ~15m? https://gerrit.wikimedia.org/r/c/wikidata/query/rdf/+/949102
[19:37:15] inflatador: if you open up https://integration.wikimedia.org/zuul/ and search for `rdf` it will show the in-progress monitor. It says `ETA: 6min`, which means based on history it takes around that much longer
[19:39:52] It's also in the "checks" table in gerrit. There's also a new "findings" tab which is apparently for bot comments
[19:41:47] maybe we'd use that if we weren't getting away from gerrit
[19:57:02] perhaps randomly interesting, the set of topics returned to the chatgpt plugin over the last week, ordered by the # of unique searches performed: https://phabricator.wikimedia.org/P51413
[19:59:40] (someone was pulling more specific versions of that data and asked about it, couldn't help but write up a query :P)
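(A hedged sketch of what a per-topic unique-search count like the one mentioned above could look like in PySpark. The table name, column names, and time window are placeholders, not the actual event schema or the query behind P51413.)

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("plugin-topic-counts").getOrCreate()

topic_counts = (
    spark.read.table("event.plugin_search_requests")                   # placeholder table
    .where(F.to_date("dt") >= F.date_sub(F.current_date(), 7))         # roughly the last week
    .select("search_id", F.explode("returned_topics").alias("topic"))  # one row per returned topic
    .groupBy("topic")
    .agg(F.countDistinct("search_id").alias("unique_searches"))
    .orderBy(F.desc("unique_searches"))
)
topic_counts.show(50, truncate=False)
```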