[04:24:24] We should add some further retry logic to our rolling operation elasticsearch cookbook. Most common failure scenario is `elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='search.svc.eqiad.wmnet', port=9243): Read timed out. (read timeout=60))` which seems like a pretty easy error to detect and retry a few times on
[09:14:47] dcausse: you were right about the failing rdf-spark-tools tests. After replacing the constructor calls with a builder chain, it compiles again.
[09:41:40] weekly update: https://wikitech.wikimedia.org/wiki/Search_Platform/Weekly_Updates/2024-06-07
[09:43:11] ryankemper: what operation causes timeouts? 60 seconds seems pretty long already
[09:44:26] We should still implement retries, but I'm wondering if we have an underlying issue
[10:13:58] lunch
[12:52:03] hm... was about to re-deploy the cirrus-streaming-updater to staging (for https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1039727) but I realize that that might enable the sanitizer there
[12:52:10] not sure we want it there...
[13:00:59] pfischer: if you're around https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1040151
[13:11:38] o/
[13:52:04] dropping off my son, back in ~20
[14:24:30] back
[14:33:13] \o
[14:33:24] dcausse: I noticed that yesterday, already set up a patch: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1039736
[14:35:20] pfischer: oh thanks!
[14:35:22] o/
[14:35:55] will get that deployed to upgrade staging
[14:36:08] gotta drop off the other kid ;) . Back in ~30
[14:55:04] hmm, in the mediawiki page_change events for the prior state of a page move, do we think it should always include the namespace id or only if it changed?
[14:55:18] Currently only the page title is included there, but we need the old namespace id to know if it moved between indices
[14:56:10] seems like it should simply always be there for consistency, allow consumer to compare .page.namespace_id against .prior_state.page.namespace_id
[14:57:32] yes, makes to me
[14:57:36] *sense
[14:59:19] dcausse, pfischer: last minute, but if you want to chat about Search and languages, feel free to join
[14:59:41] I just sent you the invite
[15:01:10] back
[15:44:47] gehel: I'm sorry, I was AFK
[15:46:09] BTW: looks like rate-limiting via envoy is now ready https://phabricator.wikimedia.org/T362310#9870761 - shall we enable this in general or only for backfill setup?
[15:49:40] heading back home, be there in ~30
[16:09:21] pfischer: no objections to enable it everywhere but is the pipeline ready to slow down on 429 or will we fail more events?
[16:12:29] going offline, have a nice week-end
[16:16:12] back
[16:17:14] hmm, deciding what counts as language support is not easy...in a way glad they asked for binary. I was thinking binary isn't specific enough, but then choosing an appropriate divider is hard
[17:07:45] lunch, back in ~40
[17:21:28] * ebernhardson tries turning off FSLockManager on cindy...seems like many of the tests are failing on upload due to it
[17:56:11] ebernhardson: I just noticed, there are two PRs for sending a distinct user agent with requests originating from the SUP: https://gitlab.wikimedia.org/repos/search-platform/cirrus-streaming-updater/-/merge_requests
[18:00:35] Looks like yours is offering greater flexibility, I'll discard mine.
[18:02:20] pfischer: i just saw that as well, not sure how i missed yours was already in MR
[18:15:29] back, working from my 3rd venue today! It's a new record!
[18:27:34] lol
[18:37:33] too many summer kids' activities ;)
[18:40:20] * ebernhardson realizes while looking at this that event page titles are namespace prefixed, and our redirect update handling isn't stripping them
[18:43:19] ebernhardson: which redirect handling? SUP or cirrus?
[19:21:18] gehel: It seems to be the flushing markers causing the timeout
[19:21:28] https://www.irccloud.com/pastebin/qXmZPRTW/stack_trace.log
[20:15:34] pfischer: in SUP
[20:16:29] pfischer: the redirect_page_link field is a prefixed db key, so namespace and underscores, but we load it into the TargetDocument.pageTitle which is mostly unused, except in the case of add/remove redirects
[20:19:22] for extra fun, `Kill Bill: volume 1` and such things have :, but not to delimit the namespace. But as long as we get the namespace id we can then strip when ns > 0
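
The stripping rule from the last message (and the namespace comparison proposed at 14:56:10) could look roughly like the sketch below. This is only an illustration in Python, not the SUP code: the function names `strip_namespace_prefix` and `namespace_changed` are hypothetical, the event is assumed to be a plain dict, and `prior_state.page.namespace_id` is the field proposed in the discussion, not one confirmed to exist in the schema.

```python
def strip_namespace_prefix(prefixed_db_key: str, namespace_id: int) -> str:
    """Drop the namespace prefix from a prefixed db key (hypothetical helper).

    Follows the rule from the chat: only strip when namespace_id > 0.
    Main-namespace titles such as "Kill_Bill:_Volume_1" contain a colon
    that belongs to the title, so they are returned unchanged.
    """
    if namespace_id > 0:
        # Split on the first colon only; later colons are part of the title.
        _prefix, sep, rest = prefixed_db_key.partition(":")
        if sep and rest:
            return rest
    return prefixed_db_key


def namespace_changed(event: dict) -> bool:
    """Compare .page.namespace_id with .prior_state.page.namespace_id.

    Assumes the prior state carries namespace_id as proposed in the chat;
    if it is absent, the namespace is treated as unchanged.
    """
    current = event["page"]["namespace_id"]
    prior_page = (event.get("prior_state") or {}).get("page") or {}
    return prior_page.get("namespace_id", current) != current
```

With the namespace id in hand, `strip_namespace_prefix("Talk:Kill_Bill:_Volume_1", 1)` keeps the colon that belongs to the title, while a main-namespace key is passed through untouched.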