[07:26:23] o/
[10:25:51] lunch
[13:49:22] \o
[14:20:01] 2nd try on comment label for flink, turns out last patch added it to the pods but not the flinkdeployment, and we need the flinkdeployment for status checks: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1180145
[14:23:51] back
[14:23:55] o/
[14:28:48] water is never fun...good luck!
[14:29:15] yes... thanks!
[16:46:01] hmm, i need a parameter name but not coming up with anything. We usually invoke get_flink_deployment_status with the full labels which includes release and comment, but for checking if the backfill is released i want only the release...
[16:48:27] just trying to explain the difference between the two in comments is awkward :P
[16:52:30] a generic "extra_filter" as a dict so that you let the caller deal with the problem of explaining what comment is? :)
[16:53:37] i added a `BackfillRunner.get_any_flink_deployment_status` to pair with `BackfillRunner.get_flink_deployment_status`, but the naming is totally misleading
[16:54:47] get_any vs get_*my* perhaps? :)
[16:55:31] sounds tedious to name properly indeed :/
[16:56:59] * ebernhardson separately worries i've spent far too much time on this reindex orchestration...but in too deep now :P It's mostly working at least
[17:00:08] well those things are always hard to get to a state that is robust enough so totally understandable, reminds me of cindy and the efforts required to make it stable enough
[17:09:15] at least it seems to be working, it's currently doing a --force-reindex against @closed and it's just working through things
[17:17:07] * ebernhardson is avoiding the temptation to continue re-working the architecture...there is unnecessary redirection but i think will just leave it :P
[17:26:10] sure :)
[18:01:08] \o
[18:19:11] o/
[18:28:42] dinner
[18:38:34] I'm still log-diving for cluster quorum clues. I haven't found any "failed to join" messages like we see locally on the masters, but that's probably because we just have the one master (cirrussearch1074) that has a working logstash
[18:38:59] Eating a quick lunch, but ryankemper if you wanna poke around I've been using https://logstash.wikimedia.org/goto/9da7177c09f67c2029c897f8719db4ed
[19:04:47] ebernhardson: re: reindexing orchestration... as someone who hopes to eventually use your reindexing orchestration when it's done, I will appreciate all the bells, whistles, checks, balances, guardrails, flashing warning lights, etc. you add to it. And if we ever get around to automating wikidata/commons or other reindexing, it'll all be even more valuable.
[19:20:54] back
[19:33:47] I just depooled eqiad and I'm gonna finish the rolling restart in a few min once traffic goes back down
[19:39:30] Vet appt got pushed up, leaving in 30 mins and back 1.5 hrs after that
[19:39:42] inflatador_: i'll be around next half hour if you need extra eyes
[19:43:34] ryankemper cool, everything going smoothly so far. If you wanna look at logstash and see if you can find a pattern, otherwise we can talk about it tomorrow
[19:47:13] rolling restart is done
[19:47:43] just repooled eqiad
[21:20:50] I created `trixie.search.eqiad1.wikimedia.cloud` in WMCS if anyone wants to play around with trixie. I'm just using it to see what packages are available
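Editor's note: the 16:46-16:55 exchange above debates how to expose two kinds of status lookups (full labels with release + comment, versus release-only) without a confusing second method name. A minimal sketch of the "extra_filter" dict suggestion from 16:52:30 follows. The real `BackfillRunner.get_flink_deployment_status` talks to the Kubernetes API; here the deployments are plain dicts invented for illustration, and the function signature and label names are assumptions, not the actual code.

```python
# Sketch of the "extra_filter" idea: one lookup that always filters on the
# release label, with the caller optionally supplying further label filters
# (such as {"comment": ...}) instead of a second, hard-to-name method.
# Deployment objects are stand-in dicts, not real FlinkDeployment resources.

def get_flink_deployment_status(deployments, release, extra_filters=None):
    """Return statuses of deployments whose labels match `release` plus any
    caller-supplied extra label filters, e.g. {"comment": "backfill"}."""
    wanted = {"release": release, **(extra_filters or {})}
    return [
        d["status"]
        for d in deployments
        if all(d["labels"].get(k) == v for k, v in wanted.items())
    ]

deployments = [
    {"labels": {"release": "main", "comment": "backfill"}, "status": "DEPLOYED"},
    {"labels": {"release": "main", "comment": "serve"}, "status": "STABLE"},
    {"labels": {"release": "other"}, "status": "FAILED"},
]

# Full-label lookup: release plus comment.
print(get_flink_deployment_status(deployments, "main", {"comment": "backfill"}))
# → ['DEPLOYED']
# Release-only lookup, as wanted for the "is the backfill released?" check.
print(get_flink_deployment_status(deployments, "main"))
# → ['DEPLOYED', 'STABLE']
```

This keeps one well-named method and pushes the "what does comment mean here" question to the call site, which is where the chat suggests it belongs.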