[04:05:51] FWIW whatever caused the search slowdown in GrowthExperiments disappeared after a week: https://grafana.wikimedia.org/d/vGq7hbnMz/special-homepage-and-suggested-edits?orgId=1&from=1677974400000&to=1679270399000&viewPanel=56
[10:44:37] lunch
[10:55:18] lunch 2
[12:48:54] o/
[13:00:28] seeing some wdqs alerts...hopefully it's not time for another outage
[13:22:53] restarted blazegraph on wdqs1005/1006 and the alerts cleared...will keep one eye on the dashboard though
[13:24:50] o/
[13:25:46] dcausse apparently I forgot to add today as a pairing day...can meet in ~30m though if you like
[13:26:14] inflatador: sure when you want :)
[13:30:54] wdqs1005/1006 have ~12h of lag, hopefully that will drop after the restart
[13:32:36] o/ gehel: Looking at https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/292#note_21766 - Who or what determines where we release a (maven) artefact (archiva vs. central)? Is there a policy?
[13:35:30] inflatador: we should keep servers that are lagged depooled
[13:37:27] impact is that bot edits are going to drop otherwise (https://grafana-rw.wikimedia.org/d/000000170/wikidata-edits?orgId=1&refresh=1m&from=now-3h&to=now)
[13:38:53] dcausse ACK, will depool shortly
[13:39:06] thanks!
[13:40:21] OK, they are depooled
[14:04:25] dcausse up at https://meet.google.com/oxm-jadj-jtp if you wanna join
[14:04:32] sure
[15:02:16] \o
[15:23:15] fixed up the ores dags on an-airflow1001, my attempts to get it going on 1005 (now waiting on https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/304) left a file in place that blocked it from running when i turned it back on in the old instance
[15:44:23] o/
[16:01:06] SRE meeting confl; will be at SRE
[16:05:16] network issues over here
[16:19:29] dcausse: i dunno how helpful it would be, but last time around for full-cluster reindex i wrote mwmaint1002:~ebernhardson/reindex-20230209/reindex-fn.sh which provides a bash function that spins up a tmux session with 9 reindexes running in parallel (commons, wikidata, everything else * eqiad,codfw,cloudelastic)
[16:19:43] might have to run inside an existing tmux session, i suppose i didn't test it from a plain start
[16:20:19] ebernhardson: thanks! will probably use this
[17:33:35] lunch/errands, back in ~1h
[17:39:13] inflatador: I think I found where this watchNamespace var is being populated from: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/901253/
[17:39:54] aww, mjolnir almost worked in airflow 2, just the final upload step failed. Same problem (and same fix) that the ores one ran into, fixing templating in SimpleSkeinOperator
[17:43:19] ebernhardson: if this requires https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/304 Andrew is out this week so perhaps we should ping someone else?
[17:44:24] dcausse: oh! i didn't realize andrew was out. Hmm i'll have to think of who else to ping, Andrew wrote the bit that had the operators extend from PythonOperator instead of BaseOperator, and tbh i don't understand why that would be done but there might be a reason. I suppose i'll ping millimetric
[18:08:01] dinner
[18:44:28] back