[00:08:12] Data-Engineering, Data-Platform-SRE: Write a design document relating to superset on dse-k8s - https://phabricator.wikimedia.org/T349396 (BTullis) I feel that I have now finished [[https://docs.google.com/document/d/1PT9cRVFtN23GlWfYo-_bTUzVcK12-dSSJcX-SV4rtqs/edit|this Superset on Kubernetes design docu...
[00:23:25] (DiskSpace) firing: Disk space an-web1001:9100:/srv 5.37% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-web1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[01:32:28] Data-Platform-SRE, DC-Ops, SRE, ops-codfw: Q2:rack/setup/install elastic2087-2091 - https://phabricator.wikimedia.org/T349778 (Jhancock.wm)
[01:50:57] Data-Platform-SRE, DC-Ops, SRE, ops-codfw: Q2:rack/setup/install elastic2092-2109 - https://phabricator.wikimedia.org/T349780 (Jhancock.wm)
[01:53:14] Data-Platform-SRE, DC-Ops, SRE, ops-codfw: Q2:rack/setup/install elastic2092-2109 - https://phabricator.wikimedia.org/T349780 (Jhancock.wm) server elastic2096 is having an issue with the provisioning script. I did check the cable and tried redoing the netbox script. mgmt ip is still unpingable. g...
[02:59:22] Data-Platform-SRE, DC-Ops, SRE, ops-eqiad: Q2:rack/setup/install elastic110[3-7] - https://phabricator.wikimedia.org/T349777 (VRiley-WMF) elastic1103 Rack: D 4 Position: U 17 CableID: 230304500226 Port 46 elastic1104 Rack: E 1 Position: U 14 CableID: 20220222 Port: 2 elastic1105 Rack: E 5 Posit...
[03:00:22] Data-Platform-SRE, DC-Ops, SRE, ops-eqiad: Q2:rack/setup/install elastic110[3-7] - https://phabricator.wikimedia.org/T349777 (VRiley-WMF)
[04:23:26] (DiskSpace) firing: Disk space an-web1001:9100:/srv 5.371% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-web1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[08:23:26] (DiskSpace) firing: Disk space an-web1001:9100:/srv 5.371% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-web1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[09:38:01] Data-Engineering (Sprint 5), Data-Platform-SRE: Create a keytab for each spark-history-server and add it to the puppet secret hieradata - https://phabricator.wikimedia.org/T351816 (elukey) >>! In T351816#9359401, @brouberol wrote: > We will still test creating a keytab with an fqdn like `..<...
[09:49:01] Data-Engineering, CX-cxserver, Citoid, Content-Transform-Team-WIP, and 10 others: Migrate node-based services in production to node18 - https://phabricator.wikimedia.org/T349118 (elukey) I filed https://gerrit.wikimedia.org/r/c/mediawiki/services/recommendation-api/+/977751 for the recommendation...
[09:57:49] Data-Platform-SRE, serviceops, Discovery-Search (Current work): Enable mediawiki.cirrussearch.page_rerender.v1 on all public wikis - https://phabricator.wikimedia.org/T351503 (pfischer) a:pfischer
[10:03:01] (CR) Joal: [C: +1] "Thanks a lot Marcel :)" [analytics/refinery] - https://gerrit.wikimedia.org/r/977774 (https://phabricator.wikimedia.org/T351909) (owner: Mforns)
[10:09:23] (CR) Sg912: [C: +1] Quick fix to refine_webrequest_hourly for exclude_row_ids [analytics/refinery] - https://gerrit.wikimedia.org/r/975418 (owner: Mforns)
[10:23:05] Data-Platform-SRE: Reduce impact of Elastic snapshots - https://phabricator.wikimedia.org/T351475 (MatthewVernon) I don't expect the change to make difference to how anyone is using swift - moving from nginx to envoy for TLS termination was more about bringing swift more up-to-date in terms of TLS terminatio...
[10:37:42] Data-Engineering, Trust and Safety Product Team: Mitigate unwanted side effects to anti abuse work from Google Chrome's IP Protection rollout - https://phabricator.wikimedia.org/T350428 (kostajh) See also [this comment](https://github.com/GoogleChrome/ip-protection/issues/31#issuecomment-1819513175) from...
[10:49:21] Data-Engineering, Data-Platform-SRE, Epic: Migrate the Analytics Superset instances to our DSE Kubernetes cluster - https://phabricator.wikimedia.org/T347710 (BTullis)
[10:49:36] Data-Engineering, Data-Platform-SRE: Write a design document relating to superset on dse-k8s - https://phabricator.wikimedia.org/T349396 (BTullis) Open→Resolved
[10:54:53] Data-Platform-SRE, serviceops, Discovery-Search (Current work): Enable mediawiki.cirrussearch.page_rerender.v1 on all public wikis - https://phabricator.wikimedia.org/T351503 (pfischer) @elukey, we would like to start populating this kafka topic on kafka-main. Enabling `page_rerender` is the last mis...
[11:43:05] Data-Platform-SRE, Patch-For-Review: Deploy additional yarn shuffler services to support several versions of spark in parallel - https://phabricator.wikimedia.org/T344910 (CodeReviewBot) btullis merged https://gitlab.wikimedia.org/repos/data-engineering/conda-analytics/-/merge_requests/36 Remove the shu...
[12:04:03] Data-Platform-SRE: Upgrade Spark to a version with long term Iceberg support, and with fixes to support Dumps 2.0 - https://phabricator.wikimedia.org/T338057 (BTullis) Update: We have now got three versions of the spark shuffler running in production: * 3.1.2 * 3.3.2 * 3.4.1 Our production pipelines are al...
[12:13:56] Data-Engineering, Data-Platform-SRE, Epic: Migrate the Analytics Superset instances to our DSE Kubernetes cluster - https://phabricator.wikimedia.org/T347710 (BTullis)
[12:15:48] Data-Platform-SRE: Create a superset container image using the PipelineLib framework - https://phabricator.wikimedia.org/T352165 (BTullis)
[12:17:46] Data-Platform-SRE: Create a helm chart for Superset - https://phabricator.wikimedia.org/T352166 (BTullis)
[12:19:20] Data-Platform-SRE: Create a superset container image using the PipelineLib framework - https://phabricator.wikimedia.org/T352165 (BTullis) p:Triage→High
[12:19:31] Data-Platform-SRE: Create a helm chart for Superset - https://phabricator.wikimedia.org/T352166 (BTullis) p:Triage→High
[12:20:16] Data-Platform-SRE: Create a superset container image using the PipelineLib framework - https://phabricator.wikimedia.org/T352165 (BTullis) a:BTullis
[12:22:20] Data-Platform-SRE: Bring an-coord100[3-4] into service - https://phabricator.wikimedia.org/T336045 (BTullis) a:BTullis
[12:23:59] (DiskSpace) firing: Disk space an-web1001:9100:/srv 5.371% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-web1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[12:31:16] Data-Platform-SRE, sre-alert-triage: Alert in need of triage: SmartNotHealthy (instance an-worker1086:9100) - https://phabricator.wikimedia.org/T352168 (LSobanski)
[13:37:05] Data-Platform-SRE: ProbeDown - https://phabricator.wikimedia.org/T352083 (Gehel)
[13:39:06] Data-Platform-SRE, Wikidata, Wikidata-Query-Service: Create alerts for https://query.wikidata.org/bigdata/ldf - https://phabricator.wikimedia.org/T347355 (Gehel) Is this related? T352083
[13:49:33] Data-Platform-SRE: Test hardware-based performance optimizations for WDQS import - https://phabricator.wikimedia.org/T351662 (Gehel) p:Triage→Low
[14:17:42] (SystemdUnitFailed) firing: produce_canary_events.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:19:29] PROBLEM - Check systemd state on an-launcher1002 is CRITICAL: CRITICAL - degraded: The following units failed: produce_canary_events.service https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[14:30:25] RECOVERY - Check systemd state on an-launcher1002 is OK: OK - running: The system is fully operational https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
[14:32:42] (SystemdUnitFailed) resolved: produce_canary_events.service Failed on an-launcher1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:48:08] Data-Platform-SRE: Upgrade Spark to a version with long term Iceberg support, and with fixes to support Dumps 2.0 - https://phabricator.wikimedia.org/T338057 (xcollazo) > Let me know what you think about when we should plan to implement the changes. Both Dumps 2.0 work and Iceberg migrations will benefit fr...
[14:51:02] Data-Engineering, Data-Platform-SRE, Product-Analytics, Wmfdata-Python, Patch-For-Review: Wmfdata should connect to Presto using the analytics-presto CNAME - https://phabricator.wikimedia.org/T345482 (BTullis) @nshahquinn-wmf - I have made a patch to wmfdata-python here: https://github.com/wi...
[14:57:51] Data-Engineering (Sprint 5), Data-Platform-SRE: [Data Platform] Deploy Spark History Service - https://phabricator.wikimedia.org/T330176 (BTullis) a:BTullis→brouberol
[14:59:50] Data-Engineering, Data-Platform-SRE, Event-Platform: Upgrade schema hosts to bullseye - https://phabricator.wikimedia.org/T349286 (BTullis) a:BTullis
[15:02:14] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host aqs2011.codfw.wmnet with OS bullseye
[15:15:15] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host aqs2011.codfw.wmnet with OS bullseye executed with errors: - aqs2011 (**FAIL**) - Downtimed on Icinga/...
[15:15:31] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host aqs2011.codfw.wmnet with OS bullseye
[15:32:20] Data-Platform-SRE, DC-Ops, SRE, ops-codfw: Q2:rack/setup/install elastic2092-2109 - https://phabricator.wikimedia.org/T349780 (Jhancock.wm) elastic2096 got a space in the serial number somehow. it has been fixed and the provisioning script took. upgrading firmware.
[15:32:37] Data-Platform-SRE, DC-Ops, SRE, ops-codfw: Q2:rack/setup/install elastic2092-2109 - https://phabricator.wikimedia.org/T349780 (Jhancock.wm)
[15:41:22] Data-Platform-SRE, SRE: Harden the netboot configuration against typos - https://phabricator.wikimedia.org/T351059 (brouberol) Open→Resolved
[15:42:15] Data-Platform-SRE, SRE: Harden the netboot configuration against typos - https://phabricator.wikimedia.org/T351059 (MoritzMuehlenhoff) Great work, really useful and well done!
[15:43:57] (PS9) Bearloga: Create analytics/external/wiki_highlights_experiment [schemas/event/secondary] - https://gerrit.wikimedia.org/r/975079 (https://phabricator.wikimedia.org/T348613) (owner: Conniecc1)
[15:50:48] Data-Engineering, MediaWiki-Vendor, PHP 8.2 support, Upstream: Use of "self" in callables is deprecated in php8.2 from liuggio/statsd-php-client package - https://phabricator.wikimedia.org/T326386 (lbowmaker) @Ahoelzl @JAllemandou - please see comment above.
[15:53:37] (CR) Bearloga: "@Sbisson @Conniecc1: I made some changes:" [schemas/event/secondary] - https://gerrit.wikimedia.org/r/975079 (https://phabricator.wikimedia.org/T348613) (owner: Conniecc1)
[15:54:52] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host aqs2011.codfw.wmnet with OS bullseye completed: - aqs2011 (**WARN**) - Removed from Puppet and PuppetD...
[15:55:13] (CR) Sbisson: [C: +1] Create analytics/external/wiki_highlights_experiment [schemas/event/secondary] - https://gerrit.wikimedia.org/r/975079 (https://phabricator.wikimedia.org/T348613) (owner: Conniecc1)
[15:55:59] Data-Platform-SRE, Wikidata, Wikidata-Query-Service: Create alerts for https://query.wikidata.org/bigdata/ldf - https://phabricator.wikimedia.org/T347355 (Dzahn) >>! In T347355#9363391, @Gehel wrote: > Is this related? T352083 Yes. I merged them all into T352084.
[15:56:17] Data-Platform-SRE: ProbeDown - https://phabricator.wikimedia.org/T352083 (Dzahn)
[15:59:26] Data-Platform-SRE, Discovery-Search (Current work): Investigate performance differences between wdqs2022 and older hosts - https://phabricator.wikimedia.org/T336443 (bking)
[16:04:36] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host aqs2012.codfw.wmnet with OS bullseye
[16:12:46] (CR) Conniecc1: [C: +2] Create analytics/external/wiki_highlights_experiment [schemas/event/secondary] - https://gerrit.wikimedia.org/r/975079 (https://phabricator.wikimedia.org/T348613) (owner: Conniecc1)
[16:13:24] (Merged) jenkins-bot: Create analytics/external/wiki_highlights_experiment [schemas/event/secondary] - https://gerrit.wikimedia.org/r/975079 (https://phabricator.wikimedia.org/T348613) (owner: Conniecc1)
[16:13:54] Data-Engineering (Sprint 5), Data-Platform-SRE: Create a keytab for each spark-history-server and add it to the puppet secret hieradata - https://phabricator.wikimedia.org/T351816 (brouberol) These are the files I'm about to create principals and keytabs with, via `generate_keytabs.py`: ` brouberol@krb1...
[16:15:55] Data-Engineering (Sprint 5), Data-Platform-SRE: Create a keytab for each spark-history-server and add it to the puppet secret hieradata - https://phabricator.wikimedia.org/T351816 (brouberol) ` brouberol@krb1001:~$ sudo generate_keytabs.py --realm WIKIMEDIA spark-history-test.txt spark-history-test/spark...
[16:17:43] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host aqs2012.codfw.wmnet with OS bullseye executed with errors: - aqs2012 (**FAIL**) - Downtimed on Icinga/...
[16:17:54] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host aqs2012.codfw.wmnet with OS bullseye
[16:22:56] Data-Platform-SRE, DC-Ops, SRE, ops-eqiad: Q2:rack/setup/install elastic110[3-7] - https://phabricator.wikimedia.org/T349777 (VRiley-WMF)
[16:24:57] (DiskSpace) firing: Disk space an-web1001:9100:/srv 5.371% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-web1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[16:33:56] Data-Engineering (Sprint 5), Data-Platform-SRE: [Data Platform] Deploy Spark History Service - https://phabricator.wikimedia.org/T330176 (brouberol)
[16:34:05] Data-Engineering (Sprint 5), Data-Platform-SRE: Create a keytab for each spark-history-server and add it to the puppet secret hieradata - https://phabricator.wikimedia.org/T351816 (brouberol) Open→Resolved The keytabs were rsynced to puppetmaster1001 and committed to `/srv/private/modules/secret/...
[16:34:59] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host aqs2012.codfw.wmnet with OS bullseye executed with errors: - aqs2012 (**FAIL**) - Removed from Puppet...
[16:35:41] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host aqs2012.codfw.wmnet with OS bullseye
[16:44:11] Data-Engineering (Sprint 5), Data-Platform-SRE: [Data Platform] Deploy Spark History Service - https://phabricator.wikimedia.org/T330176 (brouberol)
[16:44:17] Data-Engineering (Sprint 5), Data-Platform-SRE: Create a keytab for each spark-history-server and add it to the puppet secret hieradata - https://phabricator.wikimedia.org/T351816 (brouberol) Resolved→Open Actually, I need to re-open, as I still need to add the keytabs to the secrets hieradata
[17:14:40] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host aqs2012.codfw.wmnet with OS bullseye completed: - aqs2012 (**PASS**) - Removed from Puppet and PuppetD...
[17:24:58] Hi milimetric - would you have any info on the webrequest error we had yesterday night?
[17:25:12] bah, no, didn't see
[17:25:19] mwarf
[17:25:47] many SLAs, and a webrequest error - I'm pinging you cause you're the backup for Will, and IIRC he's sick, no?
[17:27:09] he's in today, but has been in meetings
[17:27:31] I think I should just consider myself primary this week, my bad, switching mode
[17:27:31] milimetric: may I let you handle this with him?
[17:27:34] ofc
[17:27:39] thanks for the poke
[17:27:42] <3
[17:36:39] joal: did you guys decide to set spark.sql.mapKeyDedupPolicy to LAST_WIN? Or have some other idea?
[18:02:53] ah, yes: https://gerrit.wikimedia.org/r/c/analytics/refinery/+/977774/2/hql/webrequest/refine_webrequest_hourly.hql
[18:03:14] I guess I'll just fast-track and deploy that. If anyone finds anything wrong with that, let me know
[18:03:40] (CR) Milimetric: [V: +2 C: +2] Set spark.sql.mapKeyDedupPolicy to LAST_WIN in refine_webrequest [analytics/refinery] - https://gerrit.wikimedia.org/r/977774 (https://phabricator.wikimedia.org/T351909) (owner: Mforns)
[18:04:31] (CR) Milimetric: [V: +2 C: +2] Quick fix to refine_webrequest_hourly for exclude_row_ids [analytics/refinery] - https://gerrit.wikimedia.org/r/975418 (owner: Mforns)
[18:23:41] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (Eevans)
[18:24:37] Data-Platform-SRE, Epic: Upgrade the Data Engineering infrastructure to Debian Bullseye - https://phabricator.wikimedia.org/T288804 (Eevans)
[18:25:50] Data-Platform-SRE, Epic: Upgrade the Data Engineering infrastructure to Debian Bullseye - https://phabricator.wikimedia.org/T288804 (Eevans)
[18:26:04] Data-Platform-SRE, Cassandra: Upgrade AQS cluster to Bullseye - https://phabricator.wikimedia.org/T347738 (Eevans) Open→Resolved macro-deployed
[18:30:28] !log deployed refinery to hdfs
[18:30:29] Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[18:30:29] Data-Engineering, Data Products (Data Product Sprint 04): Duplicate keys in x_analytics header corrupt some wmf_raw.webrequest rows and break refinement of wmf.webrequest - https://phabricator.wikimedia.org/T351909 (Milimetric) merged and deployed right now, used to fix another instance of the webrequest...
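Note on the `spark.sql.mapKeyDedupPolicy` change discussed above: Spark documents two values for this setting, `EXCEPTION` (the default, where a map-building function fails on a duplicate key) and `LAST_WIN` (the last value inserted for a key is kept). The snippet below is only a plain-Python sketch of that difference, not Spark and not the refinery HQL. The function name `parse_x_analytics`, the `key=value;key=value` header layout, and the sample header are assumptions made for illustration.

```python
from typing import Dict


def parse_x_analytics(header: str, policy: str = "EXCEPTION") -> Dict[str, str]:
    """Parse a 'k1=v1;k2=v2' string into a dict (hypothetical helper).

    policy mimics the two documented spark.sql.mapKeyDedupPolicy values:
    EXCEPTION raises on a duplicate key, LAST_WIN keeps the last value seen.
    """
    if policy not in ("EXCEPTION", "LAST_WIN"):
        raise ValueError(f"unknown policy: {policy!r}")

    result: Dict[str, str] = {}
    for pair in header.split(";"):
        if not pair:
            continue  # tolerate empty segments such as a trailing ';'
        key, _, value = pair.partition("=")
        if key in result and policy == "EXCEPTION":
            raise ValueError(f"duplicate map key: {key!r}")
        result[key] = value  # under LAST_WIN a later value overwrites an earlier one
    return result


# Made-up header with the key 'ns' repeated.
sample = "ns=0;https=1;ns=14"
print(parse_x_analytics(sample, policy="LAST_WIN"))
```

With the default policy the same sample raises instead of returning a map, which is consistent with a job failing on rows that carry duplicate keys, as the task title describes.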
[18:51:43] Quarry: [bug] query/77794: "This query was stopped" - https://phabricator.wikimedia.org/T352211 (Boshomi_Phabricator)
[19:12:47] Quarry: [feedback] Allow search within SQL - https://phabricator.wikimedia.org/T352212 (Boshomi_Phabricator)
[20:07:35] Data-Engineering, DC-Ops, SRE, ops-eqiad: Q2:rack/setup/install an-worker11[57-75] - https://phabricator.wikimedia.org/T349936 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host an-worker1158.eqiad.wmnet with OS bullseye
[20:21:58] Data-Engineering, DC-Ops, SRE, ops-eqiad: Q2:rack/setup/install an-worker11[57-75] - https://phabricator.wikimedia.org/T349936 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host an-worker1161.eqiad.wmnet with OS bullseye
[20:24:57] (DiskSpace) firing: Disk space an-web1001:9100:/srv 5.371% free - https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&viewPanel=12&var-server=an-web1001 - https://alerts.wikimedia.org/?q=alertname%3DDiskSpace
[20:46:54] Data-Engineering, DC-Ops, SRE, ops-eqiad: Q2:rack/setup/install an-worker11[57-75] - https://phabricator.wikimedia.org/T349936 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host an-worker1158.eqiad.wmnet with OS bullseye
[21:16:40] Data-Engineering, Release-Engineering-Team, GitLab (CI & Job Runners): Unblock Dockerfile syntax to build images with Gitlab trusted runner - https://phabricator.wikimedia.org/T351792 (thcipriani) > Looks like the root problem is this configuration: > https://gerrit.wikimedia.org/r/plugins/gitiles/op...
[21:42:12] Data-Engineering, DC-Ops, SRE, ops-eqiad: Q2:rack/setup/install an-worker11[57-75] - https://phabricator.wikimedia.org/T349936 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host an-worker1161.eqiad.wmnet with OS bullseye executed with errors: - an-worke...
[22:34:31] Data-Platform-SRE, Patch-For-Review: Simplify query.wikidata.org LDF endpoint config - https://phabricator.wikimedia.org/T352111 (bking) Open→In progress p:Triage→Medium a:bking
[23:15:59] Data-Engineering, DC-Ops, SRE, ops-eqiad: Q2:rack/setup/install an-worker11[57-75] - https://phabricator.wikimedia.org/T349936 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1001 for host an-worker1158.eqiad.wmnet with OS bullseye completed: - an-worker1158 (**WA...
[23:25:45] Data-Engineering, DC-Ops, SRE, ops-eqiad: Q2:rack/setup/install an-worker11[57-75] - https://phabricator.wikimedia.org/T349936 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host an-worker1159.eqiad.wmnet with OS bullseye
[23:38:26] Data-Platform-SRE, DC-Ops, SRE, ops-codfw, Patch-For-Review: Q2:rack/setup/install elastic2087-2091 - https://phabricator.wikimedia.org/T349778 (Papaul)
[23:42:53] Data-Engineering, DC-Ops, SRE, ops-eqiad: Q2:rack/setup/install an-worker11[57-75] - https://phabricator.wikimedia.org/T349936 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host an-worker1160.eqiad.wmnet with OS bullseye
[23:48:27] Data-Engineering, DC-Ops, SRE, ops-eqiad: Q2:rack/setup/install an-worker11[57-75] - https://phabricator.wikimedia.org/T349936 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1001 for host an-worker1161.eqiad.wmnet with OS bullseye