[08:28:36] (SystemdUnitFailed) firing: clean-confd-rundir.service Failed on ganeti3006:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:33:36] (SystemdUnitFailed) resolved: clean-confd-rundir.service Failed on ganeti3006:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:03:36] (SystemdUnitFailed) firing: user-runtime-dir@23938.service Failed on config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:09:29] (SystemdUnitFailed) firing: (2) user-runtime-dir@23938.service Failed on config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:10:05] SRE-tools, Spicerack: gNMI module in Spicerack - https://phabricator.wikimedia.org/T344325 (ayounsi)
[09:10:46] SRE-tools, Infrastructure-Foundations, Spicerack: gNMI module in Spicerack - https://phabricator.wikimedia.org/T344325 (ayounsi)
[09:10:52] SRE-tools, Infrastructure-Foundations: Package pyGNMI and dictdiffer to be used by cookbooks - https://phabricator.wikimedia.org/T340045 (ayounsi)
[09:11:07] SRE-tools, Infrastructure-Foundations, Spicerack: gNMI module in Spicerack - https://phabricator.wikimedia.org/T344325 (ayounsi)
[09:11:13] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Add Dell switches support to Homer/Cookbooks - https://phabricator.wikimedia.org/T320638 (ayounsi)
[09:11:29] SRE-tools, Infrastructure-Foundations, Spicerack: gNMI module in Spicerack - https://phabricator.wikimedia.org/T344325 (ayounsi)
[09:19:29] (SystemdUnitFailed) firing: (2) user-runtime-dir@23938.service Failed on config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:27:43] SRE-tools, Spicerack: Junos module in Spicerack - https://phabricator.wikimedia.org/T344326 (ayounsi) p:Triage→Low
[09:28:36] (SystemdUnitFailed) firing: (3) user-runtime-dir@23938.service Failed on config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:38:36] (SystemdUnitFailed) resolved: user-runtime-dir@23938.service Failed on config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:51:42] CAS-SSO, Infrastructure-Foundations, SRE: Enable OIDC in CAS - https://phabricator.wikimedia.org/T311999 (Jelto)
[09:52:31] CAS-SSO, Infrastructure-Foundations, SRE, collaboration-services, and 4 others: migrate gitlab away from the CAS protocol - https://phabricator.wikimedia.org/T320390 (Jelto) Open→Resolved Cleanup of puppet code is done and most cas references are removed. I'm not sure how to move forward...
[11:15:26] SRE-tools, Infrastructure-Foundations: Package pyGNMI and dictdiffer to be used by cookbooks - https://phabricator.wikimedia.org/T340045 (MoritzMuehlenhoff) I've uploaded dictdiffer for Bulleye and Bookworm (since we're likely about to move the Cumin servers to Bookworm in the not too distant future) to...
[11:23:42] (SystemdUnitCrashLoop) firing: prometheus-ganeti-exporter.service crashloop on ganeti3008:9100 - TODO - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitCrashLoop
[11:53:42] (SystemdUnitCrashLoop) firing: (2) prometheus-ganeti-exporter.service crashloop on ganeti3006:9100 - TODO - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitCrashLoop
[12:28:42] (SystemdUnitCrashLoop) resolved: (2) prometheus-ganeti-exporter.service crashloop on ganeti3006:9100 - TODO - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitCrashLoop
[13:24:26] SRE-tools, Infrastructure-Foundations: Add warning when provision cookbook is ran without the virtualization flag on hypervisors - https://phabricator.wikimedia.org/T344342 (ayounsi)
[13:26:14] SRE-tools, Infrastructure-Foundations: Add warning when provision cookbook is ran without the virtualization flag on hypervisors - https://phabricator.wikimedia.org/T344342 (cmooney) ganeti* and cloudvirt* for sure it'd make sense to have this for
[13:43:36] (SystemdUnitFailed) firing: apache2.service Failed on config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:36:49] jbond: there's a puppet-merge by you which seems stalled? the process is over 30 mins old
[14:38:27] moritzm: sorry merged now
[14:39:33] ack
[14:43:36] (SystemdUnitFailed) resolved: apache2.service Failed on config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed