[01:13:10] (SystemdUnitFailed) firing: generate_os_reports.service Failed on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:02:38] (SystemdUnitFailed) firing: (2) httpbb_kubernetes_mw-api-int_hourly.service Failed on cumin1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:02:37] (SystemdUnitFailed) firing: (2) httpbb_kubernetes_mw-api-int_hourly.service Failed on cumin1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:03:10] (SystemdUnitFailed) firing: generate_os_reports.service Failed on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:07:47] SRE-tools, Infrastructure-Foundations, Prod-Kubernetes, SRE, and 2 others: Create Spicerack cookbook to drain/reboot/uncordon a Kubernetes worker - https://phabricator.wikimedia.org/T212866 (JMeybohm)
[09:07:55] SRE-tools, Infrastructure-Foundations, Prod-Kubernetes, SRE, and 4 others: Create a cookbook to perform a rolling reboot of a kubernetes cluster - https://phabricator.wikimedia.org/T260661 (JMeybohm)
[11:03:44] (SystemdUnitFailed) firing: generate_os_reports.service Failed on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:17:53] CAS-SSO, Data-Platform-SRE, Infrastructure-Foundations: Switch DataHub authentication to OIDC - https://phabricator.wikimedia.org/T305874 (Stevemunene) Open→Resolved Closing this task as resolved as we did meet our acceptance criteria and tracking the login/logout user experience on T347149...
[11:18:03] CAS-SSO, Infrastructure-Foundations, SRE: Enable OIDC in CAS - https://phabricator.wikimedia.org/T311999 (Stevemunene)
[11:18:09] CAS-SSO, Infrastructure-Foundations, SRE: Upgrade IDPs to CAS 6.6/Bullseye and enable webauthn - https://phabricator.wikimedia.org/T305518 (Stevemunene)
[13:25:32] moritzm: fyi https://github.com/NLnetLabs/routinator/issues/880#issuecomment-1731403550
[13:31:13] netops, Infrastructure-Foundations, SRE, cloud-services-team: Configure cloudsw1-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (cmooney) In progress→Resolved
[15:03:44] (SystemdUnitFailed) firing: generate_os_reports.service Failed on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:18:44] (SystemdUnitFailed) firing: (6) generate_os_reports.service Failed on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:23:48] (SystemdUnitFailed) firing: (6) generate_os_reports.service Failed on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:48:43] netops, Infrastructure-Foundations, SRE, ops-codfw: Bring codfw row A-B EVPN switches live and make them gateway for existing Vlans - https://phabricator.wikimedia.org/T347191 (cmooney) p:Triage→Medium
[19:24:06] (SystemdUnitFailed) firing: generate_os_reports.service Failed on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:28:44] (SystemdUnitFailed) firing: generate_os_reports.service Failed on cumin2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed