[02:34:40] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:34:40] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:28:52] CAS-SSO, Infrastructure-Foundations, Patch-For-Review: Provide an official Docker image for CAS-SSO - https://phabricator.wikimedia.org/T412826#11540415 (SLyngshede-WMF) Yes and no. We need to change the base image to the WMF Java image, which is only available as AMD64, which means that it's pretty...
[10:34:40] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:51:01] Mail, Infrastructure-Foundations, ServiceOps new, SRE: Sendmail network error (deployment) - https://phabricator.wikimedia.org/T407723#11541239 (Blake)
[14:34:40] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:40:30] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11541938 (ops-monitoring-bot) Roll-reboot of nodes in dse-eqiad cluster started by btullis: * dse-k8...
[14:40:45] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11541963 (ops-monitoring-bot) Roll-reboot of nodes in dse-eqiad cluster started by btullis: * dse-k8...
[14:41:04] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11541988 (ops-monitoring-bot) Roll-reboot of nodes in dse-eqiad cluster started by btullis: * dse-k8...
[15:07:53] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11542145 (ops-monitoring-bot) Host dse-k8s-worker1008.eqiad.wmnet rebooted by btullis@cumin1003 with...
[15:18:18] SRE-tools, DNS, Infrastructure-Foundations, Traffic, Patch-Needs-Improvement: DNS repo: add Jenkins job to ensure there are no duplicates - https://phabricator.wikimedia.org/T155761#11542183 (BCornwall) Stalled→Resolved a:BCornwall I believe this has been solved with the lates...
[15:26:34] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11542216 (ops-monitoring-bot) Roll-reboot of nodes in dse-eqiad cluster started by btullis: * dse-k8...
[17:22:20] FIRING: [2x] PfwCoreBGPDown: Fundraising Firewall core BGP session down between pfw1-codfw and (null) (10.195.0.248) - group VPN - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DPfwCoreBGPDown
[17:47:11] RESOLVED: [2x] PfwCoreBGPDown: Fundraising Firewall core BGP session down between pfw1-codfw and (null) (10.195.0.248) - group VPN - https://wikitech.wikimedia.org/wiki/Network_monitoring#BGP_status - https://alerts.wikimedia.org/?q=alertname%3DPfwCoreBGPDown
[18:34:40] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:54:28] SRE-tools, Infrastructure-Foundations, Spicerack, Patch-For-Review: Support listing pooled / active authdns hosts (rather than all) - https://phabricator.wikimedia.org/T375014#11543186 (Scott_French)
[22:34:40] FIRING: SystemdUnitFailed: dump_cloud_ip_ranges.service on puppetserver2004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed