[04:55:25] FIRING: SystemdUnitFailed: wmf_auto_restart_prometheus-nginx-exporter.service on urldownloader1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:10:25] FIRING: [2x] SystemdUnitFailed: dump_proxy_ranges.service on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:30:25] FIRING: [3x] SystemdUnitFailed: dump_proxy_ranges.service on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:48:29] netops, Infrastructure-Foundations: eqiad: pod EF switches upgrade (2026) - https://phabricator.wikimedia.org/T422107 (ayounsi) NEW p:Triage→Low
[07:48:52] netops, Infrastructure-Foundations: eqiad: pod EF switches upgrade (2026) - https://phabricator.wikimedia.org/T422107#11780960 (ayounsi)
[07:48:54] netops, DC-Ops, Infrastructure-Foundations, SRE, Sustainability (Incident Followup): ssw1-f1-eqiad: Fan Spinning Upgraded - https://phabricator.wikimedia.org/T400783#11780961 (ayounsi)
[07:49:11] netops, Infrastructure-Foundations: eqiad: pod EF switches upgrade (2026) - https://phabricator.wikimedia.org/T422107#11780962 (ayounsi)
[07:49:12] netops, Infrastructure-Foundations, Traffic: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444#11780964 (ayounsi)
[07:49:38] netops, Infrastructure-Foundations, Traffic: 2026 Junos upgrade - https://phabricator.wikimedia.org/T416444#11780970 (ayounsi)
[07:49:40] netops, Infrastructure-Foundations, Traffic: Upgrade End Of Support Junos - https://phabricator.wikimedia.org/T390813#11780971 (ayounsi)
[08:06:21] netops, Infrastructure-Foundations: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11781010 (ayounsi) a:ayounsi Scheduling this for April 7th at 12:00 UTC - 2h Pinging @ssingh (#traffic) for visibility.
[08:40:36] netops, Infrastructure-Foundations: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11781122 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=7822bc9e-76f4-4f51-943c-ae5d8f2f7739) set by ayounsi@cumin1003 for 0:30:00 on 4 host(s) and their services wi...
[08:53:03] netops, Infrastructure-Foundations: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11781139 (ayounsi)
[09:08:58] netops, Infrastructure-Foundations, Traffic: Upgrade End Of Support Junos - https://phabricator.wikimedia.org/T390813#11781179 (ayounsi)
[09:21:53] netops, Infrastructure-Foundations: codfw: upgrade routers (2026) - https://phabricator.wikimedia.org/T417871#11781218 (ayounsi) a:Papaul Now that we did the switchover, we could focus more on that upgrade. @papaul let me know if you're ok to take care of it.
[09:25:55] netbox, netops, DNS, Infrastructure-Foundations: Missing includes in DNS repo from Netbox-generated snippets - https://phabricator.wikimedia.org/T422115 (Volans) NEW
[09:27:27] netbox, netops, DNS, Infrastructure-Foundations, Patch-For-Review: Missing includes in DNS repo from Netbox-generated snippets - https://phabricator.wikimedia.org/T422115#11781242 (Volans)
[09:45:33] netops, Infrastructure-Foundations: eqiad: upgrade routers (2026) - https://phabricator.wikimedia.org/T417873#11781311 (cmooney) a:cmooney
[10:24:44] SRE-tools, Infrastructure-Foundations, ServiceOps new, SRE, and 2 others: Support locking cookbooks run except for switchover related cookbooks - https://phabricator.wikimedia.org/T330997#11781519 (Volans) Given this has been moved to the backlog I'll leave here a comment for our future selves: i...
[10:30:40] FIRING: [3x] SystemdUnitFailed: dump_proxy_ranges.service on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:05:25] FIRING: [3x] SystemdUnitFailed: dump_proxy_ranges.service on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:25:28] netops, Infrastructure-Foundations: Create public vlan on eqiad and codfw pods E/F - https://phabricator.wikimedia.org/T422043#11782147 (ayounsi)
[12:27:44] netbox, netops, DNS, Infrastructure-Foundations, and 3 others: Missing includes in DNS repo from Netbox-generated snippets - https://phabricator.wikimedia.org/T422115#11782158 (Volans) p:Triage→Medium I've merged and release the fix, do you want to keep the task open to implement some for...
[12:37:05] netops, Infrastructure-Foundations: Create public vlan on eqiad and codfw pods E/F - https://phabricator.wikimedia.org/T422043#11782177 (ayounsi) My initial thought was to start with E/F only but you're right better plan it fully here, especially the IP allocations.
[12:41:42] netops, Infrastructure-Foundations: Create public vlans in eqiad and codfw - https://phabricator.wikimedia.org/T422043#11782185 (ayounsi)
[13:23:50] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad: Standardize management routers interfaces - https://phabricator.wikimedia.org/T421674#11782358 (Jclark-ctr)
[14:36:28] SRE-tools, Cumin, Infrastructure-Foundations: Add proxy support to cumin openstack backend - https://phabricator.wikimedia.org/T420360#11782751 (Volans) Open→Resolved The cloudcumin hosts are now using the webproxies to connect to the openstack APIs and the firewall rule has been reverted...
[14:56:48] netops, Infrastructure-Foundations: mr1-eqiad: move from OSPF to BGP - https://phabricator.wikimedia.org/T421238#11782844 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=65cfdda7-c7c9-47d4-b073-5892d3f0a271) set by pt1979@cumin2002 for 1:00:00 on 2 host(s) and their services with reason...
[15:27:59] netops, Infrastructure-Foundations, Patch-For-Review: mr1-eqiad: move from OSPF to BGP - https://phabricator.wikimedia.org/T421238#11783004 (Papaul) Open→Resolved BGP is up and OSPF removed
[15:57:15] netops, Infrastructure-Foundations: Create public vlans in eqiad and codfw - https://phabricator.wikimedia.org/T422043#11783127 (cmooney) Is it maybe an idea to re-use some of the existing vlans? Like if we assign rack A1 as the public rack for the A/B POD we could add all the hosts to //public1-a-eqiad...
[16:05:40] FIRING: [2x] SystemdUnitFailed: wmf_auto_restart_prometheus-nginx-exporter.service on urldownloader1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[17:48:51] netops, Infrastructure-Foundations: esams: upgrade routers & switches (2026) - https://phabricator.wikimedia.org/T416450#11783830 (ssingh) >>! In T416450#11781010, @ayounsi wrote: > Scheduling this for April 7th at 12:00 UTC - 2h > Pinging @ssingh (#traffic) for visibility. > > And doing mr1-esams now t...
[17:50:11] CAS-SSO, Infrastructure-Foundations: CAS navbar overflows on iOS Safari - https://phabricator.wikimedia.org/T422203 (Sportzpikachu) NEW
[17:53:04] CAS-SSO, Infrastructure-Foundations: CAS navbar overflows on iOS Safari - https://phabricator.wikimedia.org/T422203#11783871 (Sportzpikachu)
[17:54:07] netbox, netops, DNS, Infrastructure-Foundations, and 2 others: Missing includes in DNS repo from Netbox-generated snippets - https://phabricator.wikimedia.org/T422115#11783873 (ssingh) Thanks for fixing it but I agree that we need an alert for this otherwise we will miss this again.
[18:03:08] CAS-SSO, Infrastructure-Foundations: CAS navbar overflows on iOS Safari - https://phabricator.wikimedia.org/T422203#11783896 (Sportzpikachu) (sort of) reproducible with Firefox dev tools: {F74843135} Settings: iPhone 16e (319x844, DPR: 3), UA as above.
[18:12:59] CAS-SSO, Infrastructure-Foundations: CAS login page overflows on iOS Safari (iPhone 16e) - https://phabricator.wikimedia.org/T422203#11783917 (Sportzpikachu)
[20:05:40] FIRING: [2x] SystemdUnitFailed: wmf_auto_restart_prometheus-nginx-exporter.service on urldownloader1004:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed