[04:57:25] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:16:48] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo: ULSFO: switch refresh - https://phabricator.wikimedia.org/T408510 (Papaul) NEW
[06:43:12] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo: ULSFO:Switch refresh diagram - https://phabricator.wikimedia.org/T408511 (Papaul) NEW
[06:43:42] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: switch refresh - https://phabricator.wikimedia.org/T408510#11317386 (Papaul) p:Triage→Medium
[06:43:54] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo: ULSFO:Switch refresh diagram - https://phabricator.wikimedia.org/T408511#11317387 (Papaul) p:Triage→Medium
[07:56:00] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO:Switch refresh diagram - https://phabricator.wikimedia.org/T408511#11317482 (cmooney) @papaul looks good! Nothing jumping out at me as problematic in terms of the connectivity plan. I don't think it makes sense to use 40G tho...
[08:57:40] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:14:25] netops, Infrastructure-Foundations, sre-alert-triage: Alert in need of triage: PeeringBGPDown (instance cr3-eqsin:9804) - https://phabricator.wikimedia.org/T407833#11318022 (cmooney) Open→Resolved I removed these additional sessions last week but got distracted and didn't come back to edi...
[10:57:25] RESOLVED: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:39:42] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO:Switch refresh diagram - https://phabricator.wikimedia.org/T408511#11318700 (Papaul) @cmooney thanks for the feedback, I will upgrade the diagram to match the 100G links between the core routers and the switches and the type of...
[14:09:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[14:30:58] netops, Infrastructure-Foundations, Observability-Alerting, SRE: Nokia OSPF alerts not working - https://phabricator.wikimedia.org/T408378#11318918 (tappof) I saw the alerts on the ALERTS metric: https://w.wiki/FqSi . I think there was a silence rule in place, so you didn't get any notifications....
[14:46:59] netops, Infrastructure-Foundations, Observability-Alerting, SRE: Nokia OSPF alerts not working - https://phabricator.wikimedia.org/T408378#11319051 (cmooney) >>! In T408378#11318918, @tappof wrote: > I saw the alerts on the ALERTS metric: https://w.wiki/FqSi . Ok thanks for that! That is a good...
[14:54:23] CAS-SSO, Gerrit, Infrastructure-Foundations: Use IDP for authentication in Gerrit - https://phabricator.wikimedia.org/T147864#11319099 (hashar) Open→Stalled
[16:06:52] netops, DC-Ops, Infrastructure-Foundations, ops-magru, and 2 others: MAGRU power maint - CHG0262056 - October 29-30, 2025 - https://phabricator.wikimedia.org/T408589#11319592 (RobH) @netops & #traffic: I don't expect any impact from this according to the notification but just FYI!
[17:04:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[17:25:47] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11320304 (RobH) [[ https://docs.google.com/spreadsheets/d/13ow4JxrsQdz8KSsdBBNwvlrAuGKo8OHWcnR4RhXTYc0/edit?usp=sharing | Google Sheet listing of all affect...
[17:38:27] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11320374 (RobH)
[17:38:46] netops, DC-Ops, Infrastructure-Foundations, ops-eqiad, SRE: eqiad: rows C/D Upgrade Tracking - https://phabricator.wikimedia.org/T404609#11320386 (RobH)
[20:08:23] o/, there's a sad person in #wikimedia-tech reporting some kind of VRT email bomb issue: "VRTS is getting a lot of bounces, perhaps self induced". They'll create a ticket at some point, but were wondering if it might be a good idea for somebody to look at the mail server before it gets worse, so I'm typing in here because not everyone here is in that channel.
[20:09:28] (unfortunately I wasn't able to get more detail out of them, hopefully that'll come in the eventual ticket...)
[20:10:52] oh okay, that actually looks bad, https://phabricator.wikimedia.org/T408632
[20:11:15] cc jhathaway ^
[20:11:53] thanks Raine
[20:43:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[22:45:41] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[22:51:17] that netbox alert is ... RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag
[23:45:41] RESOLVED: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting
[23:48:57] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO:Switch refresh diagram - https://phabricator.wikimedia.org/T408511#11321730 (Papaul)