[16:50:31] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate hosts from codfw row A/B ASW to new LSW devices - https://phabricator.wikimedia.org/T355544 (cmooney)
[16:55:25] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate hosts from codfw row A/B ASW to new LSW devices - https://phabricator.wikimedia.org/T355544 (cmooney)
[18:03:33] netops, Data-Persistence, Data-Persistence-Backup, Infrastructure-Foundations, and 3 others: Migrate servers in codfw rack B4 from asw-b4-codfw to lsw1-b4-codfw - https://phabricator.wikimedia.org/T355860 (Jhancock.wm) rack is physically prepped for tomorrow.
[18:52:35] netops, Ganeti, Infrastructure-Foundations, SRE, Patch-For-Review: Investigate Ganeti in routed mode - https://phabricator.wikimedia.org/T300152 (bking) @ayounsi Apologies for the trouble, I didn't realize `sretest2005` was in active use. Unfortunately, I reimaged it while I was working on T3...
[19:15:19] Traffic, Patch-For-Review: sre.dns.roll-restart-reboot-wikimedia-dns cookbook sometimes cannot remove downtime - https://phabricator.wikimedia.org/T353779 (BCornwall) In progress→Resolved
[19:46:25] (SystemdUnitFailed) firing: anycast-healthchecker.service Failed on durum2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:47:07] ^ expected, though I think we should not alert on these ideally
[19:47:11] will check
[19:51:25] (SystemdUnitFailed) resolved: (2) anycast-healthchecker.service Failed on durum2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:11:55] (SystemdUnitFailed) firing: (7) anycast-healthchecker.service Failed on durum1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:12:06] all good
[20:16:40] (SystemdUnitFailed) resolved: (6) anycast-healthchecker.service Failed on durum1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:26:55] (SystemdUnitFailed) firing: (7) anycast-healthchecker.service Failed on durum3003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:31:40] (SystemdUnitFailed) resolved: (6) anycast-healthchecker.service Failed on durum3003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[21:54:13] Traffic, Infrastructure-Foundations, vm-requests: eqiad: 1 VM request for ncmonitor - https://phabricator.wikimedia.org/T356710 (BCornwall) Open→In progress a:BCornwall
[21:54:32] Traffic, Infrastructure-Foundations, vm-requests: eqiad: 1 VM request for ncmonitor - https://phabricator.wikimedia.org/T356710 (BCornwall) p:Triage→Low
[22:01:18] Traffic, Infrastructure-Foundations, vm-requests: eqiad: 1 VM request for ncmonitor - https://phabricator.wikimedia.org/T356710 (BCornwall)