[00:16:25] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:16:25] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:16:25] RESOLVED: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:28:55] netops, Infrastructure-Foundations, ops-magru: Jan 2025 - Magru core router connectivity blips - https://phabricator.wikimedia.org/T384774#10499926 (cmooney) Everything remains stable since the upgrade/reset of the routers yesterday. All protocol adjacencies, interfaces etc look good as are the gene...
[10:34:27] netops, Infrastructure-Foundations, ops-magru: Jan 2025 - Magru core router connectivity blips - https://phabricator.wikimedia.org/T384774#10499947 (Vgutierrez) thanks @cmooney, I'll re-pool the site
[11:32:17] I noticed that we have some misconfiguration of puppet7's component sources in apt at least on cumin2002, is that known?
[11:32:50] repository_puppet.list and repository_ruby-sys-filesystem.list are both defining it
[12:22:22] netops, Infrastructure-Foundations, SRE: Productionize gnmic network telemetry pipeline - https://phabricator.wikimedia.org/T369384#10500289 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=892c37cf-859a-4da6-8f59-c75b5d153219) set by cmooney@cumin1002 for 3:00:00 on 1 host(s) and th...
[12:40:42] netops, Infrastructure-Foundations, SRE: Routinator 0.14 causing tempfs file system to fill up - https://phabricator.wikimedia.org/T383116#10500343 (MoritzMuehlenhoff) Open→Resolved After running 0.14.1 for five days, we can confirm this fixed, disk usage of /var/lib/routinator/repository...
[12:47:14] volans: hmmh, that might be caused by the facter update (it needed ruby-sys-filesystem to unbreak some functionality), I'll have a look in a bit
[12:51:12] netops, Infrastructure-Foundations, SRE: Productionize gnmic network telemetry pipeline - https://phabricator.wikimedia.org/T369384#10500385 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=7b04d5bf-ab80-4626-96ba-3c376dfc52c2) set by cmooney@cumin1002 for 3:00:00 on 1 host(s) and th...
[14:14:23] ack, thanks, lmk if I can help
[14:39:50] FYI, I'll temporarily switch aux-k8s-etcd2003 to DRBD to shuffle it to a new virt node
[14:40:44] k
[15:14:12] and it's back to non-DRBD storage
[16:05:20] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Productionize gnmic network telemetry pipeline - https://phabricator.wikimedia.org/T369384#10501179 (cmooney) I'm very happy to say Karim Radhouani, one of the gnmic devs, has been extremely helpful in response to the github issue I poste...
[17:05:56] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Productionize gnmic network telemetry pipeline - https://phabricator.wikimedia.org/T369384#10501393 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=e5ab529a-1fb4-461d-b85a-a2d5a66a020a) set by cmooney@cumin1002 for 1:00:...
[23:52:13] netops, Infrastructure-Foundations, observability, SRE: LibreNMS reporting no routes learnt from doh/durum Anycast peers at various POPs - https://phabricator.wikimedia.org/T384258#10502663 (cmooney) I was able to run a manual poller command with the updated 'lmns' command and it shows errors pro...