[02:33:39] (SystemdUnitFailed) firing: netbox_ganeti_esams_sync.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:33:39] (SystemdUnitFailed) firing: netbox_ganeti_esams_sync.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:37:15] netops, Infrastructure-Foundations, SRE: cr1-esams listing transport ospf interfaces multiple times - https://phabricator.wikimedia.org/T344546 (cmooney) p:Triage→Medium
[08:38:48] netops, Infrastructure-Foundations, SRE: cr1-esams listing transport ospf interfaces multiple times - https://phabricator.wikimedia.org/T344546 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=9daa4455-f5dd-4a7a-8558-107a3e4842a2) set by cmooney@cumin1001 for 2:00:00 on 29 host(s) an...
[09:16:35] netops, Infrastructure-Foundations, SRE: cr1-esams listing transport ospf interfaces multiple times - https://phabricator.wikimedia.org/T344546 (cmooney) Open→Resolved @ayounsi advised to try running a "commit full" on the router rather than a reboot. Apparently this command forces a full co...
[09:33:48] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (cmooney) p:Triage→Low
[09:38:59] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (cmooney)
[09:40:05] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (cmooney)
[09:42:36] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (cmooney) In terms of Pybal or Anycast routes etc. we can possibly not send those. If the overall approach seems good we can weigh up exactly what is useful to s...
[09:51:30] netops, Infrastructure-Foundations, SRE: Announce internal/core routes from CRs to L3 switches - https://phabricator.wikimedia.org/T344547 (cmooney) This should also result in more optimal routing as switches will send traffic to the CR with the best path to the remote internal destinations. i.e. if...
[10:33:39] (SystemdUnitFailed) firing: netbox_ganeti_esams_sync.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:33:39] (SystemdUnitFailed) firing: netbox_ganeti_esams_sync.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:33:39] (SystemdUnitFailed) firing: netbox_ganeti_esams_sync.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:33:39] (SystemdUnitFailed) firing: netbox_ganeti_esams_sync.service Failed on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed