[07:58:12] msbe1068-1071 just went down
[07:58:33] i know it's Easter so you're probably not watching so leaving a message here in case it gets lost
[08:13:57] They're not on prod, cf T299462
[08:13:57] T299462: Q3:(Need By: TBD) rack/setup/install ms-be10[68-71] - https://phabricator.wikimedia.org/T299462
[08:19:18] Thanks Emperor and it's good Friday so please enjoy your day off
[08:19:47] I more didn't want it drowned in icinga whining for 4 days about random stuff
[11:18:56] PROBLEM - MariaDB sustained replica lag on m1 on db2078 is CRITICAL: 22 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2078&var-port=13321
[11:19:14] PROBLEM - MariaDB sustained replica lag on m1 on db1117 is CRITICAL: 54.6 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1117&var-port=13321
[11:23:28] RECOVERY - MariaDB sustained replica lag on m1 on db2078 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2078&var-port=13321
[11:23:44] RECOVERY - MariaDB sustained replica lag on m1 on db1117 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1117&var-port=13321
[15:19:02] PROBLEM - Check unit status of swift_ring_manager on ms-fe2009 is CRITICAL: CRITICAL: Status of the systemd unit swift_ring_manager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[16:15:18] RECOVERY - Check unit status of swift_ring_manager on ms-fe2009 is OK: OK: Status of the systemd unit swift_ring_manager https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[19:22:33] that was a DNS resolution error, FWIW