[02:57:01] PROBLEM - MariaDB sustained replica lag on s4 on db1248 is CRITICAL: 15.2 ge 10 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1248&var-port=9104
[03:00:01] RECOVERY - MariaDB sustained replica lag on s4 on db1248 is OK: (C)10 ge (W)5 ge 0.6 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1248&var-port=9104
[05:02:33] I am going to switch s5 codfw mater
[05:02:34] master
[07:19:24] claime: short answer is "I don't know", sorry; that cookbook was work originally started by j.bond IIRC; I think the theory was that mucking around with the disks under a mounted filesystem might upset something
[07:21:27] FIRING: SystemdUnitFailed: swift_rclone_sync.service on ms-be1069:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:22:41] the moss systemd failure is a disk gone in moss-be2002, I'll try and get to that today
[07:26:27] RESOLVED: SystemdUnitFailed: swift_rclone_sync.service on ms-be1069:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:46:39] * volans here
[08:48:29] ciao volans
[08:49:18] :)
[09:02:55] I'll be triggering an upgrade backport (idempotent and tested) on some of your hosts jynus ( T367278 )
[09:02:55] T367278: Migrate mysql icinga alerts to alert manager - pt-heartbeat + scaffolding - https://phabricator.wikimedia.org/T367278
[09:04:04] (if you see no objection to it, ofc)
[09:04:52] Not mine for the next 2 months
[09:05:06] ack :)
[09:05:59] loool
[11:36:05] * volans lunch
[15:36:19] PROBLEM - MariaDB sustained replica lag on s4 on db1248 is CRITICAL: 13.6 ge 10 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1248&var-port=9104
[15:37:19] RECOVERY - MariaDB sustained replica lag on s4 on db1248 is OK: (C)10 ge (W)5 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1248&var-port=9104
[16:01:43] sorry, forgot to mention in team meeting - this Friday, I'm in London (UK), meeting Maryana and other staff and volunteers