[07:49:40] FIRING: SystemdUnitFailed: pt-heartbeat-wikimedia.service on db2230:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:54:40] RESOLVED: SystemdUnitFailed: pt-heartbeat-wikimedia.service on db2230:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:53:39] PROBLEM - MariaDB sustained replica lag on s1 on db1207 is CRITICAL: 10.2 ge 10 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1207&var-port=9104
[15:54:39] RECOVERY - MariaDB sustained replica lag on s1 on db1207 is OK: (C)10 ge (W)5 ge 0.4 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1207&var-port=9104
[22:17:19] PROBLEM - MariaDB sustained replica lag on s3 on db2209 is CRITICAL: 44 ge 10 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2209&var-port=9104
[22:20:17] PROBLEM - MariaDB sustained replica lag on s3 on db2205 is CRITICAL: 23 ge 10 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2205&var-port=9104
[22:20:19] PROBLEM - MariaDB sustained replica lag on s3 on db2190 is CRITICAL: 13 ge 10 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2190&var-port=9104
[22:20:37] PROBLEM - MariaDB sustained replica lag on s3 on db2177 is CRITICAL: 10.8 ge 10 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2177&var-port=9104
[22:21:37] RECOVERY - MariaDB sustained replica lag on s3 on db2177 is OK: (C)10 ge (W)5 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2177&var-port=9104
[22:22:17] PROBLEM - MariaDB sustained replica lag on s3 on db2194 is CRITICAL: 33.8 ge 10 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2194&var-port=9104
[22:22:24] sorry
[22:22:36] that was my script certainly, aborted it
[22:24:17] RECOVERY - MariaDB sustained replica lag on s3 on db2205 is OK: (C)10 ge (W)5 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2205&var-port=9104
[22:25:19] RECOVERY - MariaDB sustained replica lag on s3 on db2194 is OK: (C)10 ge (W)5 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2194&var-port=9104
[22:26:19] RECOVERY - MariaDB sustained replica lag on s3 on db2190 is OK: (C)10 ge (W)5 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2190&var-port=9104
[22:28:19] RECOVERY - MariaDB sustained replica lag on s3 on db2209 is OK: (C)10 ge (W)5 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2209&var-port=9104
[22:30:35] running again with higher sleep