[01:10:35] PROBLEM - MariaDB sustained replica lag on m1 on db1117 is CRITICAL: 21.8 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1117&var-port=13321 [01:12:21] RECOVERY - MariaDB sustained replica lag on m1 on db1117 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1117&var-port=13321 [08:07:32] (MysqlReplicationLag) firing: MySQL instance db2181:9104 has too large replication lag (6d 8h 37m 34s) - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&refresh=1m&var-job=All&var-server=db2181&var-port=9104 - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationLag [08:11:55] ^ me [08:59:26] Can I get a review of https://gerrit.wikimedia.org/r/c/operations/puppet/+/887727/ during the day? [08:59:51] what's the change with the other one? [09:00:02] Nah, noticiations on both hosts [09:00:40] The important thing is basically db1159 with master role and the dbproxy ips [09:31:47] Re:Xionox's message, I had a look and I saw some packets dropped on some db hosts- confd, mediawiki cross dc, and killed connections (the last ones are unavoidable, but the first 2 may need more reasearch) [12:07:32] (MysqlReplicationLag) firing: MySQL instance db2181:9104 has too large replication lag (4d 22h 6m 59s) - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&refresh=1m&var-job=All&var-server=db2181&var-port=9104 - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationLag [12:10:05] ^ known [14:51:59] jynus: is there a ticket for it? It seems 🐟 [14:54:38] I think is the same issue being discussed in other channels [14:54:49] although there could be some other stuff too [16:07:32] (MysqlReplicationLag) firing: MySQL instance db2181:9104 has too large replication lag (3d 11h 25m 41s) - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&refresh=1m&var-job=All&var-server=db2181&var-port=9104 - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationLag [20:07:32] (MysqlReplicationLag) firing: MySQL instance db2181:9104 has too large replication lag (2d 4h 20m 48s) - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&refresh=1m&var-job=All&var-server=db2181&var-port=9104 - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationLag