[01:08:03] PROBLEM - MariaDB sustained replica lag on m1 on db1217 is CRITICAL: 60.8 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1217&var-port=13321
[01:08:11] PROBLEM - MariaDB sustained replica lag on m1 on db2132 is CRITICAL: 13.6 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2132&var-port=9104
[01:10:57] RECOVERY - MariaDB sustained replica lag on m1 on db2132 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2132&var-port=9104
[01:12:11] RECOVERY - MariaDB sustained replica lag on m1 on db1217 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1217&var-port=13321
[01:29:40] *14
[06:22:16] jynus: when could I stop m1 eqiad backup source?
[06:22:21] it is db1217
[07:22:54] any time
[07:23:29] great! thanks!
[07:27:32] wow librenms -rw-rw---- 1 mysql mysql 254G Oct 20 07:24 syslog.ibd
[07:27:39] * marostegui creates task
[07:27:48] yes, I mentioned this to the obs team
[07:27:54] oops
[07:28:01] jynus: did you create a task?
[07:28:16] that it didn't make much sense to store logs in mysql
[07:28:28] they told me they would have a look at it
[07:28:36] jynus: I will create it then
[07:28:45] Otherwise it will be forgotten
[07:31:34] https://phabricator.wikimedia.org/T349362
[07:34:54] let me know when you are done with maintenance so I can do some stuff for T349360
[07:34:54] T349360: Clean up dbbackups.backup_files table - https://phabricator.wikimedia.org/T349360
[07:35:20] will do
[07:35:31] it will take around 1h I think
[07:35:52] np, I may take some time to go do some things outside home
[08:41:17] jynus: db1217:m1 is up again
[09:50:09] thanks, with your permission I will put them back into maintenance mode to be able to delete/move/compress rows (causing temporary lag) and speed up the archiving?
[09:52:08] sure
[10:39:08] hi guys, mostly back today. hope all's well? let me know if something requires my immediate attention. thanks!
[11:07:46] PROBLEM - MariaDB sustained replica lag on s3 on db1166 is CRITICAL: 105.8 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1166&var-port=9104
[11:08:43] that's not me
[11:09:37] it has recovered
[11:09:58] there is some high traffic ongoing on s3, unsure if maintenance or something else?
[11:10:52] db1166 is dumps
[11:11:10] :-( that was almost 4 minutes of lag
[11:11:19] 517142834 | wikiadmin2023 | 10.64.16.147:39410 | hrwiki | Query | 0 | Sending data | SELECT /*!40001 SQL_NO_CACHE */ `
[11:11:24] so, yes, dumps running
[11:13:20] RECOVERY - MariaDB sustained replica lag on s3 on db1166 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1166&var-port=9104
[12:19:57] the m1 lag is me, it is downtimed and logged, but mentioning it in case you saw it on orchestrator
[14:27:24] m1 should be back healthy now
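Editor's note (not part of the original log): a minimal SQL sketch of the kinds of checks discussed above, assuming direct client access to the replicas. The librenms.syslog table name is inferred from the syslog.ibd file mentioned at 07:27, and the wikiadmin2023 user comes from the processlist row pasted at 11:11; everything else is standard MariaDB.

-- Replication lag as reported by the replica itself; Seconds_Behind_Master is
-- the value the sustained-lag alert compares against its (W)1 / (C)2 thresholds.
SHOW SLAVE STATUS\G

-- Approximate on-disk footprint of the librenms syslog table (data + indexes);
-- the .ibd file can be larger still because of free space inside the tablespace.
SELECT table_schema,
       table_name,
       ROUND((data_length + index_length) / 1024 / 1024 / 1024, 1) AS size_gib
FROM information_schema.tables
WHERE table_schema = 'librenms'
  AND table_name = 'syslog';

-- Spotting long-running dump queries like the one seen on db1166 (s3 dumps host).
SELECT id, user, host, db, time, state, LEFT(info, 80) AS query_head
FROM information_schema.processlist
WHERE user LIKE 'wikiadmin%'
ORDER BY time DESC;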