[01:06:42] PROBLEM - MariaDB sustained replica lag on m1 on db1217 is CRITICAL: 16.6 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1217&var-port=13321
[01:07:38] PROBLEM - MariaDB sustained replica lag on m1 on db2160 is CRITICAL: 10 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2160&var-port=13321
[01:11:00] PROBLEM - MariaDB sustained replica lag on m1 on db2132 is CRITICAL: 16.2 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2132&var-port=9104
[01:12:02] RECOVERY - MariaDB sustained replica lag on m1 on db1217 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1217&var-port=13321
[01:12:08] RECOVERY - MariaDB sustained replica lag on m1 on db2132 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2132&var-port=9104
[01:14:14] RECOVERY - MariaDB sustained replica lag on m1 on db2160 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2160&var-port=13321
[05:30:01] I just switched over s6
[05:30:07] So the master is now running 10.6
[05:33:09] Amir1: I just noticed the commit comment here is wrong: https://gerrit.wikimedia.org/r/c/operations/dns/+/953488/ (it says s2, but it should be s6). The change itself is okay, but you might want to double-check what is going on.
[06:49:47] "There is no switchback: codfw will stay primary for the next ~6 months" interesting
[07:01:23] I'll check. Thanks
[07:05:02] I'm taking a look at m1; sometimes backups can overload the server, but it shouldn't be them this time, as they start at 2am
[07:09:27] I think it was either network or etherpad
[07:28:39] marostegui: Fixed https://gitlab.wikimedia.org/toolforge-repos/switchmaster/-/commit/d12471ffd045646a93100c5df90b4a838730985a
[07:28:48] thanks
[07:33:42] btw my schema change is done everywhere except s6. Can I run it there? https://phabricator.wikimedia.org/T343718
[07:33:44] or when
[07:34:11] yeah
[07:34:13] it can go now
[07:34:39] awesome. thanks.
[09:02:18] hi! there is an alert
[09:02:20] clouddb1017/MariaDB memory is CRITICAL
[09:02:26] CRIT Memory 98% used. Largest process: mysqld (1667042) = 60.2%
[09:02:37] is this known / tracked / etc?
[09:03:23] let me open a phab ticket
[09:06:00] https://phabricator.wikimedia.org/T345322
[13:28:30] PROBLEM - MariaDB sustained replica lag on s1 on db1132 is CRITICAL: 6051 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1132&var-port=9104
[13:45:44] RECOVERY - MariaDB sustained replica lag on s1 on db1132 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1132&var-port=9104
[15:40:32] I gotta love MariaDB: the exact same schema change on two different hosts of the same section takes 2 minutes and 30 minutes respectively.
[15:40:56] cache?
[15:41:09] My guess is that in one it just changes the DDL, while the other one rebuilds the db
[15:41:16] *the table
[15:57:59] that'd be a bug
[15:59:15] and it is not the first time this happens; I reported this years ago, and it was between two minor versions: https://jira.mariadb.org/browse/MDEV-13175
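
A minimal way to tell the two behaviors apart, assuming an InnoDB table on MariaDB 10.3 or later (the table and column names below are made up for illustration): request the algorithm explicitly, and MariaDB raises an error instead of silently falling back to a full table copy.

    -- Hypothetical names, for illustration only.
    -- If the change can be applied as a metadata-only operation, this succeeds almost instantly.
    ALTER TABLE example_table
        ADD COLUMN example_flag TINYINT NOT NULL DEFAULT 0,
        ALGORITHM=INSTANT, LOCK=NONE;

    -- If the statement is rejected ("ALGORITHM=INSTANT is not supported"), retrying with
    -- ALGORITHM=NOCOPY or ALGORITHM=INPLACE shows at which point the engine insists on
    -- rebuilding the table (ALGORITHM=COPY), which is the kind of difference that turns
    -- a 2-minute change into a 30-minute one.
    ALTER TABLE example_table
        ADD COLUMN example_flag TINYINT NOT NULL DEFAULT 0,
        ALGORITHM=NOCOPY, LOCK=NONE;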