[09:00:37] PROBLEM - MariaDB sustained replica lag on s6 on db1213 is CRITICAL: 4.8 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1213&var-port=13316
[09:05:23] PROBLEM - MariaDB sustained replica lag on s6 on db2169 is CRITICAL: 2 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2169&var-port=13316
[09:06:53] RECOVERY - MariaDB sustained replica lag on s6 on db2169 is OK: (C)2 ge (W)1 ge 0.8 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2169&var-port=13316
[09:08:01] RECOVERY - MariaDB sustained replica lag on s6 on db1213 is OK: (C)2 ge (W)1 ge 0.6 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1213&var-port=13316
[11:35:30] Another oomkill on 2028
[11:36:40] Probably worth mentioning it on https://phabricator.wikimedia.org/T353456
[11:36:48] Yes, updated
[11:36:51] <3
[11:37:05] I'll restart the cassandra instance
[14:22:11] For what it's worth: Puppet will restart, and provided it does, that period of downtime is "acceptable"
[14:22:49] I mean, so long as it continues to be the case that it's days or so between events
[15:12:27] ack, thanks