[00:55:05] FIRING: [6x] MysqlReplicationThreadCountTooLow: MySQL instance db1150:13314 has replication issues. - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationThreadCountTooLow
[02:00:05] FIRING: [6x] MysqlReplicationThreadCountTooLow: MySQL instance db1150:13314 has replication issues. - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationThreadCountTooLow
[03:15:05] FIRING: [4x] MysqlReplicationThreadCountTooLow: MySQL instance db1150:13314 has replication issues. - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationThreadCountTooLow
[03:20:05] FIRING: [4x] MysqlReplicationThreadCountTooLow: MySQL instance db1150:13314 has replication issues. - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationThreadCountTooLow
[03:25:05] FIRING: [4x] MysqlReplicationThreadCountTooLow: MySQL instance db1150:13314 has replication issues. - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationThreadCountTooLow
[03:35:05] RESOLVED: [3x] MysqlReplicationThreadCountTooLow: MySQL instance db1150:13314 has replication issues. - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationThreadCountTooLow
[05:20:05] FIRING: MysqlReplicationThreadCountTooLow: MySQL instance db1150:13313 has replication issues. - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&refresh=1m&var-job=All&var-server=db1150&var-port=13313 - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationThreadCountTooLow
[05:30:05] FIRING: [2x] MysqlReplicationThreadCountTooLow: MySQL instance db1150:13313 has replication issues. - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationThreadCountTooLow
[06:50:05] FIRING: [2x] MysqlReplicationThreadCountTooLow: MySQL instance db1150:13313 has replication issues. - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationThreadCountTooLow
[06:55:05] RESOLVED: MysqlReplicationThreadCountTooLow: MySQL instance db1150:13313 has replication issues. - https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting#Depooling_a_replica - https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&refresh=1m&var-job=All&var-server=db1150&var-port=13313 - https://alerts.wikimedia.org/?q=alertname%3DMysqlReplicationThreadCountTooLow
[07:27:41] ^-- are those known about?
[07:27:46] I mean, the cause thereof
[07:43:32] there is a known issue with dumps that causes replication to suffer a bit: T368098
[07:43:33] T368098: Dumps generation without prefetch cause disruption to the production environment - https://phabricator.wikimedia.org/T368098
[07:47:48] arnaudb: those hosts aren't s1
[07:50:43] I inferred that the problem was a bit wider indeed; that was my conclusion after checking that the hosts were in fact being backed up, which seemed a legit cause
[07:50:53] (and they recovered just a few moments after)
[07:51:03] is that part of the new prometheus alert?
[07:51:12] because we don't get alerted for those
[07:51:20] good question, let me check
[07:51:25] maybe they need to be silenced
[07:51:36] or threshold to be moved/exclusion to be made for backup hosts
[07:51:43] checking
[07:53:47] confirmed, will open a task to properly handle this
[07:57:46] T372991
[07:57:52] T372991: mariadb - monitoring - MysqlReplicationThreadCountTooLow fix - https://phabricator.wikimedia.org/T372991