[06:33:30] PROBLEM - MariaDB sustained replica lag on s4 on db1243 is CRITICAL: 73.6 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1243&var-port=9104
[06:37:30] RECOVERY - MariaDB sustained replica lag on s4 on db1243 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1243&var-port=9104
[08:12:07] marostegui: morning! do you have any idea why mysql_upgrade would fail with 'FATAL ERROR: Can't execute 'mariadb-check'' on clouddb2002-dev?
[08:12:19] let me see
[08:16:58] taavi: fixed, I have no idea how that server is set up, it was trying to run /usr/local/bin/mysql_upgrade which is not the right one, the right one is /opt/wmf-mariadb106/bin/mysql_upgrade (which I just ran and worked)
[08:17:22] So I guess something on that host is wrong with $PATH or something
[08:17:35] Anyway, I ran the good one and it worked
[08:19:06] ah, thanks! it seems like /usr/local/bin/mysql_upgrade is also set up by the wmf-mariadb106 package, but via the debian alternatives system.
[10:05:01] https://twitter.com/mariadb_org/status/1775105130418896918
[15:10:45] something terrible may have happened on db2098
[15:11:08] I restarted the db instances, but both started on the same socket?
[15:12:16] after running puppet, my.cnf got deleted and it started well
[15:12:20] but that is scary
[15:12:53] no worries, as I will decommission those instances soon
[15:13:01] but wanted to report this edge case
[19:32:02] (SystemdUnitFailed) firing: wmf_auto_restart_prometheus-mysqld-exporter.service on db2202:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[23:32:18] (SystemdUnitFailed) firing: wmf_auto_restart_prometheus-mysqld-exporter.service on db2202:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed