[06:32:14] everything should be back to normal on T359919
[06:32:14] T359919: Switchover x1 master (db2115 -> db2196) - https://phabricator.wikimedia.org/T359919
[06:33:52] arnaudb: are you sure? Orchestrator isn't saying the same
[06:34:03] There are multiple hosts lagging behind
[06:35:58] I just fixed db2191
[06:36:29] I will fix the other two
[06:36:50] oh
[06:37:00] what was the issue marostegui ?
[06:37:14] semi sync
[06:40:58] hm marostegui I seem to have a syntax error on dbctl
[06:41:03] https://www.irccloud.com/pastebin/Ke7OlD97/
[06:41:33] let me see
[06:41:59] what is the change you are attempting to do?
[06:42:39] i had the alert on operations so I guess it was the last one on the switchover that did not go well
[06:43:13] yes, we might have a split brain
[06:43:17] urgh
[06:43:20] sorry :-(((
[06:43:56] Ok I think I know what the issue is
[06:44:07] Please review your backlog and check if dbctl gave errors
[06:44:13] I will try to fix this meanwhile
[06:44:59] first error poped at `sudo dbctl config commit -m "Depool db2115 T359919"`
[06:44:59] T359919: Switchover x1 master (db2115 -> db2196) - https://phabricator.wikimedia.org/T359919
[06:45:23] I might have missed it while updating checklist
[06:46:13] Should be fixed now
[06:46:17] <3
[06:46:31] if it was not breakfast, I'd offer you a beer on my tab
[06:46:47] Please review what went wrong, cause I don't think it was dbctl
[06:46:52] I think some of the steps were missed
[06:47:01] Cause the old master was still acting as a master
[06:47:14] will check
[06:47:25] The old master db2115 is now depooled
[06:47:34] I'd suggest you update mariadb and start repooling it
[06:47:44] I'll check if the host needs reimage
[06:47:47] But make sure you review your log to check what step was missed
[06:47:49] yep
[06:48:18] We were lucky the old master was on RO, otherwise we'd have had a split brain for real
[09:26:17] PROBLEM - MariaDB sustained replica lag on s7 on db2122 is CRITICAL: 4.2 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2122&var-port=9104
[09:27:17] RECOVERY - MariaDB sustained replica lag on s7 on db2122 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2122&var-port=9104