[07:16:09] jynus: good morning, heads up about https://phabricator.wikimedia.org/T299417#8143953 s5 will probably alert again next week (I haven't dropped the columns in the backup sources yet)
[07:16:32] with 21% redaction :D
[07:28:18] nice
[08:01:38] I am going to restart and hopefully fix my audio microphone
[10:20:14] Amir1, jynus: the beta cluster DB just crashed due to I think wmcs wide issues. Is there anything that should be done before someone takes the primary out of read-only (which is automatic on startup)? Chat is in -releng
[10:22:28] not sure if Amir1 went there already, but as long as GTID replication is enabled on the replica(s), and the error log is clean, not much
[10:23:00] It's beta cluster. It really doesn't matter
[10:23:06] that's true
[10:23:15] I meant for replication to not come up broken
[10:23:32] Amir1: can it just be taken out?
[10:23:42] and if it is broken, it should just be recloned
[10:24:05] If that happens, ping me. You need to do skip transaction or something. It would cause data corruption but again. Beta cluster
[10:31:45] everything looks good and writes flowing! thanks both of you for ensuring i'm sane.
[14:25:19] Amir1: you broke s2 sanatarium
[14:26:03] s2 on db1155 is CRITICAL: CRITICAL slave_sql_state Slave_SQL_Running: No, Errno: 1062, Errmsg: Could not execute Write_rows_v1 event on table ptwiki.templatelinks: Duplicate entry 6941876-0- for key tl_from, Error_code: 1062: handler error HA_ERR_FOUND_DUPP_KEY: the events master log db1156-bin.001819, end_log_pos 231590299
[14:26:41] I check and fix it
[14:27:55] Amir1: np!
[14:35:19] https://grafana.wikimedia.org/d/000000303/mysql-replication-lag?viewPanel=2 doesn't show lag which i find weird
[14:36:20] or https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&refresh=1m&var-job=All&var-server=db1155&var-port=13312&viewPanel=6
[14:43:10] PROBLEM - MariaDB sustained replica lag on s2 on db1155 is CRITICAL: 58.2 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1155&var-port=13312
[14:44:10] that's a slow alert
[14:44:46] RECOVERY - MariaDB sustained replica lag on s2 on db1155 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1155&var-port=13312