[03:40:49] DBA, MediaWiki-Parser, Performance-Team, Parsoid (Tracking), Patch-For-Review: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (tstarling) I'm not sure of my exact motivation for using the primary key in 53d267b3dce25f7b9c7b4216...
[04:49:06] DBA, Patch-For-Review: Upgrade s5 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283235 (Marostegui) db1100 has caught up - going to start repooling
[05:14:36] DBA, Patch-For-Review: Upgrade s5 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283235 (Marostegui)
[05:15:08] DBA, MediaWiki-Parser, Performance-Team, Parsoid (Tracking), Patch-For-Review: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (Marostegui) >>! In T282761#7170212, @Krinkle wrote: > > I laid out three broad ideas for how to mov...
[09:39:12] just a heads-up: i'm going to start pt-heartbeat on a (codfw) node that it shouldn't be running on, just to test that this channel does get an alert for it
[09:39:56] +1
[09:41:16] victim will be db2080
[09:44:06] well that was unexpected
[09:44:42] (alert went to #-operations instead)
[09:56:30] ok, i'll try that again
[09:58:20] PROBLEM - pt-heartbeat-wikimedia service on db2080 is CRITICAL: CRITICAL - Expecting inactive but unit pt-heartbeat-wikimedia is active https://wikitech.wikimedia.org/wiki/MariaDB/pt-heartbeat
[09:58:52] \o/
[09:59:25] RECOVERY - pt-heartbeat-wikimedia service on db2080 is OK: OK - pt-heartbeat-wikimedia is inactive https://wikitech.wikimedia.org/wiki/MariaDB/pt-heartbeat
[09:59:26] running puppet agent on the node which should stop pt-hb again. then i'll clean up the heartbeat table.
[09:59:29] score.
[09:59:55] hb table cleaned up, all is good.
[10:49:19] DBA, Patch-For-Review: Upgrade s5 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283235 (Marostegui) This is all done, pending the backup clean up which has been scheduled for Monday by Jaime
[10:59:06] Reminder, no more maintenance in eqiad since...now :)
[10:59:16] (and codfw of course)
[11:05:58] Blocked-on-schema-change, DBA, Patch-For-Review: Schema change for renaming several indexes in logging table - https://phabricator.wikimedia.org/T270620 (Ladsgroup) >>! In T270620#7165268, @gerritbot wrote: > Change 700515 had a related patch set uploaded (by Ladsgroup; author: Ladsgroup): > %%%[medi...
[12:37:01] DBA: Move sanitarium masters to dedicated puppet role - https://phabricator.wikimedia.org/T285390 (Kormat)
[12:40:21] DBA: Move sanitarium masters to dedicated puppet role - https://phabricator.wikimedia.org/T285390 (Kormat) E.g. https://gerrit.wikimedia.org/r/c/operations/puppet/+/700928 is something that could be autogenerated if we could identify sanitarium masters at a puppet level.
[12:59:19] DBA: Move sanitarium masters to dedicated puppet role - https://phabricator.wikimedia.org/T285390 (Kormat) p:Triage→Medium
[13:03:15] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> eqiad DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[13:07:20] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> eqiad DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[13:12:37] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> eqiad DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[13:26:35] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> eqiad DB work - https://phabricator.wikimedia.org/T284897 (Marostegui) GTID checked across codfw and eqiad. Only missing db2095:3312 (sanitarium host). I have enabled it there.
[13:26:52] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> eqiad DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[13:29:52] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> eqiad DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[13:31:50] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[13:43:57] Data-Persistence-Backup, serviceops, GitLab (Initialization), Patch-For-Review, User-brennen: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (wkandek)
[14:50:04] DBA, MediaWiki-Parser, Performance-Team, Parsoid (Tracking), Patch-For-Review: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (LSobanski) Notes from the meeting today with @kormat and @Krinkle * Code changes to be made: ** Rec...
[18:36:37] PROBLEM - MariaDB sustained replica lag on db2133 is CRITICAL: 1.813e+04 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2133&var-port=9104
[18:59:01] RECOVERY - MariaDB sustained replica lag on db2133 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2133&var-port=9104
[20:04:08] DBA, MediaWiki-Parser, Performance-Team, Parsoid (Tracking), Patch-For-Review: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (Marostegui) For what is worth, I have tested the following changes on all eqiad pc hosts for a few h...
[20:42:01] PROBLEM - MariaDB sustained replica lag on db2133 is CRITICAL: 493 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2133&var-port=9104
[20:43:53] RECOVERY - MariaDB sustained replica lag on db2133 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2133&var-port=9104
[23:32:34] DBA, MediaWiki-Parser, Performance-Team, Parsoid (Tracking), Patch-For-Review: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (tstarling) >>! In T282761#7172556, @LSobanski wrote: > ** Running on multiple servers (adding a --se...
[23:48:19] DBA, MediaWiki-Parser, Performance-Team, Parsoid (Tracking), Patch-For-Review: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (Krinkle) >>! In T282761#7173997, @tstarling wrote: >>>! In T282761#7172556, @LSobanski wrote: >> **...