[04:32:34] DBA, Patch-For-Review, Platform Team Initiatives (Revision Storage Schema Improvements), Schema-change, User-Ladsgroup: Drop index rev_page_id (rev_page, rev_id) - https://phabricator.wikimedia.org/T163532 (Marostegui) Removing the #DBA tag as we have the above ticket created by @Ladsgroup
[04:39:15] Blocked-on-schema-change, DBA: Schema change for dropping rev_page_id index - https://phabricator.wikimedia.org/T285149 (Marostegui) Open→Stalled p:Triage→Medium See: T163532#7164522
[04:45:13] Blocked-on-schema-change, DBA: Schema change for renaming several indexes in logging table - https://phabricator.wikimedia.org/T270620 (Marostegui) @Ladsgroup I have seen this query: ` SELECT /* IndexPager::buildQueryInfo (LogPager) */ log_id, log_type, log_action, log_timestamp, log_namespace, log_title...
[04:47:39] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Kormat)
[04:50:13] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Kormat)
[04:56:12] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Kormat)
[04:59:21] DBA, Patch-For-Review: Switchover s5 from db1100 to db1130 - https://phabricator.wikimedia.org/T284529 (Marostegui)
[05:04:37] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Kormat)
[05:06:21] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Kormat)
[05:08:11] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Kormat)
[05:11:19] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Kormat)
[05:12:09] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Kormat)
[05:12:13] DBA, Patch-For-Review: Switchover s5 from db1100 to db1130 - https://phabricator.wikimedia.org/T284529 (Marostegui)
[05:14:14] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Kormat)
[05:16:55] marostegui: https://phabricator.wikimedia.org/T277123 doesn't show the rest of s3/eqiad as being done. is that correct?
[05:17:10] yeah, those are waiting for the dc switch
[05:17:21] Copy and paste mistake :)
[05:17:36] in that case, we probably don't want to apply this to the candidate master just yet, correct?
[05:17:37] we should also update the kernel upgrade task, let me search it
[05:17:50] https://phabricator.wikimedia.org/T268392 https://phabricator.wikimedia.org/T266486 https://phabricator.wikimedia.org/T273360 https://phabricator.wikimedia.org/T276150
[05:17:57] jynus: it will be done with the reimage
[05:18:09] ah, true, technically not done yet
[05:19:26] aside from a dip in writes 20 min ago, traffic seems all right: https://grafana.wikimedia.org/d/000000278/mysql-aggregated?orgId=1&var-site=eqiad&var-group=core&var-shard=s3&var-role=All&from=1624231139023&to=1624252739023
[05:20:27] marostegui: db1123 is depooled, i'm going to stop replication on it too, so it can't corrupt itself from 10.4
[05:20:53] kormat: cool, do you want me to apply the schema changes?
[05:21:00] or do you want to do the reimage first?
[05:21:08] buffer pool seems a bit everywhere, but hopefully gets better with time: https://grafana.wikimedia.org/d/000000273/mysql?viewPanel=13&orgId=1&var-server=db1157&var-port=9104&from=1624242063106&to=1624252863106
[05:21:16] marostegui: a girl can't refuse an offer like that
[05:21:21] XD
[05:21:33] ok, I will start applying the schema changes, it might take a few hours to complete
[05:21:39] marostegui: i'll disable notifications for the machine
[05:21:51] kormat: can you downtime it for 1d too?
[05:21:59] sure thing
[05:22:33] or, how about 2h? because that's what i just did 🥀
[05:22:47] sure, if notifications are disabled, that's ok
[05:24:19] will merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/692845 ok?
[05:24:29] jynus: SGTM
[05:25:16] I won't kill db1171 (because it is in use, and just in case) immediately, but I will schedule its destruction for next week
[05:25:17] kormat: Proceeding with db1123 then
[05:25:28] marostegui: 👍
[05:25:48] jynus: bear in mind it will not be getting any further replicated changes from db1123
[05:25:58] it is stopped now
[05:26:02] so no issue
[05:26:20] but we want it to finish current backup
[05:29:38] I will also run a test backup on the new 10.4 instance
[05:31:45] db1102 should stop replication now, that would be me with the test backup
[05:31:53] db1102:s3, I mean
[05:32:36] DBA, Patch-For-Review: Upgrade s3 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283131 (Kormat)
[05:33:10] DBA, Patch-For-Review: Upgrade s3 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283131 (jcrespo)
[05:33:27] oops, our changes may have conflicted ^ kormat
[05:33:46] looks ok on refresh
[05:33:50] lol we did the same change!
[05:34:06] but phab is unlike the wikis, last one wins normally
[05:46:04] just turn phabricator into a mediawiki extension to solve all of our problems
[05:46:19] for some definition of "solve" :)
[05:47:19] i did not say it would not introduce any new problems
[05:47:40] hehe
[06:02:43] s3 now available in orchestrator: https://orchestrator.wikimedia.org/web/cluster/alias/s3
[06:03:24] DBA, Orchestrator, User-Kormat: Enable report_host for mariadb - https://phabricator.wikimedia.org/T266483 (Kormat)
[06:04:34] \o/
[07:42:22] Data-Persistence-Backup, database-backups, Goal, Patch-For-Review: Upgrade pending stretch backup hosts to buster - https://phabricator.wikimedia.org/T280979 (jcrespo)
[07:42:42] Data-Persistence-Backup, database-backups, Patch-For-Review: Put db2100 back into service after hardware maintenance - https://phabricator.wikimedia.org/T284980 (jcrespo) Open→Resolved This is solved, going back to the status before the hw issue.
[08:11:16] Blocked-on-schema-change, DBA: Schema change for renaming several indexes in logging table - https://phabricator.wikimedia.org/T270620 (Ladsgroup) Okay this is interesting. The code for this is actually explicitly set to check which index exists and ignore that instead. The comment for using this index i...
[08:11:39] Blocked-on-schema-change, DBA: Schema change for renaming several indexes in logging table - https://phabricator.wikimedia.org/T270620 (Ladsgroup) The code that checks is `$index = $this->mDb->indexExists( 'logging', 'times', __METHOD__ ) ? 'times' : 'log_times';`
[08:43:38] Blocked-on-schema-change, DBA: Schema change for renaming several indexes in logging table - https://phabricator.wikimedia.org/T270620 (Marostegui) I am going to double check across big wikis to see what the optimizer is doing, but yeah, likely it isn't needed anymore
[08:54:39] Blocked-on-schema-change, DBA: Schema change to make rc_id unsigned and rc_timestamp BINARY - https://phabricator.wikimedia.org/T276150 (Marostegui) db1123 (old s3 master) done
[08:54:52] Blocked-on-schema-change, DBA: Schema change to make rc_id unsigned and rc_timestamp BINARY - https://phabricator.wikimedia.org/T276150 (Marostegui)
[08:56:12] marostegui: I'll fix the rev_page index mess ASAP
[08:56:54] Blocked-on-schema-change, DBA: Schema change for renaming several indexes in logging table - https://phabricator.wikimedia.org/T270620 (Marostegui) Tested the following wikis: - enwiki - commonswiki - dewiki - eswiki - wikidatawiki The optimizer is doing the right thing, so it is probably ok to remove t...
[08:56:55] Amir1: I just commented on the task
[08:56:58] ah, there ^
[08:57:33] Thanks
[08:57:39] I fix logging too
[09:26:38] DBA, Patch-For-Review: Switchover s5 from db1100 to db1130 - https://phabricator.wikimedia.org/T284529 (Marostegui)
[09:58:18] marostegui, sobanski: yikes. last parsercache purge took 67h
[09:58:30] (added as item for discussion in meeting)
[10:42:48] kormat: can this be closed? https://phabricator.wikimedia.org/T283239
[10:44:04] all tasks can be closed
[10:44:11] that one especially
[11:00:01] DBA: db-replication-tree doesn't support circular replication - https://phabricator.wikimedia.org/T283239 (Marostegui) Open→Resolved a:Kormat This is done - thanks Stevie
[11:15:18] Blocked-on-schema-change, DBA: Schema change for watchlist.wl_notificationtimestamp going binary(14) from varbinary(14) - https://phabricator.wikimedia.org/T268392 (Marostegui)
[11:15:46] Blocked-on-schema-change, DBA: Schema change to turn user_last_timestamp.user_newtalk to binary(14) - https://phabricator.wikimedia.org/T266486 (Marostegui)
[11:15:52] Blocked-on-schema-change, DBA: Schema change for dropping default of img_timestamp and making it binary(14) - https://phabricator.wikimedia.org/T273360 (Marostegui)
[11:16:12] kormat: all schema changes applied to db1123
[11:20:21] marostegui: excellent
[11:33:27] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Kormat)
[11:33:38] DBA: Upgrade s3 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283131 (Kormat)
[11:34:13] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Kormat) Open→Resolved All done :)
[11:41:09] Amir1: for https://phabricator.wikimedia.org/T283499 you might want to also check: https://codesearch.wmcloud.org/search/?q=page_timestamp&i=nope&files=&excludeFiles=&repos=
[11:41:21] I guess some queries are specifically looking for page_timestamp?
[11:42:21] yeah, it has a lot of false positives
[11:43:54] it doesn't seem to have anything problematic
[11:50:09] Ok :)
[11:51:13] Blocked-on-schema-change, DBA: Schema change for renaming page_timestamp index on revision table to rev_page_timestamp - https://phabricator.wikimedia.org/T283499 (Marostegui) a:Marostegui
[11:51:18] Amir1: Going to alter two hosts on enwiki
[11:51:21] And see what happens
[11:51:31] fingers crossed
[11:56:35] Blocked-on-schema-change, DBA: Schema change for renaming page_timestamp index on revision table to rev_page_timestamp - https://phabricator.wikimedia.org/T283499 (Marostegui) Altered db1099:3311 and db1135 on enwiki, let's see if we find some queries forcing that index. ` root@cumin1001:/home/marostegui...
[12:33:36] marostegui: for when you have time https://gerrit.wikimedia.org/r/c/mediawiki/core/+/700515
[12:34:36] Ah cool
[12:34:43] I +ed it
[12:34:50] +1ed
[13:47:48] I don't know if something changed, but buffer pool hit ratio improved on db1157 a lot since around 11:52 UTC
[13:52:11] jynus: i'm going to manually move db1171 out from under db1123, as it's starting to alert
[13:52:25] ok, do you want me to reset replication?
[13:52:42] what do you mean?
[13:53:33] "STOP SLAVE; RESET SLAVE ALL" on db1171
[13:53:58] it's your host, if you want to take care of it, be my guest :)
[13:54:10] ok, doing
[13:56:21] jynus: i'm about to reimage db1123, for context
[13:56:39] yeah, saw that, that is why I proposed to unconnect them
[14:01:40] DBA, Patch-For-Review: Upgrade s3 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283131 (ops-monitoring-bot) Script wmf-auto-reimage was launched by kormat on cumin1001.eqiad.wmnet for hosts: ` ['db1123.eqiad.wmnet'] ` The log can be found in `/var/log/wmf-auto-reimage/20210621140...
[14:06:19] DBA, MediaWiki-Parser, Performance-Team, Parsoid (Tracking), Patch-For-Review: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (LSobanski) We're back to 67 hours today.
[14:31:52] DBA, Patch-For-Review: Upgrade s3 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283131 (ops-monitoring-bot) Completed auto-reimage of hosts: ` ['db1123.eqiad.wmnet'] ` and were **ALL** successful.
[14:41:23] DBA, Patch-For-Review: Upgrade s3 to Debian Buster and MariaDB 10.4 - https://phabricator.wikimedia.org/T283131 (Kormat) db1123 is reimaged to buster, `mysqlcheck --all-databases` running now. As this is s3, this is going to take A While.
[14:45:28] DBA, Patch-For-Review: Switchover s3 from db1123 to db1157 - https://phabricator.wikimedia.org/T284648 (Trizek-WMF)
[18:05:53] DBA, MediaWiki-Parser, Performance-Team, Parsoid (Tracking), Patch-For-Review: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (Krinkle) p:High→Unbreak!