[00:05:42] DBA, MediaWiki-Parser, Performance-Team, Parsoid (Tracking), Patch-For-Review: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (tstarling) @dpifke mentioned table partitioning. If we wanted to rearchitect it more aggressively, t...
[00:13:41] DBA, MediaWiki-Parser, Performance-Team, Parsoid (Tracking), Patch-For-Review: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (tstarling) If DROP TABLE can be done without stalling the whole server, we could just use table pref...
[01:30:52] DBA, MediaWiki-Parser, Performance-Team, Parsoid (Tracking), Patch-For-Review: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (aaron) @Marostegui I was curious if adjusting MERGE_THRESHOLD would help (e.g https://dev.mysql.com/...
[04:41:51] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[04:43:18] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[04:53:36] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui) GTID disabled on es% and s% ` db2112 Using_Gtid: No db2107 Using_Gtid: No db2105 Using_Gtid: No db2090...
[04:53:55] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[04:57:37] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[05:31:51] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[05:40:53] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[05:44:36] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[05:55:21] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[06:04:28] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[06:12:10] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[06:33:59] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[06:40:04] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[06:45:30] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[06:52:46] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[07:08:24] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui) codfw -> eqiad replication has been enabled everywhere: ` # for i in `mysql.py -BN -hdb1115 -A zarcillo -e "select instance from masters where section like 'es%' OR section like...
[07:08:34] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[07:28:54] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[07:42:25] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[07:56:32] DBA, Datacenter-Switchover, Patch-For-Review: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[08:06:49] marostegui: i'm going to downtime all dbs for 45mins and then merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/701335, just so we have a chance to catch any issues before an alert storm happens
[08:07:21] cool!
[08:11:13] * volans wonders if systemctl stop icinga would be a simpler API for the downtime in those cases :-P
[08:12:23] it's only a mere 216 hosts
[08:12:47] a trifling number
[08:14:10] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Kormat)
[08:14:43] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[08:15:01] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[08:27:54] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[09:03:35] DBA, MediaWiki-Parser, Performance-Team, MW-1.37-notes (1.37.0-wmf.12; 2021-06-28), and 2 others: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (Kormat) The current run has progressed to 10.4% over the course of 19h. That works ou...
[09:04:00] marostegui: well, alerting didn't blow up, so that was nice :)
[09:04:09] marostegui: i've seen m2 lagging a lot today - any idea why?
[09:04:34] oh. ffs.
[09:04:47] icinga hasn't updated those checks since yesterday evening...
[09:04:52] I haven't checked :(
[09:05:24] nevermind. it was a stale icinga page, somehow. doing a full reload made them disappear.
[09:05:29] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[09:05:31] I am still having fun with ^
[09:07:43] anything i can do to help/hinder?
[09:10:58] DBA, MediaWiki-Parser, Performance-Team, MW-1.37-notes (1.37.0-wmf.12; 2021-06-28), and 2 others: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (Kormat) On the plus side, reducing the sleep duration doesn't appear to have had an i...
[09:18:17] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[09:23:41] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui)
[09:24:19] DBA, Datacenter-Switchover: Pre DC switchover eqiad -> codfw DB work - https://phabricator.wikimedia.org/T284897 (Marostegui) Open→Resolved a:Marostegui This is all done. Weights might need to be adjusted ad-hoc once we start getting live traffic.
[13:19:45] DBA, Orchestrator, User-Kormat: Enable report_host for mariadb - https://phabricator.wikimedia.org/T266483 (Marostegui) Pending eqiad hosts: db1129 (slave) db1122 db1104
[13:29:16] DBA, MediaWiki-Parser, Performance-Team, MW-1.37-notes (1.37.0-wmf.12; 2021-06-28), and 2 others: purgeParserCache.php should not take over 24 hours for its daily run - https://phabricator.wikimedia.org/T282761 (Kormat) @Krinkle : another feature request for purgeParserCache.php would be to print...
[13:33:31] hey, I have a charset puzzle that I can't figure out on the cloud replicas. Basically, `select page_title, convert(page_title using utf8mb4) from enwiktionary_p.page where page_id in (1009765,1009801,1009807, 300, 400)` should decode, but only prints out ???? on cloud replicas while it prints the correct string on dbstore1007. (Detailed in https://phabricator.wikimedia.org/T284623)
[13:34:09] the weird thing is that `select hex(convert(page_title using utf8mb4)) ... ` DOES work!?
[13:35:31] I wanted to submit a patch for the view creation scripts (https://github.com/wikimedia/puppet/blob/production/modules/profile/files/wmcs/db/wikireplicas/views/maintain-views.py#L121) but I couldn't figure out what was wrong and how to specify charset at view-create time
[13:37:38] (and lastly, `show create view page;` on cloud replicas shows `character_set_client` set to utf8 (not utf8mb4 or binary as is the case everywhere else))
[13:37:55] marostegui any thoughts ^?
[13:49:42] milimetric: i had a look with my rudimentary mysql knowledge, but came up with nothing. so deferring to marostegui :)
[13:55:04] yeah, if he doesn't know right away what's going on, I can dig deeper, but I'm not familiar with this kind of problem either
[13:55:50] milimetric: looking at the hex value of the page_title column, it's identical between clouddb and prod
[13:56:07] so at least it's not data corruption we're dealing with
[14:08:09] milimetric: I was in a meeting sorry, I will take a look later if I can, I am busy
[14:08:37] kormat: glad to hear it is the same in production, data corruption came to mind first
[14:11:56] milimetric: just a random thought without having actually checked it, production tables are binary, so not sure if all these conversions can be messing up things
[14:19:33] marostegui: yes, the charset is explicitly set as binary on the prod replicas, and convert(some_column using utf8mb4) works fine there (as in, emojis etc are coming out fine). But somehow the view is inferring utf8 instead of binary on the cloud replicas, so I can't figure out a way to decode strings properly. And indeed, the hex is fine.
[14:30:22] milimetric: fwiw, i seem to get the exact same result for both prod and clouddb
[14:30:58] https://phabricator.wikimedia.org/P16725
[14:32:09] kormat: oh weird! So then dbstore1007 is the anomaly... That also means that somehow php is reading these fields properly... but how
[14:32:25] let me test against dbstore..
[14:32:55] milimetric: i _also_ get the same result from dbstore1007:3312
[14:34:15] kormat: hm... so I get this: https://phabricator.wikimedia.org/P16725#85554
[14:34:29] this might mean the connection is setting some kind of default...
[14:34:49] milimetric: check what `\s` says
[14:35:22] aha!!!
[14:35:22] show variables like '%char%';
[14:36:38] utf8mb4 on my dbstore1007 connection and utf8 on the connection to the cloud replicas. I can set that explicitly in my sqoop script, so I think it solves my issue but we should probably make the cloud tools "sql" shell wrapper thing set the connection properly
[14:36:46] do you manage that code or know where it is?
[14:37:38] we do not, no
[14:37:47] oh duh, I'll talk to the cloud folks
[14:39:05] and indeed, if i add `--default-character-set=utf8mb4` to my mysql commandline, then the convert() works too
[14:39:51] tl;dr; marostegui, it's just client and connection-level defaults that are messing with the output, data's fine
[14:40:00] namely, `set character_set_results=utf8mb4;` fixes the problem for the current session
[14:43:55] oh nice
[14:44:00] Nice catch stevie! :)
[14:45:16] my face has a talent for finding issues in the dark by walking into them blindly
[14:45:24] milimetric: glad to hear - sorry I couldn't help, I am busy with stuff that has a deadline tomorrow :(
[14:45:58] no problem at all, I'm sorry for the bother and thanks much for the help
[14:46:23] milimetric: you can always bother kormat no worries
[14:46:35] 🤬
[14:46:51] :) haha, well, I'm still gonna go ahead and tread lightly in this channel, I know how much work yall have
[14:46:51] (milimetric : glad i could help and you're welcome :)
[23:33:43] DBA, SRE, Datacenter-Switchover: Figure out how x2 should be handled in DC switchover - https://phabricator.wikimedia.org/T285519 (Legoktm) p:Triage→High
[23:54:01] kormat, marostegui: please ping me whenever you're around, I'd like to discuss how x2 should be handled in the switchover ^^
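
Note on the charset puzzle from 13:33 to 14:40 above: a minimal Python sketch of why `hex(...)` looked right while the plain `convert(...)` printed `????`. It assumes the affected titles contain characters outside the Basic Multilingual Plane (the log does not say which characters they are) and that a 3-byte `utf8` result charset substitutes `?` for anything it cannot represent. `to_utf8mb3` and the sample title are hypothetical, and no database is involved; this only imitates the effect of `character_set_results`.

```python
def to_utf8mb3(text: str, replacement: str = "?") -> str:
    """Imitate a 3-byte utf8 result set (assumption, not a MySQL API):
    code points above U+FFFF need 4 bytes in UTF-8 and are replaced."""
    return "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)


# Hypothetical sample title: two Gothic letters, both above U+FFFF.
title = "\U00010330\U00010331"

# What a binary column would store: the raw UTF-8 bytes.
raw = title.encode("utf-8")

# Hex output is plain ASCII, so it survives any result charset unchanged.
print(raw.hex().upper())

# With a utf8 (3-byte) result charset each 4-byte character degrades to "?".
print(to_utf8mb3(raw.decode("utf-8")))

# With utf8mb4 results nothing is lost; the bytes round-trip to the title.
print(raw.decode("utf-8") == title)
```

This matches the conclusion reached in the channel: the stored bytes are identical everywhere, and only the connection-level result charset decides whether the client sees the characters or question marks.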