[00:32:47] PROBLEM - MariaDB sustained replica lag on s4 on db1141 is CRITICAL: 4.2 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1141&var-port=9104
[00:34:03] RECOVERY - MariaDB sustained replica lag on s4 on db1141 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1141&var-port=9104
[06:19:04] I'm going to restart MySQL on the tendril host, as the memory usage is very high and it will OOM over the weekend anyway
[06:37:10] marostegui: I have a better idea. hear me out :D
[06:49:32] Amir1: does your idea involve the word "kill"? :)
[06:50:34] yup
[10:07:16] I'm going to do an upgrade/restart of db1150, yesterday its network was a bit unstable
[10:08:50] has the 10.4.22 package shown any issues, or should I not install it yet on backup sources?
[10:28:31] go for it
[10:34:53] great :-)
[10:37:26] I'm about to remove another set of useless GRANTs: the ones where the database implied by the grant doesn't exist on the host. T296274
[10:37:26] T296274: Clean up wikiadmin GRANTs mess - https://phabricator.wikimedia.org/T296274
[10:38:54] let me know if there is anything you want me to consider
[11:40:10] the recovery of db1139 (s1) finished, but the master at the time was db1163.eqiad.wmnet; should I connect it to that and later change it to the current one, or should I find the equivalent coords on db1118?
[11:41:43] I think I prefer to connect it to db1163 using the binlog pos, and when it catches up, move it
[11:41:57] and then enable gtid, to prevent issues
[11:46:03] So the plan is: temporarily depool db1163, start replication, and once the binlogs have been transferred, repool db1163, wait for it to catch up, then move db1139 to replicate from db1118
[11:46:04] marostegui: I think I fixed most of the grants. Can you take a look at the list? It's basically two issues: some hosts don't have the correct users, and db1105 has a different grant from the rest of the infra
[11:46:49] db1105 has a different heartbeat grant
[11:47:29] I should take a look at wikiuser later :((
[11:47:36] I can take a look yeah
[11:47:43] Amir1: so wikiadmin no longer has drop?
[11:48:24] it does, except on one host (waiting for a bit before moving on to the rest of the infra). I fixed the useless GRANTs
[11:48:34] ah sure, good
[11:48:44] ok for me to temporarily depool db1163?
[11:49:00] fine from my side
[11:49:13] I'm doing stuff on s4; if it's not s4, it's fine for me
[11:49:18] it is s1
[11:49:46] I will do it in steps, so I don't move a lot of traffic away instantly
[11:49:50] Amir1: no worries, at some point you will know where each host belongs without looking it up
[11:50:00] lol
[11:50:15] I'm already good at picking up the section from the db
[11:50:17] That DB memory palace
[11:50:34] frwiki => s6, arwiki => s7
[11:50:53] Amir1, as I am rebuilding db1139 literally from scratch, I will have to give it grants; I will catch up with you on that when I set those up
[11:50:56] Amir1: not too bad!
[11:51:36] for now I am just setting up replication, which will make sure the data is fine
[11:51:50] marostegui: we should have a pub quiz for it at some offsite, with trick questions like "where is mgwiktionary?"
[11:52:04] hahaha
[11:52:06] jynus: awesome. keep me in the loop
[11:52:30] how is that a trick question?
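The GRANT cleanup mentioned at 10:37 (T296274) targets db-level grants whose database does not exist on the host. A rough, hypothetical way to list candidates on a given replica is to join mysql.db against information_schema; the wikiadmin filter and the wildcard handling below are assumptions, since Db entries can contain escaped wildcard characters, so this is a first pass rather than an exact check.

    -- Sketch: db-level grants for wikiadmin whose schema is missing on this host.
    -- mysql.db stores patterns (escaped '_' etc.), so treat matches as candidates only.
    SELECT d.User, d.Host, d.Db
    FROM mysql.db AS d
    LEFT JOIN information_schema.SCHEMATA AS s
      ON s.SCHEMA_NAME = REPLACE(d.Db, '\\_', '_')
    WHERE d.User = 'wikiadmin'
      AND s.SCHEMA_NAME IS NULL;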
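The db1139 bootstrap plan above (attach to db1163 with binlog coordinates, let it catch up, then move it under db1118 and enable GTID) would look roughly like the following MariaDB statements. This is only a sketch: the replication user, password, and binlog file/position are placeholders rather than real production values, and the exact ordering of the final move may differ.

    -- 1) Point db1139 at db1163 using the coordinates recorded with the backup.
    CHANGE MASTER TO
      MASTER_HOST='db1163.eqiad.wmnet',
      MASTER_PORT=3306,
      MASTER_USER='repl',                  -- placeholder user
      MASTER_PASSWORD='********',          -- placeholder password
      MASTER_LOG_FILE='db1163-bin.001234', -- placeholder binlog coordinates
      MASTER_LOG_POS=4,
      MASTER_USE_GTID=no;
    START SLAVE;

    -- 2) Once it has caught up, move it under the s1 master and switch to GTID
    --    so later topology changes don't need binlog coordinates.
    STOP SLAVE;
    CHANGE MASTER TO
      MASTER_HOST='db1118.eqiad.wmnet',
      MASTER_USE_GTID=slave_pos;
    START SLAVE;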
[11:53:55] mgwiktionary looks like a small wiki (so probably in s3) but it's actually in s5
[11:54:14] because: 1- it's big 2- we are moving some small wikis (and new wikis) to s5
[11:56:23] maybe mgwiktionary is in s7, let me look it up
[11:57:20] nope, s5
[12:00:00] I think it's been an hour now and the traffic is not draining fully from the depooled s4 replica, it's just one mwmaint script. ugh.
[12:00:19] whatever, it's fully automated in a screen. I'll go grab a coffee
[12:18:26] db1163 fully depooled, sending binlogs to db1139, and replication flowing
[12:21:42] I will also have to do a comparison with the primary, as it may have skipped some schema changes or other things
[12:27:22] when your lag is measured in weeks on grafana: "Max Lag: 3.35 week"
[13:10:17] it will take one extra hour for the binlogs to be sent
[14:03:03] marostegui: am I OK to reboot db1124 please? I want to see how mariadb and the prometheus exporter behave on startup
[14:03:13] Emperor: yeah, let me downtime it for a week
[14:03:33] done, you can go ahead as you want; if you can, please run systemctl stop mariadb
[14:03:49] it is ok if you don't, either; there's no data or anything valuable there
[14:03:56] done.
[14:03:59] thanks!
[14:05:07] marostegui: how does this look? https://logstash.wikimedia.org/app/dashboards#/view/43fcccd0-4df5-11ec-81e9-e1226573bad4?_g=h@0e494b1&_a=h@0900ee8
[14:05:13] checking
[14:05:28] I guess a bunch of "Error restoring state from URL"
[14:05:29] Unable to completely restore the URL, be sure to use the share functionality.
[14:05:50] hmm, search for "slow queries" in dashboards
[14:05:56] yeah, got it now
[14:06:03] https://logstash.wikimedia.org/goto/85a50bf579e720cdc2358b124dd9d9f7
[14:06:59] Amir1: this looks very good, quick question: is there a way to see on which host the query was executed?
[14:07:10] that'd be important to have if we can extract it
[14:07:17] sure, let me take a look
[14:07:29] ie: one host might have optimizer issues, or whatever, and could be the only one giving problems
[14:07:51] This shows all the queries taking longer than 5 seconds?
[14:08:31] read queries, for writes the threshold is higher
[14:08:45] it's not in the logs, but I can add the host, it should be easy
[14:09:12] I think it should be the other way around, no? Write queries have a max of 3 seconds (I believe) and reads 60 (for the query killer to kick in)
[14:10:29] I don't think the max of 3 is for killing, it's for logging; you're right it's smaller for logging (5 vs 3), but I can't see anything that would kill it
[14:10:46] but yeah, it logs for both
[14:11:07] I have seen rollbacks after 3 seconds for writes (or 5 or something), no?
[14:11:09] let me search
[14:11:23] w/index.php?title=Special:Undelete&action=submit Wikimedia\Rdbms\DBTransactionSizeError: Transaction spent 4.9288656711578 second(s) in writes, exceeding the limit of 3
[14:11:33] I also want to add "sum time", these metrics are the number of slow queries
[14:11:48] that's logged, not rolled back
[14:12:05] Interesting, I always thought we'd roll that back
[14:12:10] so there's no limit?
[14:12:21] AFAIK none
[14:12:31] at least I didn't see it in the code
[14:13:20] Then there's probably none :)
[14:14:36] Ah, I understand at least part of the pme problem on db1124
[14:15:12] Emperor: hit me!
[14:15:19] * kormat books a flight
[14:15:32] kormat: make sure to have the negative PCR!
[14:15:44] whatever it takes
[14:15:53] kormat: going somewhere fun?
[14:16:05] 15:15:13 Emperor: hit me!
[14:16:11] marostegui-ville
[14:16:13] Emperor: come on...
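For the comparison with the primary mentioned at 12:21, one simple option (an assumption here, not necessarily what was actually used) is to dump column definitions from information_schema on both db1139 and the primary and diff the output; 'enwiki' below is just an example schema.

    -- Run on both hosts and diff the two result sets.
    SELECT TABLE_NAME, COLUMN_NAME, COLUMN_TYPE, IS_NULLABLE, COLUMN_DEFAULT
    FROM information_schema.COLUMNS
    WHERE TABLE_SCHEMA = 'enwiki'
    ORDER BY TABLE_NAME, ORDINAL_POSITION;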
[14:16:19] I thought you'd be faster
[14:16:23] marostegui: the changes we made in https://gerrit.wikimedia.org/r/c/operations/software/+/715926 are missing from the mariadb.service file on e.g. db1124 (I'm just updating the ticket)
[14:16:33] kormat: OIC
[14:16:54] Emperor: Oh interesting
[14:17:01] Emperor: So it should also affect buster hosts, no?
[14:17:48] anywhere without that fix
[14:18:06] I hadn't realised we took the .service file from different places on different distros...
[14:18:49] marostegui: db1118 is a buster host and has the fix
[14:19:01] mmm
[14:19:05] How's that possible?
[14:19:17] i wonder who built the packages
[14:19:40] Ah shit! Maybe I didn't run git pull XD
[14:19:45] let me see
[14:20:40] 10.4.21-2 should have the patch in, per T289488
[14:20:41] T289488: Systemd enhancements for mariadb and prometheus-mysql-exporter - https://phabricator.wikimedia.org/T289488
[14:21:06] yeah, but I am checking whether I rebased my dbtools dir on my bullseye installation
[14:21:07] let me seee
[14:22:37] if you don't _want_ my changes ;-p
[14:22:55] Emperor: beautifully played :)
[14:23:29] * Amir1 grabs popcorn
[14:24:30] Amir1: i like the inserts into `blobs_cluster27` queries on that logstash dashboard :)
[14:24:55] lovely
[14:25:01] so I did run git pull
[14:25:05] it gzips them, that's why
[14:25:11] But it looks like I built the package from the old dir XD
[14:25:25] marostegui: hahah, doh
[14:25:26] I am going to recreate the packages for bullseye then! (the buster one is fine)
[14:25:40] inserts to es shouldn't be slow though
[14:25:42] Cool, that should hopefully fix it then. This is why we test :)
[14:25:48] thanks Emperor!
[14:25:55] marostegui: i do love when i can be helpful _and_ snarky at the same time ;)
[14:26:03] marostegui: I'm always happy to explain why things are your fault ;-)
[14:26:03] kormat: I never said you were helpful
[14:26:26] Emperor: this would have never happened on a mac
[14:27:09] lol
[14:27:22] marostegui: now a mac, what's next? emacs?
[14:27:37] That's scary, I never realised I was so close to emacs
[14:28:03] Amir1, Slow queries looks super nice and promising!
[14:28:26] ^^
[14:28:46] indeed, the total time would be nice, if possible to order by that
[14:30:27] yeah, I'm on it
[15:22:55] marostegui: jynus so for the summing problem, it's very likely that "actual" is set to a string in another part of core; to avoid this, I'm renaming the field (suggestion by Filippo). Here's the patch for adding the db and fixing the sum thingy https://gerrit.wikimedia.org/r/c/mediawiki/core/+/741927
[15:23:04] now the lovely question of who would review it
[15:28:53] I was in a meeting, let me read you
[15:29:38] :-(, can't kibana solve it on the frontend side?
[15:30:15] I know grafana can do some ugly hacks, but it's been a while since I did dashboards on kibana
[15:30:30] no, but that's fine. We control both sides
[15:30:46] yeah, but more work, more changes, more things that can break :-(
[15:30:56] Amir1: Wow, I thought it would have been solvable in the UI itself :(
[15:31:27] the geotags table in dewiki is lovely, it has 2M rows but its autoincrement value is at 2.4B and it will run out of auto_increment soon
[15:31:39] ha ha
[15:32:14] marostegui: the db host is definitely not fixable in the UI. while we are at it, we can fix sum too
[15:32:14] if ids are not used logically, that could be fixed by pushing, like in the Tokyo subway
[15:32:17] define soon!
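To put a number on "soon" for dewiki.geotags, the current auto_increment counter can be compared with the unsigned int ceiling. This is only a sketch against information_schema (the counter InnoDB reports there can lag slightly behind the live value), and it assumes gt_id is int unsigned, as confirmed further down in the log.

    -- gt_id is int(10) unsigned, so the hard ceiling is 4294967295.
    SELECT TABLE_SCHEMA, TABLE_NAME, AUTO_INCREMENT,
           4294967295 - AUTO_INCREMENT AS ids_remaining
    FROM information_schema.TABLES
    WHERE TABLE_SCHEMA = 'dewiki'
      AND TABLE_NAME = 'geotags';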
[15:32:59] I think it's doing an upsert instead of update + insert; each upsert wastes an autoincrement id
[15:33:07] I'll check
[15:33:19] jynus: lol
[15:34:31] it is also unsigned, so that is an easier fix
[15:35:09] I may be wrong, there is a db-backed.sql and an externally-backed.sql
[15:35:32] ah, I think it is signed, the other is for elastic
[15:36:49] it's unsigned
[15:37:01] | gt_id | int(10) unsigned | NO | PRI | NULL | auto_increment
[15:37:12] so the code is not well maintained :-)
[15:37:30] *pretends to be shocked *
[15:37:52] no, I was wrong, it is "int unsigned"; I just got both definitions confused
[15:45:45] I am going to start repooling db1163 slowly
[16:00:49] is anyone editing the mysql dashboard? I am getting some weird layouts, but it may be just me
[16:01:51] https://phab.wmfusercontent.org/file/data/tffjbfdmx6masbj2mcdq/PHID-FILE-w2xshydp3r6bhzuqzf3l/Screenshot_from_2021-11-25_17-01-20.png
[16:05:48] jynus: i get the same layout yeah
[16:08:15] the next item on my todo list for today is the FlaggedRevs db issues; I don't have the emotional capacity to work on it :/
[16:09:18] FR stands for FuckedRevs
[16:11:40] db1139:s1 should be almost production-ready now that I have recreated it on tendril (and it is on orchestrator)
[16:12:05] other than the lag and the wiki accounts; I will wait to see what the canonical setup is for those
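As a footnote to the geotags upsert theory above: a minimal, self-contained illustration (a throwaway table, not the real geotags schema) of why INSERT ... ON DUPLICATE KEY UPDATE burns auto_increment values even when it only updates an existing row.

    -- Throwaway demo table; hypothetical, not the real geotags definition.
    CREATE TABLE upsert_demo (
      id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      k  VARBINARY(32) NOT NULL,
      v  INT NOT NULL,
      UNIQUE KEY (k)
    );
    INSERT INTO upsert_demo (k, v) VALUES ('a', 1) ON DUPLICATE KEY UPDATE v = 1;
    INSERT INTO upsert_demo (k, v) VALUES ('a', 2) ON DUPLICATE KEY UPDATE v = 2;  -- takes the update path
    INSERT INTO upsert_demo (k, v) VALUES ('b', 1) ON DUPLICATE KEY UPDATE v = 1;
    -- With InnoDB's default autoinc handling, 'b' typically gets id 3, not 2:
    -- the second statement reserved an id before hitting the duplicate key,
    -- and that id is never reused.
    SELECT id, k, v FROM upsert_demo;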