[01:09:45] PROBLEM - MariaDB sustained replica lag on m1 on db1117 is CRITICAL: 12.2 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1117&var-port=13321
[01:10:01] PROBLEM - MariaDB sustained replica lag on m1 on db2160 is CRITICAL: 10.8 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2160&var-port=13321
[01:11:19] RECOVERY - MariaDB sustained replica lag on m1 on db1117 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1117&var-port=13321
[01:11:35] RECOVERY - MariaDB sustained replica lag on m1 on db2160 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2160&var-port=13321
[05:21:25] PROBLEM - MariaDB sustained replica lag on m1 on db1117 is CRITICAL: 73.8 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1117&var-port=13321
[05:21:53] PROBLEM - MariaDB sustained replica lag on m1 on db2160 is CRITICAL: 87 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2160&var-port=13321
[05:23:37] PROBLEM - MariaDB sustained replica lag on m1 on db2132 is CRITICAL: 11.2 ge 2 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2132&var-port=9104
[05:26:49] RECOVERY - MariaDB sustained replica lag on m1 on db2132 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2132&var-port=9104
[05:31:05] RECOVERY - MariaDB sustained replica lag on m1 on db1117 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db1117&var-port=13321
[05:31:33] RECOVERY - MariaDB sustained replica lag on m1 on db2160 is OK: (C)2 ge (W)1 ge 0 https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Replication_lag https://grafana.wikimedia.org/d/000000273/mysql?orgId=1&var-server=db2160&var-port=13321
[09:25:53] So I checked the Unpollable hosts (last 30m) issue, and the full stack (client and server) seems to be working fine (servers serve metrics, and prometheus collects them)
[09:47:50] So I will file a ticket for obs, as I cannot see anything obvious, nor any related puppet change
[11:26:57] Hm, query-media-file doesn't want to talk to me
[11:37:57] * Emperor emails the author for clues ;-)
[11:48:38] that needs mediabackup permissions (or sudo as root)
[11:49:06] "The script should be run as root (technically, the only rights needed are those of the system user: mediabackup: sudo -u mediabackup query-media-file)."
[11:49:23] * jynus tells Emperor to gently read the docs :-D https://wikitech.wikimedia.org/wiki/Media_storage/Backups#Querying_files
[11:50:15] although I guess I could add a check rather than what I guess is an ugly exception
[11:53:03] jynus: huh, I was running it with sudo
[11:53:12] (which I think means as root)
[11:53:15] what error do you get?
[11:53:35] the backtrace I emailed you was from running "sudo query-media-file"
[11:54:54] If I try and run it with "sudo -u mediabackup" I instead get "PermissionError: [Errno 13] Permission denied: '/var/log/mediabackups/query.log'"
[11:55:58] interesting, there must be some config or network issue
[11:57:22] Same set of errors on ms-backup2002 also
[11:58:46] root and mediabackup both work on ms-backup1001; root works on ms-backup1002 but mediabackup doesn't (same error as in codfw)
[12:05:56] I wonder if this is related to the prometheus issue I filed
[12:06:51] or it happened because of missing grants
[12:07:56] or maybe a config issue
[12:14:25] I see, it was both a file permission error on the logs, which were not puppet-handled
[12:14:42] and a configuration error in the db config, which was still pointing to the wrong db
[12:18:01] I see, I updated the wrong db to query
[12:18:17] And changed the mediawiki metadata database instead of the internal mediabackup db
[12:19:36] Hm, now it works, but I can't find e.g. the files that Platonides commented on
[12:21:24] e.g. https://phabricator.wikimedia.org/P43158#175674
[12:21:35] yeah, the log, I fixed it by hand
[12:21:57] I don't usually use the 1002 host, so when run as root the mediabackup user didn't have permissions on the logs, my fault
[12:22:09] that needs puppet/package fixing
[12:22:20] the other one, codfw, is a mistake with grants
[12:22:42] when setting up the new dbs last quarter
[12:22:53] I am trying to fix that, it is a mess
[12:22:58] thanks
[12:23:03] sorry for giving you more work!
[12:23:08] no, thanks to you for pointing it out
[12:23:21] we only use that for the things you know about
[12:23:46] so from one quarter to the other, after heavy refactoring, I should have the monitoring to make sure they are working
[12:26:30] Emperor: codfw works now, but it is very slow- it may be missing more stuff
[12:27:26] do queries take a lot of time for you?
[12:27:43] because it is supposed to answer in milliseconds
[12:37:56] I was working on an eqiad node, and it said "no" pretty quickly
[12:38:33] ok, I think I am sometimes getting bad query plans, so it scans the full table- making it very slow (it should be fast)
[12:38:49] it should generally take milliseconds
[12:39:02] codfw should work now, though
[12:39:16] I will do some patches to make sure the problems don't recur
[12:40:25] sorry about that
[12:41:44] NP; there's also evidently some other source of information about media, though, 'cos https://phabricator.wikimedia.org/P43158#175659 knows about e.g. Rapper_Yung_$hade_4.png but when I ask query-media-file I get no matching file found
[12:44:17] (or I'm driving it wrong, still)
[12:52:30] no, the backup keeps its own metadata consistent with the backup time
[12:53:08] if you want up to date info, you will have to use the classes to query the mediawiki dbs directly
[12:53:21] this info is not real time, as it is intended for backups
[12:53:47] the media query tool is for querying backups, not mediawiki
[12:54:52] the cli, I mean; obviously the classes can query real-time data, but query-media-file queries the backup snapshot
[12:55:30] ...would that matter here unless the file was very recently deleted?
[12:56:29] I think if they are long deleted, yes, for one reason
[12:56:38] oh?
[12:56:47] you are querying by container name + path
[12:57:07] Mmm
[12:57:10] if those files were deleted
[12:57:28] the file is renamed on deletion and moved to a different container
[12:57:47] I mean, "yes (depending on what you are trying to do)"
[12:58:09] it should show the deleted file if you search by file name
[12:58:42] but it will be on a container like ....-private .jpeg
[12:58:52] and only the metadata will know its original name
[12:58:58] the interactive query-media-file isn't offering me filename, just container & full path or Title of the file on upload
[12:59:09] yes, the first option is filename :-D
[12:59:16] filename == Title
[12:59:32] Hm, well I put in Rapper_Yung_$hade_4.png and it said it didn't find one
[12:59:42] that's what Platonides did
[12:59:55] checking mediawiki in a different way
[13:00:29] Sorry to keep asking stupid questions, but how can I query mediawiki thus? Presumably there is some tool that I've just not found?
[13:00:30] let me double check with that example on the live dbs
[13:00:38] SQL :-D
[13:00:54] and learning the intricacies of how it works :-P
[13:03:12] ok, so it is not in the backups because it was deleted recently
[13:03:26] root@dbstore1007[commonswiki]> SELECT * FROM filearchive WHERE fa_name = 'Rapper_Yung_$hade_4.png';
[13:04:02] filearchive is the table that contains (some of) the deleted files' metadata on production
[13:04:52] OK
[13:04:53] sadly there is no api to query non-public mw metadata
[13:05:35] I asked for one here: https://phabricator.wikimedia.org/T267365 feel free to add to the voices
[13:07:32] my offer still stands to load your log into the db, then do some join to cross-reference the backup or the mw live db
[13:08:52] * jynus hopes Emperor keeps his sanity when he learns about the 5 different ways a file can be deleted, hidden or suppressed :-D
[13:08:53] My problem at the moment is that there is too wide a gap between what I know ("here is a list of 27,143 objects that I need to find out what MW thinks about") and what I need to know about the various states objects might be in and how one is meant to query them, if you see what I mean?
[13:09:40] yes, you lack tooling, I 100% agree
[13:09:47] as in, mw lacks it
[13:09:51] Now it sounds like I could make a first pass by trying to look up each path in filearchive and seeing if there are matches
[13:10:03] so there are 3 tables
[13:10:26] image, filearchive and old_image
[13:10:34] where there can be mw metadata
[13:11:37] that is the part that I have tried to implement in my own way- but I do long listings, not searches, so not sure if it will be helpful for you
[13:11:48] [my normal approach to this sort of thing would be to do a few "by hand" to get a feel for the likely answers, and then try and knock up a script to do the full operation in bulk]
[13:12:12] the problem is there is too much variability in mw internals
[13:13:00] right, and likewise no (straightforward) way to ask mw itself what it thinks about an object, so you end up trying to infer what mw might think from what database entries there are?
[13:13:21] the answer is more like: if you find it, it is ok
[13:13:33] if you don't find it, "who knows?"
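A minimal sketch of the kind of bulk-lookup script discussed above: for each title in a list, check the three MediaWiki tables that can hold file metadata. The host, database, credentials file, input file name and exact table/column names are assumptions (not taken from this log); verify them against the live MediaWiki schema before relying on the output.

#!/usr/bin/env python3
# Hedged sketch: bulk-check file titles against the three MediaWiki tables
# that can hold file metadata. Connection details and the input file name
# are placeholders, not the real setup.
import pymysql

# (table, name column) pairs; confirm against the MediaWiki schema in use
TABLES = [("image", "img_name"), ("oldimage", "oi_name"), ("filearchive", "fa_name")]

def classify(cursor, title):
    """Return the list of tables in which the title appears."""
    found = []
    for table, column in TABLES:
        # table/column come from the fixed whitelist above, so the f-string is safe
        cursor.execute(f"SELECT 1 FROM {table} WHERE {column} = %s LIMIT 1", (title,))
        if cursor.fetchone():
            found.append(table)
    return found

def main():
    # placeholder connection details; point at an analytics replica, not a primary
    conn = pymysql.connect(host="dbstore1007.eqiad.wmnet", database="commonswiki",
                           read_default_file="~/.my.cnf", charset="utf8mb4")
    try:
        with conn.cursor() as cursor, open("object_list.txt") as titles:
            for line in titles:
                title = line.strip()
                if not title:
                    continue
                tables = classify(cursor, title)
                print(f"{title}\t{','.join(tables) or 'NOT FOUND'}")
    finally:
        conn.close()

if __name__ == "__main__":
    main()

Titles use underscores instead of spaces (as in the Rapper_Yung_$hade_4.png example); roughly speaking, a hit in filearchive points at a deleted file and a hit only in the current tables at a live one, but as noted in the chat there are several ways a file can be deleted, hidden or suppressed, so treat this only as a first pass.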
[13:14:12] this is what Amir1 said about the lack of maintenance on mw file storage- there are dragons here
[13:15:57] the data model is very very complex- that is why I have the cleaned-up version- in the form of backups, if that helps you
[13:16:12] but obviously that is not real time by a long way
[13:18:11] Mmm
[13:18:49] alternatively, you can use the code as documentation of every single mw exception I have found so far
[13:18:56] I should go and patch our rclone, but. Is there a doc somewhere of what image / filearchive / old_image contain/mean?
[13:19:09] jynus: "yay" :)
[13:19:37] there is the reverse engineering I did for the backups design, if that helps
[13:20:46] check if it helps: https://docs.google.com/document/d/1kmaDIrae4HsE1w8x7xRxM_Hv2-gO0VNTvif240aQW4A/edit#heading=h.qyjlov6tc6dx
[13:21:15] section "Current MediaWiki Swift Storage system"
[13:21:55] TY
[13:22:22] but the code will have the most infamous details
[13:22:35] e.g. wikimedia commons is wikipedia commons
[13:22:39] on swift, etc
[13:23:41] I would also like it if the code you write is shared so we can both reuse it
[13:24:45] I am not responsible for your sanity after reading that, though :-)
[13:25:27] I think I need to first patch rclone to address https://phabricator.wikimedia.org/T327269 ; then I'll have a poke round and see if I can reproduce the answers on https://phabricator.wikimedia.org/P43158
[13:25:31] I've started to think that storing files with names as hashes in a custom base36 wasn't that bad
[13:25:39] I suspect this is going to take me A While
[13:25:50] I am here for you, brother
[13:30:31] these queries may also help to understand the meanings of the metadata: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/mediabackups/+/refs/heads/master/mediabackups/MySQLMedia.py#34
[13:33:53] * Emperor takes notes :)
[15:44:55] jynus: yesterday I managed to take a complete backup of toolsdb with mariabackup, and only 10 minutes of read lock...
[15:45:08] I immediately ran into another problem though (see the last comment in T301949)
[15:45:09] T301949: ToolsDB upgrade => Bullseye, MariaDB 10.4 - https://phabricator.wikimedia.org/T301949
[15:45:45] have you ever seen a similar error while running '--prepare'?
[16:24:04] yes, when there was data corruption or a software bug
[16:33:21] can the backup complete successfully if there is corruption in the database? or do you think the corruption is only in the backup that was generated?
[16:36:10] You are asking very specific questions I may not know the answer to, but to the best of my ability: yes / I don't know
[16:36:27] if I had to guess, it is the same issue as before, just showing up in the preparation phase
[16:36:36] interesting, thanks
[16:37:01] the thing is, mariabackup works by copying the db uncleanly
[16:37:18] then performing the backup recovery at the --prepare phase
[16:38:13] so whether it is an inconsistency, or not an inconsistency but a software bug / issue like the one discussed, it doesn't matter when the problem shows up
[16:38:30] yes makes sense
[16:38:32] it is a bit random, depending on the traffic
[16:38:52] but if it fails like that, I wouldn't trust the result
[16:39:45] yes agreed, do you know if there's a way to prevent DDL statements for a few hours while the backup is running? (hoping that those stmts are the issue)
[16:39:50] not sure if an alternative method (snapshots + recovery) could help you?
[16:39:57] or you may find the same issue
[16:40:12] hmm define 'snapshots'?
[16:40:27] at this moment I would be worried enough to create a logical backup of as much data as possible
[16:40:53] a big mysqldump?
[16:40:56] and start setting up a replica in pieces even if the process takes a few days
[16:41:18] not with that tool, as it is toooooo slow, but yes, e.g. database by database if needed
[16:41:44] yeah I don't think mysqldump would finish in a reasonable time... which tool would you consider?
[16:42:15] the one we use in production, precisely because we had corruption / bugs in the past in parallel to mariabackup, is called mydumper
[16:42:56] it uses compression and parallelization, so it is still slow (export + import) but is usually more reliable in those cases
[16:43:15] and much faster than mysqldump
[16:43:47] it's on Debian, should work well for 10.1
[16:44:04] I would try the snapshotting first
[16:44:42] which is basically freezing a copy of the filesystem (at lvs level, or I'm not sure if your vm allows you to do that too)
[16:45:05] yeah I saw the LVS option but I'm not sure it's supported, I can check though, it might be
[16:45:10] then copy the snapshot away and discard it, then recover it
[16:45:30] for innodb it should work, but maybe you will hit the same issue
[16:45:45] but it will be much faster than a logical recovery
[16:46:07] sorry, lvm
[16:46:59] yeah lvm
[16:47:00] I mixed my acronyms as I am lately talking more about lvs than lvm :-D
[16:47:10] hahah same
[16:47:53] if you have space, it should always be available- whether that fixes your problems is a different issue :-D
[16:48:05] (if you are using lvm, I mean)
[16:48:57] but if your question is "what would you do in my position"- try getting a snapshot and recovering the copy elsewhere, then try mydumper as the last option
[16:50:00] thanks, that's very helpful!
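A minimal sketch of the snapshot-copy-discard flow described above, assuming the ToolsDB datadir lives on an LVM logical volume. The volume group, logical volume, snapshot size, mount point and destination path are made-up placeholders, not the real setup.

#!/usr/bin/env python3
# Hedged sketch: LVM snapshot -> copy to another volume -> drop the snapshot.
# VG/LV names, snapshot size, mount point and destination are placeholders.
import subprocess

VG = "vd"                               # placeholder volume group
LV = "tools-db"                         # placeholder LV holding the datadir
SNAP = "toolsdb-snap"
MNT = "/mnt/toolsdb-snap"
DEST = "/srv/backup-staging/toolsdb"    # on a different volume with free space

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. freeze a point-in-time copy; the snapshot only needs space for the
#    writes that happen while it exists, not a full second copy of the data
run(["lvcreate", "--snapshot", "--size", "50G", "--name", SNAP, f"/dev/{VG}/{LV}"])
try:
    run(["mkdir", "-p", MNT])
    run(["mount", "-o", "ro", f"/dev/{VG}/{SNAP}", MNT])
    try:
        # 2. copy the frozen datadir off to the other volume
        run(["rsync", "-a", f"{MNT}/", f"{DEST}/"])
    finally:
        run(["umount", MNT])
finally:
    # 3. drop the snapshot as soon as possible so its copy-on-write space
    #    (and the extra IO it causes) stops growing
    run(["lvremove", "-f", f"/dev/{VG}/{SNAP}"])

The copy is only crash-consistent, so bringing MariaDB up on it still goes through InnoDB crash recovery, which is where the same corruption could in principle show up again, as jynus warns above.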
[16:50:17] I will try the snapshot option first
[16:50:25] sorry, it is an ugly position you are in
[16:51:04] you are right about the snapshot: if you have full or almost full partition utilization and heavy IO you may run out of space first
[16:52:25] the key blocker there was losing the replica :-(
[16:52:25] yes, I have space on a separate volume, but I don't think I have space for a full copy on the same volume
[16:52:41] dhinus: oh, you don't need to have double the capacity
[16:52:45] only for the diffs
[16:52:50] while the snapshot is active
[16:52:51] right, so maybe it's possible
[16:53:04] snapshot + copy to other volume + delete snapshot
[16:53:06] it will depend on how fast you can copy out the snapshot
[16:53:16] and how much io you have
[16:53:47] re: losing the replica, yes I agree that was ages ago, but I want to get into a state where we have a complete and working replica, because that's the only way to avoid similar issues in the future
[16:54:40] I think that will mean splitting out some of the databases, because when the replica was working, there were many issues and that led to a few databases getting excluded from replication
[16:55:07] in both cases you could play with filters
[16:55:08] I expect those issues will happen again, but excluding some dbs from replication is not a good long-term strategy :)
[16:55:24] so in the place where you are setting up the new instance
[16:55:30] start replicating from the original
[16:55:46] ignoring some dbs
[16:55:57] that way you can do it in "pieces"
[16:56:07] (be it on a single db or several)
[16:56:10] then I could back up those dbs from the old primary, to a separate instance only for that db
[16:56:26] yes that's more or less what I was thinking
[16:56:28] yeah, in any method you want
[16:56:44] or more specifically, in any that works :-S
[16:56:49] heheheh :D
[16:56:56] sorry I am not of much help
[16:57:00] thanks, I have a couple meetings now, but will play with this tomorrow
[16:57:09] you've been _very_ helpful actually :)
[16:57:13] I've been there and the key is getting out in some way
[16:57:24] then getting healthier
[16:57:30] mostly with trial and error
[16:57:33] makes sense
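A minimal sketch of the "replicate in pieces" idea at the end of the log: point a freshly restored instance at the current primary while ignoring some databases, so those can later be dumped and moved to a separate instance. Host names, the ignore list, binlog coordinates and the use of dynamically settable filter variables are all assumptions; depending on the MariaDB version the filters may need to go into the server config file instead.

#!/usr/bin/env python3
# Hedged sketch: configure a new replica to ignore a few databases while it
# replicates the rest from the old primary. All names and coordinates are
# placeholders; verify the filter variables are settable on your version.
import pymysql

IGNORED_DBS = ["s51234__heavy_tool", "s52345__other_tool"]  # hypothetical examples

conn = pymysql.connect(host="new-toolsdb-replica.example", read_default_file="~/.my.cnf")
try:
    with conn.cursor() as cursor:
        # wild-ignore patterns are usually safer than replicate_ignore_db,
        # which only filters statements issued after USE <db>; replication
        # must not be running yet when the filters are changed
        cursor.execute("SET GLOBAL replicate_wild_ignore_table = %s",
                       (",".join(f"{db}.%" for db in IGNORED_DBS),))
        # coordinates recorded when the snapshot/backup was taken (placeholders);
        # credentials for the replication user are left out here
        cursor.execute("""
            CHANGE MASTER TO
              MASTER_HOST = 'old-toolsdb-primary.example',
              MASTER_USER = 'repl',
              MASTER_LOG_FILE = 'log.000123',
              MASTER_LOG_POS = 4
        """)
        cursor.execute("START SLAVE")
finally:
    conn.close()

The databases left out by the filter can then be exported separately (e.g. with mydumper, as discussed above) and loaded into their own instance, which is the "splitting out some of the databases" step dhinus mentions.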