[07:30:06] Thanks RhinosF1, we'll take care of that (cc: moritzm)
[07:32:24] RhinosF1: thanks for the note; I've just blocked the account: https://idm.wikimedia.org/wikimedia/log/
[07:33:37] elukey: FYI I'm going to pick up from where btullis left off yesterday with the an-test-druid1001 reimaging. I'll let you know whether it worked or not
[07:33:40] Thanks moritzm jayme
[08:14:48] elukey: the cookbook seems to be running just fine. We're down to the last reboot and puppet run
[08:33:00] mathoid has been having way more traffic than usual https://grafana.wikimedia.org/goto/NmjRRqVvR?orgId=1
[08:33:09] working on finding out what
[08:55:43] 600rps?
[08:55:54] sorry, +600rps
[08:56:21] actually no, usually it's around 5 or so
[08:56:32] this is +800 rps and rising
[11:05:36] Hi folks, I just merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/1225526 that disables HTTP redirects to the Swift backend for all the registry hosts. This should help reduce issues when pushing images, but if you see other weird errors please let me know.
[11:05:41] cc: dancy, hashar
[15:07:41] urbanecm: I am reading your task
[15:08:30] marostegui: you're very quick! thank you ❤️
[15:08:43] urbanecm: So essentially all the wikis on wikipedia.dblist right?
[15:08:46] yeah
[15:08:50] ok let me check
[15:09:32] (if the wiki is in `closed.dblist` or `private.dblist`, it wouldn't need it, but it can also just exist there empty, i'll leave that up to you)
[15:10:49] yeah, i am checking the wikis that do not have it
[15:11:52] marostegui: i'm happy to temporarily disable the feature if needed. i'm not sure how quick the fix would be.
[15:12:00] it should be quick
[15:12:08] thank you
[15:13:07] urbanecm: Is it expected that some wikis do not even have the table? eg: kncwiki?
[15:13:47] This is the list of wikis with no column, I will quickly create it: https://phabricator.wikimedia.org/P87471
[15:13:52] You can also see the wikis without the table
[15:14:21] marostegui: at first sight, that sounds unexpected. i'll check that part.
[15:14:36] ok, I will get the column added to the other ones
[15:16:58] urbanecm: running the loop to add the column
[15:17:03] thank you
[15:19:19] urbanecm: done, I am now double checking only wikis with no table show up
[15:20:29] all of those wikis are very new. it seems https://phabricator.wikimedia.org/T364308 is no longer working for some reason.
[15:21:25] urbanecm: right, I see - on my side all is good again: https://phabricator.wikimedia.org/P87471#351924
[15:21:28] in any case, it is not being accessed there (yet). i'll file a task for figuring out why https://phabricator.wikimedia.org/T364308 is not working, and i can create the tables once needed myself.
[15:21:46] marostegui: thank you VERY much for such a quick fix!
[15:21:47] cool
[15:21:51] no problem!
[15:59:42] elukey: Thanks! (Regarding Registry HTTP redirects)
[16:07:58] dancy: hope it will have a good effect!
[16:43:34] to the on-callers (and also dancy) - I restarted the docker-registry-swift daemons on registry* to pick up my earlier change, apparently puppet's refresh didn't really work. All good from logs etc.. pov, lemme know if you see anything weird
[16:47:28] uh, for some minutes now, I haven't been able to load anything from thanos -- not even the past 1h of graphs on the front page of grafana (mostly from recording rules)
[16:47:35] is that just me, or
[16:48:05] * hnowlan looking
[16:48:52] cdanis: I saw some titan nodes having troubles in #operations
[16:50:38] I'm looking as well.
[16:51:09] titan1002 seems dead, queries using ridiculous cpu on titan1001
[16:53:44] titan1002 is back
[17:25:56] we're out of the woods, still investigating what caused the disproportionate impact
[17:28:49] <_joe_> s/what/who/
[17:29:08] <_joe_> whoever made the query-of-death will be put to the gallows
[17:31:08] I'm slightly worried it was me trying to load 90d of the mediawiki backend performance dashboard 😅