[10:17:55] Which SRE team owns wikitech-static? T376400 needs a suitable tag allocated
[10:17:55] T376400: Redesign wikitech-static - https://phabricator.wikimedia.org/T376400
[10:19:22] I'd guess infrastructure foundations...
[10:19:25] no
[10:19:26] no
[10:19:32] also that task has no content
[10:19:42] so I'd ask the author what they have in mind
[10:20:19] currently no one really owns wikitech-static TTBOMK
[10:20:49] wikitech-static is a mix of things: the wikitech-static part, which is mostly unowned; the meta-monitoring bits, which are owned by o11y; and the status-page redirect that was set up by chris
[10:20:56] see https://wikitech.wikimedia.org/wiki/Wikitech-static
[10:22:01] I can use the SRE-unused tag and seek clarification from the submitter in the meantime
[10:22:11] unowned, even, more coffee required
[10:24:34] Emperor: that's a great typo, if only we could consider unused anything that is unowned and then just undeploy them, it would be perfect :D
[10:24:35] yeah, I had a look through that article first to see if it would provide an obvious answer. It didn't :)
[10:24:43] volans: ISAGN :)
[11:55:52] Did something change during my sabbatical?
[11:55:53] [11:55:27] marostegui@puppetmaster1001:~$ sudo -i puppet-merge
[11:55:53] To ensure consistent locking please run puppet-merge from: puppetserver1001.eqiad.wmnet
[11:55:53] [11:55:30] marostegui@puppetmaster1001:~$ hostname
[11:55:53] puppetmaster1001
[11:55:58] Yes
[11:56:03] puppetserver1001
[11:56:07] (I think)
[11:56:12] Ah yes, I am stupid
[11:56:14] Oh yes, it says so in the comment :D
[11:56:17] (so that hasn't changed)
[11:56:27] thanks sobanski
[11:56:59] I forwarded you the relevant email from Luca, so it should be at the top of your inbox
[11:57:47] marostegui: yes, puppet private and puppet-merge have been moved to the puppetservers
[11:57:56] as we're getting rid of the old puppetmasters
[11:58:53] ObEarWorm: https://www.youtube.com/watch?v=E0ozmU9cJDg
[12:02:58] marostegui: the same applies to commits for the private repo, BTW. These are now also happening on puppetserver1001
[12:04:02] the path for that is now /srv/git/private, that's the only main difference
[12:04:57] Ah, thank you guys
[12:09:03] just got the email sobanski, thanks; it was indeed sent during the sabbatical :)
[12:50:12] <_joe_> marostegui: also, next week no more requestctl sync, but I guess you don't use that as much :D
[12:51:17] _joe_: I saw an email there from you with breaking changes, is that it? I haven't read it yet
[12:51:34] <_joe_> yep
[15:03:26] hnowlan: is thumbor entirely OK? Since 2nd Nov there's been a small rise in 503s, which I think is coming from thumbs rather than swift itself, cf https://grafana.wikimedia.org/goto/M6-o-qWNg?orgId=1
[15:06:00] There's a corresponding rise in 5xx from thumbor in eqiad
[15:07:28] would seem to be one pod in particular misbehaving
[15:11:16] I'm gonna kill it and let the deployment recreate it to see if it recovers
[15:11:34] TY
[15:22:55] XioNoX: should we merge T378744?
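(Editor's note: the puppet-merge and private-repo change described around 11:57–12:04 above might be sketched as the runbook fragment below. Only the host name, `sudo -i puppet-merge`, and the `/srv/git/private` path come from the conversation; the file name and commit message are illustrative assumptions.)

```shell
# Sketch of the workflow as described above; not a verified runbook.

# 1. Merging public puppet changes now happens on the puppetserver,
#    not on the old puppetmaster:
ssh puppetserver1001.eqiad.wmnet
sudo -i puppet-merge

# 2. Commits to the private repo also happen on puppetserver1001;
#    the main difference is the new repo path:
cd /srv/git/private
sudo git add modules/secret/secrets/example.yaml   # hypothetical file
sudo git commit -m "example: add new secret"       # illustrative message
```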
[15:23:01] T378744: GeoDNS: consider sending CN to eqsin - https://phabricator.wikimedia.org/T378744
[15:23:23] as long as you and/or topranks are around, that is
[15:26:04] sukhe: they're both in the I/F meeting right now; we just talked about it though, lgtm
[15:26:20] oh thanks cdanis
[15:27:00] ok, merging, and we can keep an eye out for whether it helps resolve the ulsfo link saturation
[15:30:41] claime: thanks for handling that
[15:31:52] that is https://phabricator.wikimedia.org/T374350, which seems to be getting more frequent. Unfortunately I think it's something that came out of the bookworm update, which is going to be painful to debug
[15:32:03] so I'll try to spend some time looking at this in more detail during the week
[15:32:31] ack, for now it seems to have remediated the immediate issue
[15:32:35] hopefully it's just some kind of haproxy behaviour we can tweak
[15:34:13] sukhe: thanks, will take a look at the graphs later and try to gauge how it's changed things
[15:34:28] topranks: thanks, merged
[15:55:42] if there's a runbook for the hard-of-thinking swift admin to handle this sort of thing and you'd rather I did that than kept bugging you, do point me at it :)
[15:58:00] good idea, I'll put something on wikitech after the ritual
[16:08:11] Emperor: I basically checked this graph for one pod having an unusual error rate: https://grafana.wikimedia.org/goto/7ob7sqZNg?orgId=1
[16:10:18] <_joe_> we should probably add an alert :)
[16:11:16] I would then need to remind myself how to connect to the right k8s cluster and zot the correct pod (see above re hard-of-thinking swift admin)
[16:21:55] <_joe_> Emperor: looks like a skill worth sharpening a bit, like it is for me to know how to depool a db replica
[16:28:48] I don't disagree; it's just something I do so infrequently that I'm starting from scratch each time
[16:48:44] idm seems to be down?
[16:48:50] https://idp.wikimedia.org/
[16:48:56] > upstream connect error or disconnect/reset before headers. reset reason: connection failure
[16:49:05] yep, seems so
[17:02:52] IDP or IDM?
[17:02:58] * Lucas_WMDE isn’t sure what the difference between them is anyway
[17:03:37] ah, IDM’s single sign-on button sends you to IDP, never mind I guess
[17:12:14] ^ moritzm: puppet's been disabled on the idp host by you for a while, ok to enable it?
[17:18:45] tomcat needs a bump it seems, but yeah, Puppet being disabled might also have something to do with it
[17:21:13] Hmm, out of memory
[17:24:08] IDP is back; I'll go through the logs and try to figure out what happened
[17:24:16] thanks!
[17:24:34] I mean, what happened is that it ran out of memory, but why?
[20:27:47] Hey! I have a somewhat urgent access request for ebernhardson to help debug and test issues with dumps. Could someone review and merge T379025 / https://gerrit.wikimedia.org/r/c/operations/puppet/+/1087239 ?
[20:33:42] gehel: taking care of it
[20:34:42] volans: thanks!
[20:36:10] gehel: running puppet now on 'R:Group = "snapshot-admins"', it matches snapshot[1010-1017]
[20:36:26] sounds correct?
[20:36:32] yes it does
[20:37:33] gehel: all done, he should be able to access now
[20:38:38] Big thanks!
[20:38:50] anytime :)
[20:46:52] mutante, denisse: I'm going to bounce Cassandra on sessionstore to pick up a JDK update. Expecting zero impact, but it's sessions, so I figure you ought to be aware!
[20:47:41] urandom: ack. Thanks!
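(Editor's note: since a runbook for the thumbor pod procedure was requested above at 15:55, here is a hedged sketch of the steps described at 16:08 and 15:07–15:11. Only "check the per-pod error-rate graph, then delete the pod and let the deployment recreate it" comes from the conversation; the namespace, pod name, and how kubectl gets pointed at the right cluster are assumptions, not verified values.)

```shell
# Hedged sketch of the pod-zapping procedure discussed above.
# Namespace and pod names are illustrative assumptions.

# 1. Check the per-pod error-rate dashboard (Grafana link at 16:08)
#    and note the name of the misbehaving pod.

# 2. Point kubectl at the right cluster/namespace (the exact kubeconfig
#    or helper to use is environment-specific), then list the pods:
kubectl --namespace thumbor get pods -o wide

# 3. Delete the bad pod; the Deployment's ReplicaSet notices the
#    missing replica and schedules a fresh one automatically:
kubectl --namespace thumbor delete pod thumbor-main-6b7f9cdd4-xyz12

# 4. Confirm the replacement pod reaches Running and the 5xx rate
#    on the dashboard recovers:
kubectl --namespace thumbor get pods -w
```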