[04:16:43] subbu: the dates on the keys over there are Feb 12 so it's not new at least... and that matches the date of the update on the wiki
[11:22:04] serviceops, observability, GitLab (Initialization), Patch-For-Review: Define monitoring for gitlab - https://phabricator.wikimedia.org/T275170 (Jelto) a:Jelto
[14:04:45] any other serviceopsen around? debugging an api latency issue with elukey in -ops, but I have a meeting in 25 minutes I'd like to get to if possible
[14:06:16] rzl: in what chan should we follow up?
[14:06:57] let's stay in -ops for the actual work, just hoping to get more folks' attention here
[14:07:34] rzl: it worked, thanks
[14:07:44] 🙏
[14:39:24] leaving a note after the latency issue - we didn't receive, afaics, any alert related to api-appserver latency regressions since this morning (7 UTC), but only the php-fpm idle workers ones
[14:44:58] serviceops, SRE, Patch-For-Review: bring 43 new mediawiki appserver in eqiad into production - https://phabricator.wikimedia.org/T279309 (Dzahn)
[14:48:29] rzl: sorry I didn't see it earlier
[14:49:25] elukey: the thing is that latency didn't spike to unacceptable levels
[14:49:32] (generally speaking)
[14:50:07] effie: mean latency tripled :D
[14:50:18] and p95 went above 2s
[14:50:20] I was about to say that, I was writing :p
[14:50:28] appserver get is ~250ms to ~350ms but api get is ~100ms
[14:50:59] but api get didn't go higher than ~350ms, and I think we have the same thresholds for those 2
[14:51:46] sure but there was a clear problem ongoing, so we may need to review the alerts a little, in my opinion
[14:52:38] we could bundle it with the SLO/SLI work, since clearly those 2 services need different thresholds
[14:52:48] sure
[14:57:04] serviceops, SRE, Release-Engineering-Team (Radar): Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (Dzahn) a:Dzahn
[14:57:15] serviceops, SRE, Release-Engineering-Team (Radar): Upgrade MediaWiki clusters to Debian Buster (debian 10) - https://phabricator.wikimedia.org/T245757 (Dzahn) mwmaint1002 done
[15:06:52] NAMESPACE NAME READY STATUS RESTARTS AGE
[15:06:55] istio-system cluster-local-gateway-585f96dccc-dtknd 1/1 Running 0 2m7s
[15:06:58] istio-system istio-ingressgateway-657b89d44d-wmqhg 1/1 Running 0 2m7s
[15:07:01] istio-system istiod-68d4cb6c9-4759f 1/1 Running 0 2m14s
[15:07:04] jayme: --^ \o/ \o/ \o/
[15:13:17] wheee, dancing time!
[15:13:22] great
[15:19:11] now knative :D
[15:20:21] see you next year, then :)
[15:26:13] I hope not! I have an interesting problem though, namely trying to add a TLS cert into knative's helm chart
[15:26:31] since knative's net-istio config is the one that configures the istio gateway pod
[15:26:53] (so we'll have an inference.wikimedia.org VIP, and TLS termination on the ingress pod)
[15:28:42] I'm not sure I understand. But I'm also a bit tired already so I'll probably read that again tomorrow :)
[15:31:26] sure I can try to explain tomorrow if you have time, but basically knative is the one that creates the Istio Gateway CRD config
[15:32:02] configuring the istio ingress
[15:32:32] istioctl's operator specs are very limited, afaics only configuring the base settings for the ingress gw
[15:32:38] (like L4 settings)
[15:33:35] (see my comment in https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/699380/11/charts/knative-serving/templates/net_istio.yaml)
[15:45:03] serviceops, Anti-Harassment, SRE, Traffic: Add IP Info (ASN & Geolocation) to requests to MediaWiki - https://phabricator.wikimedia.org/T251933 (Niharika)
[15:51:34] apergos, I see. so, this was at a coffee shop ... and I accepted the connection to submit my patches to gerrit .. but back home, i don't see host key change warnings.
[15:51:55] um
[15:51:59] same laptop, subbu??
[15:52:04] yes.
[15:52:07] fuck
[15:52:10] might be mitm
[15:52:12] well that sucks
[15:52:39] don't use that coffee shop to push gerrit patches again I guess :-/
[15:52:50] arlo was wondering if the coffee shop was filtering connections on port 22.
[15:52:59] huh
[15:53:13] right, i won't in the future.
[15:53:42] what a drag
[15:54:00] at least problem-solved we know it wasn't the host keys :-/
[15:54:20] yup.
[18:13:25] serviceops, Technical-blog-posts, Datacenter-Switchover: Story idea for Blog: June 2021 DC Switchover - https://phabricator.wikimedia.org/T286080 (srodlund) @Legoktm and @wkandek Thank you for this (and your patience while I was out on vacation). I will move this over to the blog this week to format...
[18:49:26] serviceops, MW-on-K8s, SRE, Release-Engineering-Team (Radar): The restricted/mediawiki-webserver image should include skins and resources - https://phabricator.wikimedia.org/T285232 (mmodell) >>! In T285232#7199879, @Joe wrote: > I am starting to think that we should just mount a volume from the...