[00:21:25] serviceops: test pushing to phabricator repos over https - https://phabricator.wikimedia.org/T301889 (Dzahn) Open→Resolved Meanwhile this is done. Has been tested with multiple repos, docs have been created, some users have migrated to gitlab already etc. This link can be used to help users who wo...
[00:21:34] serviceops, Phabricator, Release-Engineering-Team (Next): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Dzahn)
[00:21:42] serviceops, Phabricator: test pushing to phabricator repos over https - https://phabricator.wikimedia.org/T301889 (Dzahn)
[00:22:43] serviceops, Phabricator: test pushing to phabricator repos over https - https://phabricator.wikimedia.org/T301889 (Dzahn)
[06:47:09] serviceops, Release-Engineering-Team, Scap: Deploy Scap version 4.4.1 - https://phabricator.wikimedia.org/T302464 (Joe) Open→In progress a:Joe
[06:47:15] serviceops, MW-on-K8s, Patch-For-Review, Release-Engineering-Team (Done by Feb 23🔥): Build MediaWiki images for kubernetes on the deployment servers - https://phabricator.wikimedia.org/T297673 (Joe)
[07:03:13] serviceops, Release-Engineering-Team, Scap: Deploy Scap version 4.4.1 - https://phabricator.wikimedia.org/T302464 (Joe) I installed the new scap on restbase-dev, the mw canaries and the deployment servers. Our smoke tests seem all to work, we will update the rest of production later today.
[07:26:12] serviceops, Prod-Kubernetes, Patch-For-Review: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (elukey) All nodes have Bullseye and the new partition layout for overlay. I have also enabled overlay via puppet, and manually added the `systemd.unified_cgroup_hierarc...
[09:24:20] serviceops, Add-Link, Growth-Team: MediaWiki should use service-proxy to connect to Add Link / Linkrecommendation - https://phabricator.wikimedia.org/T302719 (JMeybohm) Open→Resolved
[09:34:59] serviceops, Prod-Kubernetes, Patch-For-Review: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (JMeybohm) >>! In T302208#7743568, @elukey wrote: > The hosts seem ready to be added to the k8s codfw cluster, @JMeybohm lemme know how you want to proceed :) ❤️ From my...
[10:12:55] serviceops, Add-Link, Growth-Team: Improved alerts/awareness if helm deployment of a service fails - https://phabricator.wikimedia.org/T302744 (JMeybohm) > This is not uncommon with staging, sometimes it can take a minute or two. However the delay was long enough that my SSH connection cut out. I'd...
[11:17:10] serviceops, Add-Link, Growth-Team: Improved alerts/awareness if helm deployment of a service fails - https://phabricator.wikimedia.org/T302744 (JMeybohm) ` $ kubectl -n linkrecommendation get deployment,po NAME REA...
[11:21:36] jayme: thanks for your help!
[11:22:07] kostajh: sure!
[11:27:46] serviceops, MediaWiki-General, SRE, Patch-For-Review, Service-Architecture: Create a service-to-service proxy for handling HTTP calls from services to other entities - https://phabricator.wikimedia.org/T244843 (Joe) Open→Resolved
[11:28:22] <_joe_> jayme: ^^ aren't you a bit emotional?
we finally removed the last unencrypted backend
[11:29:12] _joe_: was about to start celebrating, but then I got paged :-p
[11:29:22] <_joe_> damn icinga
[11:29:24] eheh
[11:29:25] <_joe_> it didn't reload
[12:50:35] serviceops, Product-Infrastructure-Team-Backlog, SRE, Maps (Geoshapes), and 2 others: New Service Request geoshapes - https://phabricator.wikimedia.org/T274388 (MSantos)
[12:51:57] serviceops, Product-Infrastructure-Team-Backlog, SRE, Maps (Geoshapes), and 2 others: New Service Request geoshapes - https://phabricator.wikimedia.org/T274388 (MSantos) > Set up the traffic layer to send traffic to the service (if needed). This is a bit unclear to me currently. I am not sure fro...
[13:19:03] serviceops, Add-Link, Growth-Team: Improved alerts/awareness if helm deployment of a service fails - https://phabricator.wikimedia.org/T302744 (kostajh) Open→Resolved a:kostajh >>! In T302744#7744062, @JMeybohm wrote: > ` > $ kubectl -n linkrecommendation get deployment,po...
[13:48:14] serviceops, GitLab (Infrastructure), Patch-For-Review: Migrate gitlab-test instance to puppet - https://phabricator.wikimedia.org/T297411 (Jelto)
[13:53:27] serviceops, GitLab (Infrastructure), Patch-For-Review: Migrate gitlab-test instance to puppet - https://phabricator.wikimedia.org/T297411 (Jelto) SSH access to the test instance is not working because of different networking behavior on WMCS/VPS. The public floating IP ("service ip") is NATed to the...
[14:02:56] serviceops, Scap, Patch-For-Review, Release-Engineering-Team (Doing): scap's canary check gives confusing logstash link - https://phabricator.wikimedia.org/T291870 (hashar) Open→Declined My idea was to use a semantic tag instead of a hardcoded list of hosts. Then revisiting this tasks a...
[17:56:52] hi service ops!
An organization reached out to me asking if their requests to one of our APIs are being throttled, as their requests had apparently drastically gotten slower. I was wondering if that was an unintended (or intended) byproduct of the throttling done during the recent botnet attack. Anyone around to talk through that as a possibility?
[18:09:26] nikkinikk: heyo, should be unrelated but we can confirm
[18:10:35] do you have an IP address or UA string we can use to pull out logs? DM me please instead of posting them here, since it's a public channel
[18:11:02] yep! will do
[18:16:52] serviceops, GitLab (Infrastructure), Patch-For-Review: Migrate gitlab-test instance to puppet - https://phabricator.wikimedia.org/T297411 (Majavah)
[18:51:37] just for follow up, the org emailed back after Reuven suggested requesting more info about the latency and said their issue was resolved. no action needed 🙃
[20:12:38] serviceops, SRE, observability: aggregate mismatched wikiversions alert - https://phabricator.wikimedia.org/T302832 (CDanis)
[20:19:44] rzl, nikkinikk: could it be envoy & http 1.1
[20:37:55] RhinosF1: extremely unlikely
[20:40:04] I think we turned off most of the edge-facing envoy already, if not all
[20:40:24] (if you're talking about the recent ticket on http/1 stalls at the front edge)
[20:41:29] bblack: yes
[20:41:35] Has that gone now?
[20:41:52] I know there was talk of it going at end of last week
[20:42:07] yes, I just confirmed, our current operational state has both haproxy and ats-tls termination nodes, but no envoy nodes
[20:42:14] Nice!
[20:43:13] that doesn't necessarily or formally mean anything else beyond that, just a description of current operational state :) I don't mean to pre-empt other decisions/announcements or indicate that we're never turning envoy back on, etc.
[22:24:47] serviceops, MediaWiki-extensions-PropertySuggester, Wikidata, wdwb-tech, Service-deployment-requests: New Service Request SchemaTree - https://phabricator.wikimedia.org/T301471 (Michaelcochez) We'd be happy to receive a patch or pull request.