[08:08:54] hi folks!
[08:09:15] I opened https://phabricator.wikimedia.org/T302701 to (re)discuss the ip subnets allocated for ml-serve clusters
[08:09:31] after discovering a nice property of Knative (namely that it allocates a lot of svc ips)
[08:09:54] let me know what you think later on (no rush)
[10:34:25] serviceops, Prod-Kubernetes, Kubernetes: setup/install kubernetes10[18-22] - https://phabricator.wikimedia.org/T293728 (elukey)
[10:36:40] I have also created a patch for netboot.cfg to reimage the new k8s worker nodes with bullseye+overlay
[10:36:49] lemme know if it makes sense, I can take care of the reimages
[11:47:04] elukey: you know where kubernetes1022 came from? https://phabricator.wikimedia.org/T290202 just has 1018-1021
[12:05:32] jayme: https://phabricator.wikimedia.org/T294301
[12:06:19] taavi: :D thanks
[12:55:59] jayme: I was puzzled too, codfw had more workers and it seemed weird, then I found the single-node task
[13:38:12] jayme: ok if I start reimaging the new k8s nodes?
[13:38:17] starting from 2018
[13:40:18] (that was racked with https://phabricator.wikimedia.org/T294299)
[13:41:45] serviceops, Prod-Kubernetes, Patch-For-Review: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (elukey)
[13:50:35] serviceops, Prod-Kubernetes, Patch-For-Review: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by elukey@cumin1001 for host kubernetes2018.codfw.wmnet with OS bullseye
[13:50:57] (started)
[14:08:09] elukey: sorry, was out for lunch and sun. No objections regarding reimage!
[14:13:38] perfect, I'll reimage all the nodes then
[14:13:44] afterwards we'll decide what to do
[14:18:39] serviceops, Prod-Kubernetes: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin1001 for host kubernetes2018.codfw.wmnet with OS bullseye completed: - kubernetes2018 (**PASS**) - Downtimed...
[14:20:38] serviceops, Prod-Kubernetes: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by elukey@cumin1001 for host kubernetes2019.codfw.wmnet with OS bullseye
[14:23:33] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Implement POC for istio ingress - https://phabricator.wikimedia.org/T290966 (JMeybohm)
[14:45:32] serviceops, Add-Link, Growth-Team, Prod-Kubernetes, Kubernetes: Use ingress for linkrecommendation - https://phabricator.wikimedia.org/T302717 (JMeybohm)
[14:48:33] serviceops, Prod-Kubernetes: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin1001 for host kubernetes2019.codfw.wmnet with OS bullseye completed: - kubernetes2019 (**PASS**) - Downtimed...
[14:50:13] serviceops, Prod-Kubernetes: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by elukey@cumin1001 for host kubernetes2020.codfw.wmnet with OS bullseye
[14:51:51] serviceops, Add-Link, Growth-Team, Prod-Kubernetes, Kubernetes: Use ingress for linkrecommendation - https://phabricator.wikimedia.org/T302717 (kostajh) @JMeybohm is this something you'd like the #growth-team to work on, or is it something that #serviceops will do?
[14:54:12] serviceops, Add-Link, Growth-Team, Prod-Kubernetes, Kubernetes: Use ingress for linkrecommendation - https://phabricator.wikimedia.org/T302717 (JMeybohm) >>! In T302717#7741579, @kostajh wrote: > @JMeybohm is this something you'd like the #growth-team to work on, or is it something that #serv...
[15:03:27] serviceops, Add-Link, Growth-Team: MediaWiki should use service-proxy to connect to Add Link / Linkrecommendation - https://phabricator.wikimedia.org/T302719 (JMeybohm)
[15:04:09] serviceops, Add-Link, Growth-Team, Prod-Kubernetes, Kubernetes: Use ingress for linkrecommendation - https://phabricator.wikimedia.org/T302717 (JMeybohm)
[15:04:15] serviceops, Add-Link, Growth-Team: MediaWiki should use service-proxy to connect to Add Link / Linkrecommendation - https://phabricator.wikimedia.org/T302719 (JMeybohm)
[15:05:49] serviceops, Add-Link, Growth-Team: MediaWiki should use service-proxy to connect to Add Link / Linkrecommendation - https://phabricator.wikimedia.org/T302719 (JMeybohm)
[15:18:29] serviceops, Prod-Kubernetes: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin1001 for host kubernetes2020.codfw.wmnet with OS bullseye completed: - kubernetes2020 (**FAIL**) - Downtimed...
[15:18:31] serviceops, Prod-Kubernetes: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin1001 for host kubernetes2020.codfw.wmnet with OS bullseye executed with errors: - kubernetes2020 (**FAIL**)...
[15:22:24] serviceops, Add-Link, Growth-Team, Patch-For-Review: MediaWiki should use service-proxy to connect to Add Link / Linkrecommendation - https://phabricator.wikimedia.org/T302719 (JMeybohm) a:JMeybohm
[15:22:30] serviceops, Add-Link, Growth-Team, Prod-Kubernetes, Kubernetes: Use ingress for linkrecommendation - https://phabricator.wikimedia.org/T302717 (JMeybohm) a:JMeybohm
[15:50:30] serviceops, Prod-Kubernetes: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by elukey@cumin1001 for host kubernetes2021.codfw.wmnet with OS bullseye
[16:17:59] serviceops, Prod-Kubernetes, Patch-For-Review: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin1001 for host kubernetes2021.codfw.wmnet with OS bullseye completed: - kubernetes2021 (*...
[16:54:10] serviceops, Prod-Kubernetes, Patch-For-Review: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by elukey@cumin1001 for host kubernetes2022.codfw.wmnet with OS bullseye
[17:21:53] serviceops, Prod-Kubernetes, Patch-For-Review: setup/install kubernetes20[1(89)|2(012)] - https://phabricator.wikimedia.org/T302208 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin1001 for host kubernetes2022.codfw.wmnet with OS bullseye completed: - kubernetes2022 (*...
[18:02:26] serviceops, Release-Engineering-Team, Scap: Deploy Scap version 4.4.0 - https://phabricator.wikimedia.org/T302464 (dancy) Open→Stalled Holding for updates.
[18:02:29] serviceops, MW-on-K8s, Patch-For-Review, Release-Engineering-Team (Done by Feb 23🔥): Build MediaWiki images for kubernetes on the deployment servers - https://phabricator.wikimedia.org/T297673 (dancy)
[18:12:51] serviceops, Phabricator, Release-Engineering-Team (Next): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Dzahn)
[18:34:00] serviceops, Phabricator, Release-Engineering-Team (Next): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Dzahn) @Galahad Hi! I have found https://phabricator.wikimedia.org/source/wikisp-mw-config/ which you use for 'Configuration of production Medi...
[18:36:04] serviceops, Phabricator, Release-Engineering-Team (Next): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Dzahn) >>! In T296022#7742329, @Stashbot wrote: > {nav icon=file, name=Mentioned in SAL (#wikimedia-operations), href=https://sal.toolforge.org...
[18:36:38] serviceops, Phabricator, Release-Engineering-Team (Next): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Galahad) >>! In T296022#7742362, @Dzahn wrote: > @Galahad Hi! I have found https://phabricator.wikimedia.org/source/wikisp-mw-config/ which you...
[18:41:24] serviceops, MW-on-K8s, Performance-Team, SRE, WikimediaDebug: Ensure WikimediaDebug "log" and "profile" features work with k8s-mwdebug - https://phabricator.wikimedia.org/T288164 (Krinkle) For the record, the logs from k8s-mwdebug pods do show up in Logstash but not on the `mwdebug` dashboard...
[18:42:05] serviceops, MW-on-K8s, SRE, SRE Observability, Patch-For-Review: Make logging work for mediawiki in k8s - https://phabricator.wikimedia.org/T288851 (Krinkle) >>! In T288164#7742387, @Krinkle wrote: > For the record, the logs from k8s-mwdebug pods do show up in Logstash but not on the mwdebug...
[19:00:46] serviceops, Phabricator, Release-Engineering-Team (Next): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Dzahn) Hi @Galahad ah, yes, I saw that ticket too but now realizing that's the same user, you :). So.. we can split this into 2 steps. For righ...
[19:04:12] serviceops, Phabricator, Release-Engineering-Team (Next): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Galahad) >>! In T296022#7742410, @Dzahn wrote: > Hi @Galahad ah, yes, I saw that ticket too but now realizing that's the same user, you :). So....
[19:20:33] serviceops, Phabricator, Release-Engineering-Team (Next): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Dzahn) >>! In T296022#7742412, @Galahad wrote: > If you can import the repo right now, I'll be happy about that! I can but .. it would have m...
[19:26:07] serviceops, Release-Engineering-Team, Scap: Deploy Scap version 4.4.1 - https://phabricator.wikimedia.org/T302464 (dancy) Stalled→Open
[19:26:10] serviceops, MW-on-K8s, Patch-For-Review, Release-Engineering-Team (Done by Feb 23🔥): Build MediaWiki images for kubernetes on the deployment servers - https://phabricator.wikimedia.org/T297673 (dancy)
[19:26:20] serviceops, Release-Engineering-Team, Scap: Deploy Scap version 4.4.1 - https://phabricator.wikimedia.org/T302464 (dancy) 4.4.1 ready.
[19:40:28] serviceops, Phabricator, Release-Engineering-Team (Next): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Dzahn) @Galahad Successfully imported under https://gitlab.wikimedia.org/repos/wikisp-infrastructure as https://gitlab.wikimedia.org/repos/wikis...
[20:10:34] serviceops, Add-Link, Growth-Team: Improved alerts/awareness if helm deployment of a service fails - https://phabricator.wikimedia.org/T302744 (kostajh)
[20:34:02] serviceops, Phabricator, Release-Engineering-Team (Next): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Dzahn)
[20:35:21] serviceops, Phabricator, Release-Engineering-Team (Next): Deprecate git-ssh service on phabricator.wikimedia.org - https://phabricator.wikimedia.org/T296022 (Dzahn)
[20:47:32] serviceops, Add-Link, Growth-Team: Improved alerts/awareness if helm deployment of a service fails - https://phabricator.wikimedia.org/T302744 (kostajh) I don't seem to be able to deploy a new version of the service now, on staging I see: ` COMBINED OUTPUT: WARNING: Kubernetes configuration file i...
[21:25:50] serviceops, Add-Link, Growth-Team, Prod-Kubernetes, Kubernetes: Use ingress for linkrecommendation - https://phabricator.wikimedia.org/T302717 (kostajh) >>! In T302717#7741586, @JMeybohm wrote: >>>! In T302717#7741579, @kostajh wrote: >> @JMeybohm is this something you'd like the #growth-team...