[01:16:57] serviceops, MW-on-K8s: Allow members of restricted to run maintenance scripts - https://phabricator.wikimedia.org/T378429#10502746 (RLazarus) a:RLazarus→JMeybohm Assigning as discussed in the serviceops meeting. Thanks!
[08:42:15] serviceops, sre-alert-triage: Alert in need of triage: SystemdUnitFailed (instance cumin1002:9100) - https://phabricator.wikimedia.org/T384999 (LSobanski) NEW
[08:47:03] serviceops, sre-alert-triage: Alert in need of triage: SystemdUnitFailed (instance cumin1002:9100) - https://phabricator.wikimedia.org/T384999#10503094 (JMeybohm)
[08:47:06] serviceops, Abstract Wikipedia team (25Q3 (Jan–Mar)): wikifunction httpbb tests fail because of title case issue - https://phabricator.wikimedia.org/T383032#10503097 (JMeybohm) →Duplicate dup:T384999
[10:06:12] serviceops, Tool-schedule-deployment: Extend functionality to support MediaWiki infrastructure Windows and related repos - https://phabricator.wikimedia.org/T385007 (jijiki) NEW
[10:06:25] serviceops, Tool-schedule-deployment: Extend functionality to support MediaWiki infrastructure Windows and related repos - https://phabricator.wikimedia.org/T385007#10503290 (jijiki)
[10:08:00] serviceops, Tool-schedule-deployment: Extend functionality to support MediaWiki infrastructure Windows and related repos - https://phabricator.wikimedia.org/T385007#10503313 (jijiki)
[10:51:35] serviceops, Patch-For-Review: Migrate production Shellbox variants to PHP 8.1 - https://phabricator.wikimedia.org/T377038#10503553 (jijiki)
[10:52:12] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2242-2329 - https://phabricator.wikimedia.org/T384970#10503555 (Clement_Goubert)
[10:53:03] serviceops: Mediawiki maint scripts using service proxied by the tls proxy might fail when running with mwscript-k8s - https://phabricator.wikimedia.org/T382398#10503560 (brouberol) We (Data Platform SRE) encountered the same issue when attempting to deploy an envoy sidecar alongside each airflow task contai...
[10:53:05] serviceops, Patch-For-Review: Migrate production Shellbox variants to PHP 8.1 - https://phabricator.wikimedia.org/T377038#10503561 (Lucas_Werkmeister_WMDE)
[10:54:51] serviceops, Patch-For-Review: Migrate production Shellbox variants to PHP 8.1 - https://phabricator.wikimedia.org/T377038#10503565 (jijiki)
[10:57:56] serviceops, MW-on-K8s: Identify low-criticality maintenance job to move to mwcron - https://phabricator.wikimedia.org/T377963#10503573 (BTullis)
[11:21:00] serviceops: Mediawiki maint scripts using service proxied by the tls proxy might fail when running with mwscript-k8s - https://phabricator.wikimedia.org/T382398#10503640 (dcausse)
[11:21:01] serviceops, MW-on-K8s: Allow running one-off scripts manually - https://phabricator.wikimedia.org/T341553#10503641 (dcausse)
[11:31:35] serviceops: Mediawiki maint scripts using service proxied by the tls proxy might fail when running with mwscript-k8s - https://phabricator.wikimedia.org/T382398#10503681 (BTullis) >>! In T382398#10503560, @brouberol wrote: > I think that barring deploying envoy as a full-fledged [sidecar](https://kubernetes....
[11:42:42] serviceops, Patch-For-Review: Migrate production Shellbox variants to PHP 8.1 - https://phabricator.wikimedia.org/T377038#10503730 (jijiki) **shellbox-constraints:** Fully migrated, nothing standing out so far **shellbox-video: ** 50% migrated, we are standing by for potential transcoding issues. In ca...
[12:11:37] serviceops, Citoid, Editing-team, RESTBase Sunsetting, and 2 others: Switchover plan from restbase to api gateway for Citoid - https://phabricator.wikimedia.org/T361576#10503792 (Mvolz) >>! In T361576#10488265, @gerritbot wrote: > Change #1113458 **merged** by jenkins-bot: > %%%[mediawiki/service...
[12:14:49] serviceops, Release-Engineering-Team, Scap, Patch-For-Review: Retire use of scap proxies - https://phabricator.wikimedia.org/T384196#10503795 (hnowlan)
[13:10:10] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2242-2329 - https://phabricator.wikimedia.org/T384970#10503959 (Clement_Goubert) Based on the calculation in my [[ https://docs.google.com/spreadsheets/d/18BokLsimZj-7XdQfTGLIP__11aDIJnbL0cqBNdLRXuY/edit?usp=sharing | balancing sheet...
[13:11:43] serviceops, sre-alert-triage: Alert in need of triage: SystemdUnitFailed (instance cumin1002:9100) - https://phabricator.wikimedia.org/T384999#10503966 (Clement_Goubert) →Duplicate dup:T383032
[13:11:45] serviceops, Abstract Wikipedia team (25Q3 (Jan–Mar)): wikifunction httpbb tests fail because of title case issue - https://phabricator.wikimedia.org/T383032#10503968 (Clement_Goubert)
[13:12:18] serviceops, Abstract Wikipedia team (25Q3 (Jan–Mar)): wikifunction httpbb tests fail because of title case issue - https://phabricator.wikimedia.org/T383032#10503973 (Clement_Goubert) Duplicate→Open
[13:13:26] serviceops, sre-alert-triage: Alert in need of triage: SystemdUnitFailed (instance cumin1002:9100) - https://phabricator.wikimedia.org/T384999#10503978 (Clement_Goubert) Doing the dupe the other way around as T383032 for #abstract_wikipedia_team has been triaged by them already.
[13:47:49] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2242-2329 - https://phabricator.wikimedia.org/T384970#10504173 (RobH)
[13:48:27] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2242-2329 - https://phabricator.wikimedia.org/T384970#10504176 (RobH) Copying over the explanation of hostname breakdown from the purchasing task. >>! In T382899#10503772, @Clement_Goubert wrote: > Updated list of hostnames because...
[13:49:02] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2242-2329 - https://phabricator.wikimedia.org/T384970#10504188 (RobH)
[15:02:25] serviceops, Abstract Wikipedia team (25Q3 (Jan–Mar)): wikifunction httpbb tests fail because of title case issue - https://phabricator.wikimedia.org/T383032#10504786 (Clement_Goubert) I'm going to silence this alert for two weeks, please update me if it is fixed before then so I can remove the downtime.
[15:03:15] serviceops, collaboration-services, Data-Persistence, DC-Ops, and 2 others: Tracking List: Relocating servers to free up 10G switch space in codfw - https://phabricator.wikimedia.org/T383709#10504805 (ops-monitoring-bot) depool host wikikube-worker[2095,2175,2186].codfw.wmnet by jayme@cumin1002 w...
[15:05:47] serviceops, collaboration-services, Data-Persistence, DC-Ops, and 2 others: Tracking List: Relocating servers to free up 10G switch space in codfw - https://phabricator.wikimedia.org/T383709#10504823 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jayme@cumin1002 depoo...
[15:07:39] serviceops, collaboration-services, Data-Persistence, DC-Ops, and 2 others: Tracking List: Relocating servers to free up 10G switch space in codfw - https://phabricator.wikimedia.org/T383709#10504828 (JMeybohm) @Jhancock.wm wikikube-worker[2095,2175,2186].codfw.wmnet have been shut down, lmk when...
[15:09:07] serviceops, Observability-Metrics, SRE Observability (FY2024/2025-Q4): Repeated library panels in Grafana showing only after refresh, not on first load - https://phabricator.wikimedia.org/T384831#10504837 (lmata)
[15:35:03] serviceops, MW-on-K8s, Patch-For-Review: Allow members of restricted to run maintenance scripts - https://phabricator.wikimedia.org/T378429#10505008 (JMeybohm) We decided to go with option 3 as it seems the most intuitive and does not impose additional maintenance burden. There's a new sudoers rule c...
[15:43:26] hi team, one of the side effects of migrating to liberica is that load balancers will use IPv6 instead of IPv4 to talk to etcd servers, do you have any concerns with that small change in behavior?
[15:47:17] not sure if all the conf* hosts have IPv6 records. I think so, better double check. I've double checked that nginx listens on both IPv4+IPv6 on port 4001. As well as that the firewall allows 4001 $DOMAIN_NETWORKS
[15:47:30] so, it's probably going to not be even noticed
[15:48:28] conf hosts have AAAA records
[15:48:46] cool then.
[15:48:57] if the process listen to it
[15:49:01] I didn't check
[15:49:30] I see the nginx proxy listen [::]:4001
[15:56:45] volans: backlog :P
[15:57:59] lol, I have to admit I stopped at "not sure if all the conf* hosts have IPv6 records"
[15:58:04] my bad
[15:58:07] :)
[16:47:19] serviceops, MW-on-K8s, SRE, Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10505334 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by hnowlan@cumin2002 from mw2410 to wikikube-worker2242 completed: - mw2410 (**PASS**) - ✔️ Down...
[16:49:03] serviceops, MW-on-K8s, SRE, Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10505337 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by hnowlan@cumin2002 for host wikikube-worker2242.codfw.wmnet with OS bookworm
[16:52:09] serviceops, collaboration-services, Data-Persistence, DC-Ops, and 2 others: Tracking List: Relocating servers to free up 10G switch space in codfw - https://phabricator.wikimedia.org/T383709#10505352 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=c24ad8f7-3e57-4f83-8a1f-c507313...
[16:57:05] serviceops, MW-on-K8s, SRE, Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10505376 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by hnowlan@cumin2002 from mw2411 to wikikube-worker2243 completed: - mw2411 (**PASS**) - ✔️ Down...
[16:58:45] serviceops, MW-on-K8s, SRE, Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10505382 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by hnowlan@cumin2002 for host wikikube-worker2243.codfw.wmnet with OS bookworm
[17:35:44] serviceops, collaboration-services, Data-Persistence, DC-Ops, and 2 others: Tracking List: Relocating servers to free up 10G switch space in codfw - https://phabricator.wikimedia.org/T383709#10505596 (Jhancock.wm)
[17:37:37] serviceops, MW-on-K8s, SRE, Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10505607 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by hnowlan@cumin2002 for host wikikube-worker2242.codfw.wmnet with OS bookworm completed: - wikik...
[17:46:12] serviceops, MW-on-K8s, SRE, Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10505639 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by hnowlan@cumin2002 for host wikikube-worker2243.codfw.wmnet with OS bookworm completed: - wikik...
[17:50:58] serviceops, MW-on-K8s, SRE, Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10505666 (ops-monitoring-bot) pool host wikikube-worker[2242-2243].codfw.wmnet by hnowlan@cumin2002 with reason: None
[17:51:02] serviceops, MW-on-K8s, SRE, Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10505667 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by hnowlan@cumin2002 pool for host wikikube-worker[2242-2243].codfw.wmnet completed: - wik...
[17:51:04] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T385078 (hnowlan) NEW
[18:10:53] serviceops, MW-on-K8s, SRE, Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10505737 (hnowlan)
[20:58:49] serviceops, Patch-For-Review: Build php-uuid package, and add to WMF production and CI - https://phabricator.wikimedia.org/T373752#10506333 (Jdforrester-WMF) @Scott_French: Any update on getting the PHP 7.4 php-uuid package pushed so we can resolve this?
[22:08:21] serviceops, Release-Engineering-Team, Scap, Patch-For-Review: Retire use of scap proxies - https://phabricator.wikimedia.org/T384196#10506629 (Reedy) ` [21:55:11] FYI this scap run did print an error: [21:55:12] 21:52:27 sudo -u mwdeploy -n -- /usr/bin/rsync -l deploym...
[22:55:29] serviceops, Wikimedia-Apache-configuration, Patch-For-Review: Investigate restricting match pattern on /wiki RewriteRule - https://phabricator.wikimedia.org/T357595#10506787 (RLazarus) Open→Resolved