[08:16:03] serviceops, MW-on-K8s, Observability-Metrics: Move mw-accesslog-metrics instance to k8s - https://phabricator.wikimedia.org/T371214 (fgiunchedi) NEW
[08:30:25] serviceops, Infrastructure-Foundations, Data-Platform-SRE (2024.07.08 - 2024.07.28), Patch-For-Review: Create a helm chart for the cloudnativepg postgresql operator - https://phabricator.wikimedia.org/T364797#10021741 (brouberol) a:brouberol
[09:06:35] hi, as a heads up, likely this week we'll be switching k8s logging to dedicated kafka topics; for all intents and purposes this change is transparent "end to end", nothing changes
[09:06:39] https://phabricator.wikimedia.org/T366710
[09:06:41] cc claime ^
[09:09:41] ack
[10:47:28] serviceops, MW-on-K8s, SRE, Traffic, and 2 others: Spin down api_appserver and appserver clusters - https://phabricator.wikimedia.org/T367949#10022456 (Volans)
[11:04:52] serviceops, MW-on-K8s, SRE, Traffic, and 2 others: Spin down api_appserver and appserver clusters - https://phabricator.wikimedia.org/T367949#10022490 (Clement_Goubert)
[11:12:42] serviceops, WMDE-TechWish-Maintenance, Epic, Maps (Kartotherian), Patch-For-Review: Move Kartotherian to Kubernetes - https://phabricator.wikimedia.org/T216826#10022498 (Jgiannelos) @elukey Just a heads up, the maps node run 3 more things other than kartotherian that are essential for maps: *...
[11:32:57] serviceops: deploy1003 implementation tracking - https://phabricator.wikimedia.org/T364417#10022531 (akosiaris) Open→Resolved
[11:34:33] serviceops, Infrastructure-Foundations, Data-Platform-SRE (2024.07.08 - 2024.07.28), Patch-For-Review: Create a helm chart for the cloudnativepg postgresql operator - https://phabricator.wikimedia.org/T364797#10022548 (akosiaris) No disagreement on my side, with a cursory reading, I am reaching t...
[12:08:06] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#10022646 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw2441 to wikikube-worker2039 completed: - mw2441 (**PASS...
[12:08:44] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#10022647 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker2039.codfw.wmnet with OS bullseye
[12:53:52] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#10022771 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker2039.codfw.wmnet with OS bullseye completed: - wikikube-wor...
[14:01:36] serviceops, MediaWiki-Core-Profiler, Documentation, Patch-For-Review: Tideways_xhprof has been archived, migrate everything to xhprof - https://phabricator.wikimedia.org/T348379#10023050 (Jdforrester-WMF)
[14:02:00] serviceops, DC-Ops, ops-eqiad, SRE: Q1:rack/setup/install wikikube-worker1240 to wikikube-worker1304 - https://phabricator.wikimedia.org/T369743#10023061 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host wikikube-worker1240.eqiad.wmnet with OS bull...
[14:09:30] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T371260 (Clement_Goubert) NEW
[14:09:35] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T371260#10023126 (Clement_Goubert) p:Triage→Low
[14:12:33] serviceops, SRE, Patch-For-Review: mw2420-mw2451 do have unnecessary raid controllers (configured) - https://phabricator.wikimedia.org/T358489#10023129 (Clement_Goubert) p:Triage→Low
[14:22:24] serviceops, decommission-hardware: decommission mw226[1-2].codfw.wmnet mw22[68-77].codfw.wmnet - https://phabricator.wikimedia.org/T371262 (Clement_Goubert) NEW
[14:24:44] serviceops, decommission-hardware: decommission mw226[1-2].codfw.wmnet mw22[68-77].codfw.wmnet - https://phabricator.wikimedia.org/T371262#10023212 (Clement_Goubert)
[14:34:47] serviceops, MW-on-K8s, Observability-Metrics: Move mw-accesslog-metrics instance to k8s - https://phabricator.wikimedia.org/T371214#10023274 (kamila) a:kamila
[14:55:56] serviceops, Data-Platform-SRE (2024.07.08 - 2024.07.28), Patch-For-Review: Create a helm chart for the cloudnativepg postgresql operator - https://phabricator.wikimedia.org/T364797#10023436 (joanna_borun)
[15:04:51] serviceops, Data Products, Data-Platform-SRE, Dumps-Generation, and 2 others: Migrate current-generation dumps to run from our containerized images - https://phabricator.wikimedia.org/T352650#10023460 (Milimetric) Just for the record, we met and discussed @Joe's proposal (this task's description)...
[15:18:39] serviceops, DC-Ops, ops-eqiad, SRE: Q1:rack/setup/install wikikube-worker1240 to wikikube-worker1304 - https://phabricator.wikimedia.org/T369743#10023511 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for host wikikube-worker1240.eqiad.wmnet with OS bullseye...
[15:25:01] serviceops, Datacenter-Switchover: Verify our current wikikube capacity (in both DCs) can handle all our traffic - https://phabricator.wikimedia.org/T371273 (jijiki) NEW
[15:26:14] serviceops, collaboration-services, Infrastructure-Foundations, GitLab (Integrations), and 2 others: Container image reports in debmonitor are broken - https://phabricator.wikimedia.org/T348876#10023573 (elukey) To keep archives happy - the code is fully deployed on build2001, and I have raised t...
[15:27:08] serviceops, MW-on-K8s, SRE, Traffic, and 2 others: Spin down api_appserver and appserver clusters - https://phabricator.wikimedia.org/T367949#10023539 (Clement_Goubert) In progress→Resolved
[15:42:06] serviceops, SRE, Patch-For-Review: mw2420-mw2451 do have unnecessary raid controllers (configured) - https://phabricator.wikimedia.org/T358489#10023645 (Clement_Goubert)
[16:15:52] serviceops, MW-on-K8s, wikitech.wikimedia.org: Migrate Wikitech to Kubernetes - https://phabricator.wikimedia.org/T292707#10023781 (jijiki)
[16:45:31] serviceops, Scap: Reimage deploy2002 as bullseye - https://phabricator.wikimedia.org/T371282 (akosiaris) NEW
[16:47:34] serviceops, decommission-hardware: decommission deploy1002 - https://phabricator.wikimedia.org/T371283#10024023 (akosiaris)
[17:14:06] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536#10024160 (jijiki)
[17:14:07] serviceops, MW-on-K8s, SRE, Patch-For-Review: Migrate MW appservers' base images to bullseye - https://phabricator.wikimedia.org/T356293#10024161 (jijiki)
[17:17:58] serviceops, Citoid, VisualEditor, VisualEditor-MediaWiki-References, and 2 others: Register Citoid as a "friendly bot" (or alternatively verified bot) with Cloudflare - https://phabricator.wikimedia.org/T370118#10024166 (ppelberg) a:dchan
[17:36:42] hello, heads up I put https://gerrit.wikimedia.org/r/c/operations/puppet/+/1052791: mediawiki.org - Rewrite /beacon/event -> EventLogging rest handler on the mw infra UTC late deployment cal for tomorrow july 30. https://wikitech.wikimedia.org/wiki/Deployments#Tuesday,_July_30
[17:36:51] I should be able to do this myself, but I will ping here for a heads up.
[17:37:43] If there are no requests for the prior puppet request window, I may do this before that window, so I can make some meetings.
[17:37:48] will ping here before I do.
[18:05:40] serviceops, Charts, Shellbox, SRE: Figure out how a shellbox instance for the Chart extension would work - https://phabricator.wikimedia.org/T370739#10024506 (akosiaris) >>! In T370739#10019839, @Catrope wrote: > @akosiaris I'm trying to figure out how we should proceed based on your comment. Y...
[18:08:14] serviceops, Shellbox, SRE, Charts (Sprint 3): Figure out how a shellbox instance for the Chart extension would work - https://phabricator.wikimedia.org/T370739#10024527 (LGoto)
[18:09:47] serviceops, Shellbox, SRE, Charts (Sprint 3): Figure out how a shellbox instance for the Chart extension would work - https://phabricator.wikimedia.org/T370739#10024525 (LGoto) p:Triage→High
[19:28:39] serviceops, Shellbox, SRE, Charts (Sprint 3): Figure out how a shellbox instance for the Chart extension would work - https://phabricator.wikimedia.org/T370739#10024998 (CDanis) >>! In T370739#10024506, @akosiaris wrote: > Rate limiting is broken in service-runner for a long time now. See T200374...