[09:02:06] FYI, I'm temporarily switching kubestagemaster2005 to DRBD to move it around, the underlying Ganeti node is being updated to Bookworm
[09:21:41] and completed
[09:22:08] (completed as in back to plain disks)
[09:24:40] serviceops: operations/docker-images/production-images contains references to non-existent image python3 - https://phabricator.wikimedia.org/T336682#10475380 (Joe) Open→Resolved >>! In T336682#10470306, @elukey wrote: > I think this task is completed :) Indeed.
[10:25:38] hi i'm looking for a review of https://gitlab.wikimedia.org/repos/sre/hiddenparma/-/merge_requests/32 :P
[11:03:23] serviceops, Scap: Retire use of scap proxies - https://phabricator.wikimedia.org/T384196 (hnowlan) NEW
[11:56:32] serviceops, Scap, Patch-For-Review: Retire use of scap proxies - https://phabricator.wikimedia.org/T384196#10475851 (hnowlan)
[13:07:44] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476009 (ops-monitoring-bot) depool host mw[1464-1469].eqiad.wmnet by kamila@cumin1002 with reason: Renaming nodes
[13:11:18] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476013 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by kamila@cumin1002 depool for host mw[1464-1469].eqiad.wmnet complet...
[13:13:03] taavi: yay! I'll approve it once I check I'm allowed to :D
[13:23:19] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476046 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw1464 to wikikube-worker1117 completed: - mw1464 (*...
[13:28:22] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476057 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw1465 to wikikube-worker1118 completed: - mw1465 (*...
[13:32:33] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476084 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw1466 to wikikube-worker1119 completed: - mw1466 (*...
[13:38:37] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476092 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw1467 to wikikube-worker1120 completed: - mw1467 (*...
[13:42:12] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476103 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw1468 to wikikube-worker1121 completed: - mw1468 (*...
[13:47:54] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476107 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw1469 to wikikube-worker1122 completed: - mw1469 (*...
[13:50:07] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476126 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1117.eqiad.wmnet with OS boo...
[13:50:20] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476130 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1118.eqiad.wmnet with OS boo...
[13:50:22] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476132 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1119.eqiad.wmnet with OS boo...
[13:50:27] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476133 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1120.eqiad.wmnet with OS boo...
[13:50:33] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476134 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1121.eqiad.wmnet with OS boo...
[13:50:39] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476136 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1122.eqiad.wmnet with OS boo...
[14:20:24] serviceops, docker-pkg, Release-Engineering-Team: Attach opencontainers image metadata to docker images - https://phabricator.wikimedia.org/T345070#10476287 (elukey) @dduvall Hi! Thanks a lot for the long explanation, I am trying to get back to this task looking for anything actionable. IIUC Gitlab a...
[14:27:46] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476369 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1117.eqiad.wmnet with OS bookwor...
[14:30:52] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476402 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1121.eqiad.wmnet with OS bookwor...
[14:33:39] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476419 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1118.eqiad.wmnet with OS bookwor...
[14:40:23] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476469 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1122.eqiad.wmnet with OS bookwor...
[14:41:13] Hey guys... Joe Biden here. I've decided to step down from the White House to focus on other projects. Billionaires are a threat to democracy, so check out https://BidenCash.st to put them in the bullseye. Keep an eye on the CNN inauguration for a promo code!
[14:42:46] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476504 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1119.eqiad.wmnet with OS bookwor...
[14:43:08] good advice
[14:46:03] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10476533 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1120.eqiad.wmnet with OS bookwor...
[14:55:11] hey folks!
[14:55:22] https://alerts.wikimedia.org/?q=%40state%3Dactive&q=%40cluster%3Dwikimedia.org&q=alertname%3DPuppetPendingCertificateRequest - I see some Puppet CA certificate for mw nodes pending
[14:55:32] that is a little strange, do you know why?
[14:55:46] serviceops, DC-Ops, ops-eqiad, Prod-Kubernetes, and 2 others: Relabel eqiad kubernetes nodes - https://phabricator.wikimedia.org/T383620#10476595 (kamila)
[14:55:52] kamila_: ^^
[14:56:01] I'd assume fail-after-rename errors
[14:56:34] https://phabricator.wikimedia.org/T383211
[14:58:34] hm, those are old names
[14:59:46] I did not check the cookbook logs, but maybe reimaging those failed right after rename because they picked up the old name again
[15:02:24] I believe those hosts don't exist, what would I need to do to get puppet to agree with me? just cleanup the nodes?
[15:04:39] elukey: has taken care of the certificate requests last time we had this, he can probably share commands to run
[15:06:19] (btw, they are indeed all ~recently renamed hosts)
[15:10:16] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: decommission mw2282.codfw.wmnet - https://phabricator.wikimedia.org/T383965#10476703 (Jelto) →Duplicate dup:T384226
[15:12:11] puppet node deactivate $node; puppet node clean $node
[15:18:46] I usually do "puppet cert destroy $node" on puppetmaster1001
[15:23:31] that works too
[15:24:08] depends on if there are still exported resources for those, which happens sometimes, I don't think puppet cert destroy would get rid of those
[15:24:23] ack, thanks!
[15:24:56] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Improve calico-typha firewall rules - https://phabricator.wikimedia.org/T365687#10476790 (JMeybohm) Typha and calico-node are unable to hot reload changed certificates. For calico-node this is not that big of a problem as it reads the cer...
[15:26:16] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Improve calico-typha firewall rules - https://phabricator.wikimedia.org/T365687#10476795 (JMeybohm)
[15:26:56] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Improve calico-typha firewall rules - https://phabricator.wikimedia.org/T365687#10476809 (JMeybohm) a:JMeybohm→None
[15:29:55] ah no wait I am stupid
[15:30:10] the alert is for puppetSERVER, I thought it was related to the old puppet 5 CA
[15:30:25] I got fooled while going into a meeting kamila_
[15:30:29] np, I figured
[15:30:33] already done
[15:31:10] what command did you run?
[15:31:26] because on puppetserver I use "puppetserver ca clean --certname $node"
[15:31:34] not the ones listed above
[15:32:20] https://doc.wikimedia.org/spicerack/master/api/spicerack.puppet.html#spicerack.puppet.PuppetServer.delete :-P
[15:33:04] that too
[15:35:19] elukey: I just did the node deactivate and clean that c.laime suggested
[15:35:19] the alerts are now cleared so we should be good
[15:35:36] seemed like the thing to do given the hosts are gone (under those names)
[15:35:46] okok perfect, good to know! Never used those on puppetserver
[15:35:48] if that breaks something else, it should
[15:36:05] thanks!
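The cleanup commands quoted in the exchange above can be collected into a small dry-run helper. This is only a sketch: it prints the command lines from the chat and executes nothing. The variant labels ("node", "puppet5-ca", "puppetserver-ca"), the function name and the example hostname are made up for illustration, and which variant applies depends on which CA raised the alert.

```python
import shlex

# Command templates copied from the chat. The dictionary keys are labels
# invented for this sketch, not Puppet terminology.
CLEANUP_TEMPLATES = {
    "node": [
        ["puppet", "node", "deactivate", "{node}"],
        ["puppet", "node", "clean", "{node}"],
    ],
    "puppet5-ca": [["puppet", "cert", "destroy", "{node}"]],
    "puppetserver-ca": [["puppetserver", "ca", "clean", "--certname", "{node}"]],
}


def cleanup_commands(node, variant):
    """Return shell-quoted command lines for one host; nothing is executed."""
    if variant not in CLEANUP_TEMPLATES:
        raise ValueError(f"unknown variant: {variant!r}")
    return [
        shlex.join(arg.format(node=node) for arg in argv)
        for argv in CLEANUP_TEMPLATES[variant]
    ]


if __name__ == "__main__":
    # Example hostname only; substitute the stale certname from the alert.
    for line in cleanup_commands("mw1464.eqiad.wmnet", "puppetserver-ca"):
        print(line)
```

Printing rather than running keeps the choice of host to run on (for example puppetmaster1001 for the old Puppet 5 CA, as mentioned above) with the operator.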
[15:36:29] thank you for pointing out the alerts :-)
[15:36:44] will keep an eye out with the rest of my renames and see if I can figure out what's causing them
[15:48:00] that would be useful indeed
[15:48:06] we could also look at the logs if needed
[15:48:44] serviceops, MW-on-K8s, Observability-Logging: Unexpected utilization increase in udp_localhost-info kafka-logging topic - https://phabricator.wikimedia.org/T384233 (fgiunchedi) NEW
[15:48:46] were they re-added to debmonitor too?
[15:49:06] I haven't checked
[15:58:27] serviceops, API Platform, MediaWiki-extensions-ReadingLists, MW-Interfaces-Team, and 2 others: Reading List REST Interface: reroute calls - https://phabricator.wikimedia.org/T348493#10476955 (hnowlan) Let us know whenever this work is okay to proceed. To proceed on the infra end of things we'll n...
[16:21:03] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10477077 (ops-monitoring-bot) pool host wikikube-worker[1117-1122].eqiad.wmnet by kamila@cumin1002 with reason: None
[16:21:07] serviceops, collaboration-services, Prod-Kubernetes, Kubernetes: Migrate wikikube-eqiad to containerd - https://phabricator.wikimedia.org/T377876#10477078 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by kamila@cumin1002 pool for host wikikube-worker[1117-1122].eqiad.wm...
[16:27:28] serviceops, MW-on-K8s, Observability-Logging: Unexpected utilization increase in udp_localhost-info kafka-logging topic - https://phabricator.wikimedia.org/T384233#10477137 (Clement_Goubert) I enabled logging for `mw-jobrunner` through rsyslog on the 13th https://gerrit.wikimedia.org/r/c/operations/d...
[16:30:50] serviceops, MW-on-K8s, Observability-Logging: Unexpected utilization increase in udp_localhost-info kafka-logging topic - https://phabricator.wikimedia.org/T384233#10477165 (Clement_Goubert) rsyslog container was added to mercurius on the 7th https://gerrit.wikimedia.org/r/c/operations/deployment-cha...
[19:51:43] serviceops, DC-Ops, decommission-hardware, ops-eqiad, SRE: decommission mw[1349-1413] - https://phabricator.wikimedia.org/T375842#10477776 (VRiley-WMF)
[20:35:17] serviceops, DC-Ops, decommission-hardware, ops-eqiad, SRE: decommission mw[1349-1413] - https://phabricator.wikimedia.org/T375842#10477840 (VRiley-WMF)
[21:12:50] serviceops, API Platform, MediaWiki-extensions-ReadingLists, MW-Interfaces-Team, and 2 others: Reading List REST Interface: reroute calls - https://phabricator.wikimedia.org/T348493#10477874 (BPirkle) Thanks @hnowlan . I started working on a task with the necessary details last week, but haven't...