[07:29:17] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443593 (10ops-monitoring-bot) depool host wikikube-worker[2014,2017].codfw.wmnet by jelto@cumin1002 with reason: Reimage node to bookworm + containerd [07:33:09] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443595 (10ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 depool for host wikikube-worker[2014,2017].codfw.w... [07:34:57] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443598 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2014.codfw.wmnet with OS book... [07:34:58] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443599 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2017.codfw.wmnet with OS book... [08:15:10] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443647 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2014.codfw.wmnet with OS bookworm... 
[08:17:03] 06serviceops, 06collaboration-services, 06Data-Platform-SRE, 10Prod-Kubernetes, 07Kubernetes: Update Kubernetes clusters to >1.25 - https://phabricator.wikimedia.org/T341984#10443648 (10JMeybohm) [08:21:57] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443698 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2017.codfw.wmnet with OS bookworm... [08:22:19] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443699 (10ops-monitoring-bot) pool host wikikube-worker2017.codfw.wmnet by jelto@cumin1002 with reason: None [08:22:20] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443700 (10ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 pool for host wikikube-worker2017.codfw.wmnet comp... [08:22:23] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443701 (10ops-monitoring-bot) pool host wikikube-worker2014.codfw.wmnet by jelto@cumin1002 with reason: None [08:22:27] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443702 (10ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 pool for host wikikube-worker2014.codfw.wmnet comp... 
[08:24:34] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443711 (10ops-monitoring-bot) depool host wikikube-worker[2011-2013].codfw.wmnet by jelto@cumin1002 with reason: Reimage node to bookworm + containerd [08:26:22] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443713 (10ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 depool for host wikikube-worker[2011-2013].codfw.w... [08:31:32] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443715 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2011.codfw.wmnet with OS book... [08:32:02] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443716 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2013.codfw.wmnet with OS book... [08:32:03] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443717 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2012.codfw.wmnet with OS book... [09:12:28] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443761 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2011.codfw.wmnet with OS bookworm... 
[09:15:33] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443764 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2012.codfw.wmnet with OS bookworm... [09:19:28] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443774 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2013.codfw.wmnet with OS bookworm... [09:19:50] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443775 (10ops-monitoring-bot) pool host wikikube-worker2011.codfw.wmnet by jelto@cumin1002 with reason: None [09:19:55] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443776 (10ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 pool for host wikikube-worker2011.codfw.wmnet comp... [09:19:58] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443777 (10ops-monitoring-bot) pool host wikikube-worker2012.codfw.wmnet by jelto@cumin1002 with reason: None [09:20:01] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443778 (10ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 pool for host wikikube-worker2012.codfw.wmnet comp... 
[09:20:04] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443779 (10ops-monitoring-bot) pool host wikikube-worker2013.codfw.wmnet by jelto@cumin1002 with reason: None [09:20:07] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443780 (10ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 pool for host wikikube-worker2013.codfw.wmnet comp... [10:11:25] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443912 (10ops-monitoring-bot) depool host kubernetes[2053,2056,2058].codfw.wmnet by jelto@cumin1002 with reason: Renaming nodes [10:13:08] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10443917 (10ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 depool for host kubernetes[2... [11:23:23] 06serviceops, 10MW-on-K8s, 06SRE, 13Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10444031 (10hnowlan) [11:24:09] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444032 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by elukey@cumin1002 for host wikikube-worker2022.codfw.wmnet with OS boo... 
[11:30:04] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444053 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin1002 for host wikikube-worker2022.codfw.wmnet with OS bookwor... [12:13:58] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444118 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jelto@cumin1002 from kubernetes2053 to wikikube-worker2192 completed: - ku... [12:25:18] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444130 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jelto@cumin1002 from kubernetes2056 to wikikube-worker2193 completed: - ku... [12:31:03] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444158 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jelto@cumin1002 from kubernetes2058 to wikikube-worker2194 completed: - ku... 
[12:33:01] 06serviceops, 10MW-on-K8s, 10TimedMediaHandler, 07Video: Clean up remaining videoscaling and jobrunner components - https://phabricator.wikimedia.org/T383317 (10hnowlan) 03NEW [12:33:41] 06serviceops, 10MW-on-K8s, 10TimedMediaHandler, 13Patch-For-Review, 07Video: Port videoscaling to kubernetes - https://phabricator.wikimedia.org/T355292#10444173 (10hnowlan) 05Open→03Resolved a:03hnowlan [12:34:13] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444178 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2192.codfw.wmnet with OS book... [12:37:37] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444181 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2193.codfw.wmnet with OS book... [12:40:43] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444192 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2194.codfw.wmnet with OS book... 
[12:41:54] 06serviceops, 10MW-on-K8s, 10TimedMediaHandler, 07Video: Clean up remaining videoscaling and jobrunner components - https://phabricator.wikimedia.org/T383317#10444207 (10MoritzMuehlenhoff) [13:04:10] 06serviceops, 10MW-on-K8s, 06SRE, 13Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10444250 (10akosiaris) [13:04:40] 06serviceops, 10MW-on-K8s, 06SRE, 13Patch-For-Review: Reclaim jobrunner hardware for k8s - https://phabricator.wikimedia.org/T354791#10444253 (10akosiaris) I 've updated the list of servers to mark out some that are to be decommissioned, namely the ones in {T383226} [13:21:54] 06serviceops, 10Prod-Kubernetes, 07Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10444330 (10ops-monitoring-bot) pool host wikikube-worker[1088-1092].eqiad.wmnet by kamila@cumin1002 with reason: None [13:21:56] 06serviceops, 10Prod-Kubernetes, 07Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10444331 (10ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by kamila@cumin1002 pool for host wikikube-worker[1088-1092].eqiad.wmnet completed: - w... [13:26:27] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: Relabel eqiad kubernetes nodes - https://phabricator.wikimedia.org/T383213#10444364 (10kamila) [13:58:37] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444460 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2192.codfw.wmnet with OS bookworm... 
[14:00:22] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444467 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2193.codfw.wmnet with OS bookworm... [14:05:42] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444487 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2194.codfw.wmnet with OS bookworm... [14:11:19] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444526 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2193.codfw.wmnet with OS book... [14:12:12] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444530 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2194.codfw.wmnet with OS book... [14:39:58] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1081.eqiad.wmnet - https://phabricator.wikimedia.org/T381878#10444634 (10Jelto) >>! In T381878#10441783, @Jclark-ctr wrote: > @Jelto i performed flea power drain and look... [14:41:28] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444644 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jelto@cumin1002 for host wikikube-worker2192.codfw.wmnet with OS book... 
[14:47:43] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1081.eqiad.wmnet - https://phabricator.wikimedia.org/T381878#10444684 (10Jclark-ctr) @Jelto i am going to start flea power draining them and reimaging them wanted to try... [14:48:02] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1081.eqiad.wmnet - https://phabricator.wikimedia.org/T381878#10444685 (10Jclark-ctr) 05Open→03Resolved [14:52:37] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10444703 (10ops-monitoring-bot) depool host mw[1457-1459].eqiad.wmnet by kamila@cumin1002 with reason: Renaming nodes [14:52:53] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444704 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2193.codfw.wmnet with OS bookworm... [14:54:26] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10444709 (10ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by kamila@cumin1002 depool for host mw[1457-1459].eqiad.wmnet com... [14:56:58] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444719 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2194.codfw.wmnet with OS bookworm... 
[14:58:25] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1073.eqiad.wmnet - https://phabricator.wikimedia.org/T381789#10444725 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host w... [15:10:58] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444803 (10Jelto) [15:15:15] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1073.eqiad.wmnet - https://phabricator.wikimedia.org/T381789#10444826 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host w... [15:16:46] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10444832 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw1458 to wikikube-worker1094 completed: - mw145... [15:18:42] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10444839 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw1459 to wikikube-worker1095 completed: - mw145... [15:20:06] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444841 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jelto@cumin1002 for host wikikube-worker2192.codfw.wmnet with OS bookworm... 
[15:21:18] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10444845 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by kamila@cumin1002 from mw1457 to wikikube-worker1093 completed: - mw145... [15:23:28] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10444855 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1093.eqiad.wmnet with OS... [15:23:40] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10444856 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1094.eqiad.wmnet with OS... [15:24:00] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10444859 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1095.eqiad.wmnet with OS... [15:27:29] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444883 (10ops-monitoring-bot) pool host wikikube-worker[2193-2194].codfw.wmnet by jelto@cumin1002 with reason: None [15:27:31] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10444884 (10ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by jelto@cumin1002 pool for host wikikube-worker[2193-2194].codfw.wmn... 
[15:28:25] 06serviceops, 06DC-Ops, 10ops-codfw, 10Prod-Kubernetes, 07Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T383341 (10Jelto) 03NEW [15:31:36] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1069.eqiad.wmnet - https://phabricator.wikimedia.org/T381770#10444916 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host w... [15:39:49] 06serviceops, 10Prod-Kubernetes, 07Kubernetes: roll-reimage cookbook should check that we are not reimaging all nodes in a taint group - https://phabricator.wikimedia.org/T383342 (10kamila) 03NEW [15:39:52] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1073.eqiad.wmnet - https://phabricator.wikimedia.org/T381789#10444937 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for host wikik... [15:40:21] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1073.eqiad.wmnet - https://phabricator.wikimedia.org/T381789#10444954 (10Jclark-ctr) 05Open→03Resolved Reimaged passed with no issues [15:45:29] 06serviceops, 10Prod-Kubernetes, 07Kubernetes: Roll-reimage cookbook should lock on a per-node and per-taint group basis - https://phabricator.wikimedia.org/T383345 (10kamila) 03NEW [15:46:48] 06serviceops, 06collaboration-services, 06DC-Ops, 10ops-eqiad, and 3 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1243.eqiad.wmnet - https://phabricator.wikimedia.org/T383051#10445015 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for... 
[15:50:16] 06serviceops, 06All-and-every-Wikisource, 10Thumbor, 13Patch-For-Review, 07Wikimedia-Incident: Elevated 429 responses from Thumbor on codfw starting 2024-08-14 00:00 UTC - https://phabricator.wikimedia.org/T372470#10445020 (10hnowlan) 05Open→03Resolved Resolving this issue for now as there is fol... [15:56:14] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1073.eqiad.wmnet - https://phabricator.wikimedia.org/T381789#10445031 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for host w... [15:57:00] 06serviceops, 06DC-Ops, 10ops-eqiad, 06SRE: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1057.eqiad.wmnet - https://phabricator.wikimedia.org/T381676#10445032 (10Jclark-ctr) 05Open→03Resolved Reimaged server without issues. it was posted onto T381789 ticket by mistake [16:10:19] 06serviceops, 10Prod-Kubernetes, 07Kubernetes: roll-reimage cookbook could skip hosts already on the target OS - https://phabricator.wikimedia.org/T383346 (10kamila) 03NEW [16:11:31] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Cookbook to roll-reimage k8s nodes - https://phabricator.wikimedia.org/T377857#10445085 (10kamila) 05In progress→03Resolved Core functionality is merged 🎉 I have filed tasks for remaining fixes/improvements. [16:21:18] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1069.eqiad.wmnet - https://phabricator.wikimedia.org/T381770#10445122 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for host wikik... 
[16:21:30] 06serviceops, 06DC-Ops, 10ops-eqiad, 10Prod-Kubernetes, and 2 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1069.eqiad.wmnet - https://phabricator.wikimedia.org/T381770#10445123 (10Jclark-ctr) 05Open→03Resolved flea power drain and Reimaged server [16:21:32] 06serviceops, 10Prod-Kubernetes, 07Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445126 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1093.eqiad.wmnet with OS bookworm executed with er... [16:21:33] 06serviceops, 10Prod-Kubernetes, 07Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445127 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1094.eqiad.wmnet with OS bookworm executed with er... [16:21:35] 06serviceops, 10Prod-Kubernetes, 07Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445128 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1095.eqiad.wmnet with OS bookworm executed with er... [16:25:30] 06serviceops, 06MW-Interfaces-Team, 10RESTBase Sunsetting, 10MW-1.44-notes (1.44.0-wmf.4; 2024-11-19), 07User-notice: Switchover plan from RESTbase to REST Gateway for rest_v1/page/html and rest_v1/page/title endpoints - https://phabricator.wikimedia.org/T374683#10445152 (10HCoplin-WMF) [16:33:16] 06serviceops, 06collaboration-services, 06DC-Ops, 10ops-eqiad, and 3 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1243.eqiad.wmnet - https://phabricator.wikimedia.org/T383051#10445203 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for hos... 
[16:33:40] 06serviceops, 06collaboration-services, 06DC-Ops, 10ops-eqiad, and 3 others: hw troubleshooting: "Comm Error: backplane 0" for wikikube-worker1243.eqiad.wmnet - https://phabricator.wikimedia.org/T383051#10445206 (10Jclark-ctr) 05Open→03Resolved Reimaged passed with no issues [16:55:52] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10445297 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by elukey@cumin1002 for host wikikube-worker2022.c... [17:11:54] 06serviceops, 10Observability-Logging, 10MW-1.44-notes (1.44.0-wmf.6; 2024-12-03): PHP Warning seen by logspam-watch but not by mediawiki-errors logstash page - https://phabricator.wikimedia.org/T382517#10445350 (10TheDJ) Cool. I have updated https://logstash.wikimedia.org/app/dashboards#/view/6f07d740-f... [17:14:09] tgr: would like to deploy a patch to the mediawiki apache config, https://gerrit.wikimedia.org/r/c/operations/puppet/+/1109196 [17:14:48] would anyone be able to provide a more thorough review than me?
[17:16:52] tgr can wait until early next week to deploy [17:21:00] 06serviceops, 10Prod-Kubernetes, 07Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445386 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1093.eqiad.wmnet with OS bookworm [17:37:13] 06serviceops, 10Prod-Kubernetes, 07Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445441 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1094.eqiad.wmnet with OS bookworm [17:37:41] 06serviceops, 10Prod-Kubernetes, 07Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445444 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1095.eqiad.wmnet with OS bookworm [17:40:10] 06serviceops, 10Observability-Logging, 10MW-1.44-notes (1.44.0-wmf.6; 2024-12-03): PHP Warning seen by logspam-watch but not by mediawiki-errors logstash page - https://phabricator.wikimedia.org/T382517#10445447 (10Scott_French) Ah, thank you for doing so, @TheDJ! Also thanks for flagging those warnings... [17:41:17] 06serviceops, 06collaboration-services, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Migrate wikikube-codfw to containerd - https://phabricator.wikimedia.org/T377877#10445460 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin1002 for host wikikube-worker2022.codfw... 
[17:48:21] kamila_: 17:47:34 k8s main eqiad has >=255 nodes [17:48:23] 17:47:34 You cannot do this until https://phabricator.wikimedia.org/T375845 is fixed [17:48:25] 17:47:34 hieradata/common/kubernetes.yaml: FAILED [17:48:27] :O [17:48:32] yeah, just saw it [17:48:35] oops '^^ [17:48:52] hey it just saved us from much worse :) [17:53:00] indeed :D [17:55:44] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445492 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1093.eqiad.wmnet with OS boo... [17:55:48] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445493 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1094.eqiad.wmnet with OS boo... [17:55:57] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445494 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1095.eqiad.wmnet with OS boo... [17:58:25] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445510 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1093.eqiad.wmnet with OS... [17:58:27] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445511 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1094.eqiad.wmnet with OS... 
[17:58:35] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445512 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-worker1095.eqiad.wmnet with OS... [18:36:33] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445640 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1093.eqiad.wmnet with OS boo... [18:40:11] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445644 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1094.eqiad.wmnet with OS boo... [18:44:58] 06serviceops, 10Prod-Kubernetes, 07Kubernetes, 13Patch-For-Review: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10445651 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-worker1095.eqiad.wmnet with OS boo... [19:18:20] Hi! We (MW Platform) would like to deploy SUL3 to testwiki next week, which involves setting up a new domain which is slightly different from the normal wiki domains. [19:18:28] T377187 is the task; T363695 has the backstory for why we need such a domain, but that's probably not relevant. The essence is, we'd need to deploy a puppet change that's fairly simple but not a cookie-cutter "add another wiki domain" change, and it would be great if someone could review it. [19:19:14] Would it be possible to get some reviews on https://gerrit.wikimedia.org/r/c/operations/puppet/+/1109196 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/1099339/3 in the next few days? 
Is it OK if I assign them for deployment to one of the MW infrastructure windows next week? [19:19:24] thanks in advance! [19:44:11] 06serviceops, 07Datacenter-Switchover: Steady-state sizing of mw-web and mw-api-ext - https://phabricator.wikimedia.org/T376519#10445804 (10Scott_French) Alright, it has been about an hour since https://gerrit.wikimedia.org/r/1078481 was applied, and everything continues to look fine. As expected, there's a mo... [21:53:13] 06serviceops, 10Observability-Logging, 06SRE, 10WMF-General-or-Unknown: Re-consider ` >/dev/null 2>&1` as output of many cron'd MW maintenance scripts - https://phabricator.wikimedia.org/T187078#10446147 (10andrea.denisse) I think that having a list of the MW maintenance scripts that have this behavior wou...