[00:11:05] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4047.ulsfo.wmnet with OS bullseye executed with errors: - cp4047 (**FAIL**) - Downtimed on Ic...
[00:11:41] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4047.ulsfo.wmnet with OS bullseye
[00:16:02] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4047.ulsfo.wmnet with OS bullseye executed with errors: - cp4047 (**FAIL**) - Removed from Pu...
[00:16:15] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4047.ulsfo.wmnet with OS bullseye
[00:24:34] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4047.ulsfo.wmnet with OS bullseye executed with errors: - cp4047 (**FAIL**) - Removed from Pu...
[00:25:09] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4047.ulsfo.wmnet with OS bullseye
[01:11:04] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4047.ulsfo.wmnet with OS bullseye completed: - cp4047 (**PASS**) - Removed from Puppet and Pu...
[01:19:49] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[01:52:46] Traffic, DC-Ops, SRE, ops-eqsin: eqsin hosts are not rebooting when running sre.hosts.reimage cookbook - https://phabricator.wikimedia.org/T327812 (ssingh) >>! In T327812#8563361, @BCornwall wrote: > This is happening the first time I run the cookbooks on any of the newer servers. I've now adapte...
[02:02:36] Traffic, DC-Ops, SRE, ops-eqsin: eqsin hosts are not rebooting when running sre.hosts.reimage cookbook - https://phabricator.wikimedia.org/T327812 (ssingh) I guess my theory about the incorrect/outdated firmwares was incorrect. On `cp4047`: ` Integrated Dell Remote Access Controller 5.10.30.00...
[08:52:00] Traffic, netops, DBA, Data-Persistence, and 9 others: codfw row B switches upgrade - https://phabricator.wikimedia.org/T327991 (Marostegui)
[08:52:34] Traffic, netops, DBA, Data-Engineering, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (Marostegui)
[09:49:00] Traffic, netops, DBA, Data-Engineering, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (Jelto)
[09:54:44] Traffic, netops, DBA, Data-Persistence, and 9 others: codfw row B switches upgrade - https://phabricator.wikimedia.org/T327991 (Jelto)
[10:02:08] Traffic, netops, DBA, Data-Engineering, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (fgiunchedi)
[10:05:00] Traffic, netops, DBA, Data-Engineering, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (MatthewVernon)
[10:13:19] Traffic, netops, DBA, Data-Engineering, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (elukey)
[10:16:27] Traffic, netops, DBA, Data-Persistence, and 9 others: codfw row B switches upgrade - https://phabricator.wikimedia.org/T327991 (elukey)
[10:18:56] Traffic, netops, DBA, Data-Engineering, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (fgiunchedi)
[10:30:08] Traffic, netops, DBA, Data-Persistence, and 9 others: codfw row B switches upgrade - https://phabricator.wikimedia.org/T327991 (fgiunchedi)
[10:30:52] Traffic, netops, DBA, Data-Persistence, and 9 others: codfw row B switches upgrade - https://phabricator.wikimedia.org/T327991 (fgiunchedi)
[10:33:16] Traffic, netops, DBA, Data-Persistence, and 9 others: codfw row B switches upgrade - https://phabricator.wikimedia.org/T327991 (MatthewVernon)
[10:38:56] netops, Infrastructure-Foundations, SRE: Decom flowspec1001 - https://phabricator.wikimedia.org/T328009 (ayounsi) Open→Resolved
[10:39:28] netops, Infrastructure-Foundations, SRE, ops-codfw: codfw: Relocate servers to make space for new switches in rowA and rowB - https://phabricator.wikimedia.org/T326564 (ayounsi)
[10:39:36] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (ayounsi)
[10:40:04] netops, Infrastructure-Foundations, SRE, ops-codfw: codfw: Relocate servers to make space for new switches in rowA and rowB - https://phabricator.wikimedia.org/T326564 (ayounsi)
[14:27:38] netops, Infrastructure-Foundations, SRE, ops-drmrs: cr2-drmrs:xe-0/1/1 stuck optic - https://phabricator.wikimedia.org/T324555 (RobH) Open→In progress Neglected to do this earlier this week, I have the photos so I'll work on this today.
[14:32:23] Traffic, Infrastructure-Foundations, SRE: Feature request: sre.hardware.upgrade-firmware should allow option to defer NIC firmware installation to next reboot - https://phabricator.wikimedia.org/T323717 (jbond) >>! In T323717#8560745, @ssingh wrote: > ` > iDrac shouldn't upgrade to 6.00.00.00 (b...
[14:34:30] Traffic, Infrastructure-Foundations, SRE: Feature request: sre.hardware.upgrade-firmware should allow option to defer NIC firmware installation to next reboot - https://phabricator.wikimedia.org/T323717 (ssingh) >>! In T323717#8564660, @jbond wrote: >>>! In T323717#8560745, @ssingh wrote: > >> ` >>...
[14:35:50] Traffic, Infrastructure-Foundations, SRE: Feature request: sre.hardware.upgrade-firmware should allow option to defer NIC firmware installation to next reboot - https://phabricator.wikimedia.org/T323717 (jbond) Hi ssingh , i have just tested this by trying to upgrade the bios and the nic with only o...
[14:36:46] Traffic, Infrastructure-Foundations, SRE: Feature request: sre.hardware.upgrade-firmware should allow option to defer NIC firmware installation to next reboot - https://phabricator.wikimedia.org/T323717 (jbond) >>! In T323717#8564666, @ssingh wrote: >>>! In T323717#8564660, @jbond wrote: >>>>! In T32...
[14:38:14] Traffic, Infrastructure-Foundations, SRE: Feature request: sre.hardware.upgrade-firmware should allow option to defer NIC firmware installation to next reboot - https://phabricator.wikimedia.org/T323717 (ssingh) >>! In T323717#8564670, @jbond wrote: > Hi ssingh , > > i have just tested this by tryin...
[14:43:31] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp2027.codfw.wmnet with OS bullseye
[14:53:28] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp4040.ulsfo.wmnet with OS bullseye
[15:12:51] netops, Infrastructure-Foundations, SRE, ops-eqiad: eqiad: Move links to new MPC7E linecard - https://phabricator.wikimedia.org/T304712 (Jclark-ctr) Open→Resolved Netbox is updated
[15:13:00] netops, Infrastructure-Foundations, SRE, ops-eqiad, Sustainability (Incident Followup): eqiad: upgrade row C and D uplinks from 4x10G to 1x40G - https://phabricator.wikimedia.org/T313463 (Jclark-ctr)
[15:22:25] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp2027.codfw.wmnet with OS bullseye completed: - cp2027 (**PASS**) - Downtimed on Icinga/Alertm...
[15:31:46] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[15:39:55] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp4040.ulsfo.wmnet with OS bullseye completed: - cp4040 (**PASS**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[15:41:55] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[17:16:03] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4048.ulsfo.wmnet with OS bullseye
[17:28:37] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4048.ulsfo.wmnet with OS bullseye executed with errors: - cp4048 (**FAIL**) - Downtimed on Icinga/Alertmanager -...
[17:28:51] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4048.ulsfo.wmnet with OS bullseye
[17:40:46] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney) >>! In T327938#8560262, @ayounsi wrote: > * public1-a/b-codfw host might be better grouped in a single rack per row, providing still redundancy (4 racks per sit...
[17:48:37] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw1-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (cmooney) >>! In T327919#8560080, @ayounsi wrote: > @Papaul could you rename (Netbox, label, console, et...
[18:12:29] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (ayounsi) > I've no particular preference, if I were doing it myself probably a slight one for the CWDM4/LC links, but happy to go with whatever the consensus/cheapest is...
[18:15:05] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4048.ulsfo.wmnet with OS bullseye completed: - cp4048 (**PASS**) - Removed from Puppet and PuppetDB if present -...
[18:24:56] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[18:25:12] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4041.ulsfo.wmnet with OS bullseye
[18:37:10] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4041.ulsfo.wmnet with OS bullseye executed with errors: - cp4041 (**FAIL**) - Downtimed on Icinga/Alertmanager -...
[18:37:28] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4041.ulsfo.wmnet with OS bullseye
[18:43:00] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw1-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (Papaul) @cmooney this looks good to me just one question. Is it possible to use xe-0/0/[46-47] for the...
[19:28:17] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4041.ulsfo.wmnet with OS bullseye completed: - cp4041 (**PASS**) - Removed from Puppet and PuppetDB if present -...
[19:32:44] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[19:33:03] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4049.ulsfo.wmnet with OS bullseye
[19:38:23] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4049.ulsfo.wmnet with OS bullseye executed with errors: - cp4049 (**FAIL**) - Downtimed on Icinga/Alertmanager -...
[19:39:05] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4049.ulsfo.wmnet with OS bullseye
[20:03:02] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4049.ulsfo.wmnet with OS bullseye executed with errors: - cp4049 (**FAIL**) - Removed from Puppet and PuppetDB if p...
[20:06:04] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4049.ulsfo.wmnet with OS bullseye
[20:56:25] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4049.ulsfo.wmnet with OS bullseye completed: - cp4049 (**WARN**) - Removed from Puppet and PuppetDB if present -...
[21:51:04] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[21:51:26] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4042.ulsfo.wmnet with OS bullseye
[21:59:24] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4042.ulsfo.wmnet with OS bullseye executed with errors: - cp4042 (**FAIL**) - Downtimed on Icinga/Alertmanager -...
[22:00:46] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4042.ulsfo.wmnet with OS bullseye
[22:46:56] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4042.ulsfo.wmnet with OS bullseye completed: - cp4042 (**PASS**) - Removed from Puppet and PuppetDB if present -...
[23:22:20] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[23:23:09] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4050.ulsfo.wmnet with OS bullseye
[23:31:25] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp4050.ulsfo.wmnet with OS bullseye executed with errors: - cp4050 (**FAIL**) - Downtimed on Icinga/Alertmanager -...
[23:31:40] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp4050.ulsfo.wmnet with OS bullseye