[01:51:39] Wikimedia-Apache-configuration, DNS, SRE, Traffic-Icebox, Patch-For-Review: Remove aliases `minnan` and `zh-cfr` for the Min Nan Wikipedia - https://phabricator.wikimedia.org/T230382 (Sotiale) Hi. Langcom is discussing this and is wondering how we can respond to the existing interwiki content...
[01:55:15] Wikimedia-Apache-configuration, DNS, SRE, Traffic-Icebox, Patch-For-Review: Remove aliases `minnan` and `zh-cfr` for the Min Nan Wikipedia - https://phabricator.wikimedia.org/T230382 (Ladsgroup) You can use global search: https://global-search.toolforge.org/?q=%22%5B%5B%3Aminnan%3A%22&namespa...
[07:59:59] netops, Infrastructure-Foundations, SRE: eqiad/codfw virtual-chassis upgrades - https://phabricator.wikimedia.org/T327248 (ayounsi)
[08:00:07] netops, Infrastructure-Foundations, SRE: Upgrade network devices to Junos 20+ - https://phabricator.wikimedia.org/T316539 (ayounsi)
[08:04:28] netops, Infrastructure-Foundations: Use mgmt_junos on all network devices - https://phabricator.wikimedia.org/T327862 (ayounsi) p:Triage→Low
[08:04:43] netops, Infrastructure-Foundations: Use mgmt_junos on all network devices - https://phabricator.wikimedia.org/T327862 (ayounsi)
[08:04:51] netops, Infrastructure-Foundations, SRE: Upgrade network devices to Junos 20+ - https://phabricator.wikimedia.org/T316539 (ayounsi)
[08:34:49] netops, Infrastructure-Foundations, SRE: eqiad/codfw virtual-chassis upgrades - https://phabricator.wikimedia.org/T327248 (ayounsi)
[08:51:24] netops, Infrastructure-Foundations, SRE: Use mgmt_junos on all network devices - https://phabricator.wikimedia.org/T327862 (Peachey88)
[10:43:39] hnowlan: did you complete the lvs restarts?
[10:46:56] vgutierrez: I didn't
[10:47:06] + manage to get time yesterday - I would like to now if that suits though
[10:50:27] sure
[10:54:42] doing lvs2010 now
[10:56:28] ack
[10:57:00] looking good :)
[10:57:22] phew!
[10:57:41] secondary LVS in eqiad is lvs1020
[11:00:04] cool
[11:00:36] restarting now
[11:02:09] done. seems okay
[11:02:42] yep
[11:02:50] active LVS for thumbor in codfw is lvs2009
[11:02:57] lvs1019 in eqiad
[11:05:44] cool :)
[11:08:28] doing lvs2009
[11:09:06] done
[11:10:24] looking good
[11:12:00] cool, doing lvs1019
[11:12:52] done
[11:13:57] thanks a lot vgutierrez!
[11:14:43] cheers
[12:15:10] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp4037.ulsfo.wmnet with OS bullseye
[13:11:45] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp4037.ulsfo.wmnet with OS bullseye executed with errors: - cp4037 (**FAIL**) - Downtimed on Icinga/Alertmanager -...
[14:07:40] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp4037.ulsfo.wmnet with OS bullseye
[14:30:07] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp4045.ulsfo.wmnet with OS bullseye
[14:57:25] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp4037.ulsfo.wmnet with OS bullseye completed: - cp4037 (**PASS**) - Removed from Puppet and PuppetDB if present -...
[14:59:35] netops, Cloud-Services, Infrastructure-Foundations, SRE: Configure cloudsw-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (cmooney) Open→In progress p:Triage→Medium
[15:05:51] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[15:07:33] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp2031.codfw.wmnet with OS bullseye
[15:15:08] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp4045.ulsfo.wmnet with OS bullseye completed: - cp4045 (**PASS**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[15:18:27] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[15:21:39] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp4038.ulsfo.wmnet with OS bullseye
[15:33:32] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp2031.codfw.wmnet with OS bullseye executed with errors: - cp2031 (**FAIL**) - Downtimed on Icinga/Alertmanager -...
[15:33:55] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp2031.codfw.wmnet with OS bullseye
[15:38:30] netops, Cloud-VPS, Infrastructure-Foundations, SRE, cloud-services-team: Upgrade cloudsw1-c8-eqiad and cloudsw1-d5-eqiad to Junos 20+ - https://phabricator.wikimedia.org/T316544 (cmooney) Folks I was considering doing these upgrades on the following dates: cloudsw1-c8-eqiad - Monday February...
[15:41:13] netops, Infrastructure-Foundations: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (ayounsi) p:Triage→High
[15:41:27] netops, Infrastructure-Foundations: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (ayounsi)
[15:41:35] netops, Infrastructure-Foundations, SRE: eqiad/codfw virtual-chassis upgrades - https://phabricator.wikimedia.org/T327248 (ayounsi)
[15:43:26] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp2031.codfw.wmnet with OS bullseye executed with errors: - cp2031 (**FAIL**) - Removed from Puppet and PuppetDB if p...
[15:45:56] netops, Cloud-Services, Infrastructure-Foundations, SRE: Configure cloudsw-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (aborrero) The plan outlined in the task description LGTM.
[15:46:15] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (aborrero)
[15:46:55] Traffic, netops, Data-Engineering, Data-Persistence, and 9 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (ayounsi)
[15:47:45] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp4038.ulsfo.wmnet with OS bullseye executed with errors: - cp4038 (**FAIL**) - Downtimed on Icinga/Alertmanager -...
[15:48:13] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp4038.ulsfo.wmnet with OS bullseye
[15:53:09] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (cmooney)
[15:56:01] Traffic, netops, DBA, Data-Engineering, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (Marostegui) I'll check our db-related hosts and I'll get back to you tomorrow
[16:02:52] netops, Cloud-VPS, Infrastructure-Foundations, SRE, cloud-services-team: Upgrade cloudsw1-c8-eqiad and cloudsw1-d5-eqiad to Junos 20+ - https://phabricator.wikimedia.org/T316544 (fnegri) I think those dates are fine, cc @dcaro -- let's discuss the best way to reduce impact on Ceph (downtime,...
[16:03:02] netops, Infrastructure-Foundations, SRE, fundraising-tech-ops: Upgrade fasw to Junos 21 - https://phabricator.wikimedia.org/T316542 (Papaul)
[16:03:10] netops, Infrastructure-Foundations, SRE: Upgrade network devices to Junos 20+ - https://phabricator.wikimedia.org/T316539 (Papaul)
[16:03:44] netops, Infrastructure-Foundations, SRE, fundraising-tech-ops: Upgrade fasw to Junos 21 - https://phabricator.wikimedia.org/T316542 (Papaul) Open→Resolved This is complete.
[16:06:52] netops, Infrastructure-Foundations, SRE, fundraising-tech-ops: Set consistent MTUs - https://phabricator.wikimedia.org/T315838 (ayounsi) Open→Resolved All done!
[16:06:59] Traffic, SRE, Traffic-Icebox, WMF-General-or-Unknown, and 2 others: Pages whose title ends with semicolon (;) are intermittently inaccessible (likely due to ATS) - https://phabricator.wikimedia.org/T238285 (BBlack) With the merge above, I think this issue is at least mitigated for now. It's not...
[16:09:52] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp2031.codfw.wmnet with OS bullseye
[16:24:16] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp6010.drmrs.wmnet with OS bullseye
[16:32:54] netops, Infrastructure-Foundations, SRE: Is Vlan 2122 cloud-support1-b-codfw required? - https://phabricator.wikimedia.org/T327930 (cmooney) p:Triage→Low
[16:33:22] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp4038.ulsfo.wmnet with OS bullseye completed: - cp4038 (**PASS**) - Removed from Puppet and PuppetDB if present -...
[16:40:30] Traffic, DC-Ops, SRE, ops-eqsin: eqsin hosts are not rebooting when running sre.hosts.reimage cookbook - https://phabricator.wikimedia.org/T327812 (Volans) This task was brought to my attention by @ssingh today because `cp4037` did the same. It was reimaged first around `12:15` and it failed, and...
[16:51:55] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp2031.codfw.wmnet with OS bullseye completed: - cp2031 (**PASS**) - Removed from Puppet and PuppetDB if present -...
[16:54:07] Traffic, DC-Ops, SRE, ops-eqsin: eqsin hosts are not rebooting when running sre.hosts.reimage cookbook - https://phabricator.wikimedia.org/T327812 (ssingh) Thanks for the response @Volans! >>! In T327812#8558090, @Volans wrote: > This task was brought to my attention by @ssingh today because `cp...
[16:58:04] Traffic, DC-Ops, SRE, ops-eqsin: eqsin hosts are not rebooting when running sre.hosts.reimage cookbook - https://phabricator.wikimedia.org/T327812 (Volans) >>! In T327812#8558155, @ssingh wrote: > I am surprised, so the above output is for cp4037? Because we certainly didn't reboot it and in any...
[17:07:20] netops, Infrastructure-Foundations, SRE: Is Vlan 2122 cloud-support1-b-codfw required? - https://phabricator.wikimedia.org/T327930 (aborrero) I guess this was set up to mirror the eqiad setting. Since this VLAN as no room in the new network model (described [[ https://wikitech.wikimedia.org/wiki/Wiki...
[17:08:00] netops, Infrastructure-Foundations, SRE: Is Vlan 2122 cloud-support1-b-codfw required? - https://phabricator.wikimedia.org/T327930 (aborrero)
[17:08:23] netops, Infrastructure-Foundations, SRE: Automate EVPN switch underlay BGP neighbor peerings - https://phabricator.wikimedia.org/T327934 (cmooney) p:Triage→Medium
[17:09:31] netops, Cloud-VPS, Infrastructure-Foundations, SRE, cloud-services-team: Upgrade cloudsw1-c8-eqiad and cloudsw1-d5-eqiad to Junos 20+ - https://phabricator.wikimedia.org/T316544 (aborrero) >>! In T316544#8557796, @cmooney wrote: > Folks I was considering doing these upgrades on the following...
[17:10:39] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (aborrero)
[17:41:04] Traffic, DC-Ops, SRE, ops-eqsin: eqsin hosts are not rebooting when running sre.hosts.reimage cookbook - https://phabricator.wikimedia.org/T327812 (ssingh) I just went through the logs now: ` Timestamp = 2023-01-25 14:07:50 Message = The server power action is initiated because the...
[17:55:09] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney) p:Triage→Medium
[17:56:07] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)
[17:58:41] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)
[17:58:48] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp6010.drmrs.wmnet with OS bullseye completed: - cp6010 (**PASS**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[18:00:28] netops, Infrastructure-Foundations, SRE: Is Vlan 2122 cloud-support1-b-codfw required? - https://phabricator.wikimedia.org/T327930 (cmooney) Thanks for the feedback @aborrero. I'll plan on getting it decommissioned.
[18:07:08] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)
[18:10:02] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)
[18:14:33] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[18:15:01] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp6002.drmrs.wmnet with OS bullseye
[18:42:17] Traffic, netops, DBA, Data-Engineering, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (RKemper)
[18:45:20] Traffic, netops, DBA, Data-Engineering, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (RKemper)
[18:50:43] Traffic, netops, DBA, Data-Engineering, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (LSobanski)
[19:00:34] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp6002.drmrs.wmnet with OS bullseye completed: - cp6002 (**PASS**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[19:02:40] netops, Cloud-VPS, Infrastructure-Foundations, SRE, cloud-services-team: Upgrade cloudsw1-c8-eqiad and cloudsw1-d5-eqiad to Junos 20+ - https://phabricator.wikimedia.org/T316544 (cmooney) >>! In T316544#8558224, @aborrero wrote: >>>! In T316544#8557796, @cmooney wrote: >> Folks I was consider...
[19:02:52] netops, Cloud-VPS, Infrastructure-Foundations, SRE, cloud-services-team: Upgrade cloudsw1-c8-eqiad and cloudsw1-d5-eqiad to Junos 20+ - https://phabricator.wikimedia.org/T316544 (cmooney)
[19:07:17] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[19:12:45] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp6011.drmrs.wmnet with OS bullseye
[19:58:18] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp6011.drmrs.wmnet with OS bullseye completed: - cp6011 (**PASS**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[20:05:48] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[20:10:27] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp6003.drmrs.wmnet with OS bullseye
[20:59:42] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp6003.drmrs.wmnet with OS bullseye completed: - cp6003 (**WARN**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[22:20:32] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (Peachey88)
[22:20:50] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[22:21:27] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp6012.drmrs.wmnet with OS bullseye
[22:26:35] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)
[23:07:26] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp6012.drmrs.wmnet with OS bullseye completed: - cp6012 (**PASS**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[23:14:32] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[23:15:03] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin1001 for host cp6004.drmrs.wmnet with OS bullseye
[23:57:16] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin1001 for host cp6004.drmrs.wmnet with OS bullseye completed: - cp6004 (**PASS**) - Downtimed on Icinga/Alertmanager - Disabled Pu...
[23:58:20] Traffic, SRE: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)