[07:45:07] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (Marostegui)
[07:50:23] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (Marostegui)
[08:23:32] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (Marostegui)
[08:26:54] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (Marostegui)
[08:29:02] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (Marostegui)
[08:31:33] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (Marostegui)
[08:52:41] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (MoritzMuehlenhoff)
[08:54:01] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (elukey)
[08:59:52] Traffic, DBA, Data Pipelines, Data-Engineering-Planning, and 9 others: eqiad row B switches upgrade - https://phabricator.wikimedia.org/T330165 (Marostegui)
[11:14:38] vgutierrez: hey, quick one, I see on the row A maintenance task it says dns1001 should be removed from authdns_servers prior to the work
[11:14:51] (sorry to pick on you, timezone :))
[11:14:56] no problem :)
[11:15:18] Is that a simple matter of commenting it out in puppet/hieradata and merging the patch?
[11:15:50] indeed
[11:16:28] cool, are you happy for me to do that then?
[11:16:44] right now? :)
[11:16:47] can it be done a good time before the work? or should we minimise the time it isn't in the pool?
[11:17:01] we try to minimize the time
[11:17:06] ok
[11:17:52] there is already a patch for it https://gerrit.wikimedia.org/r/c/operations/puppet/+/894654
[11:18:11] ah super :)
[11:19:00] ok so closer to the time either I merge or one of you can, sounds simple enough
[11:19:02] thanks!
[11:25:02] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (aborrero)
[11:25:42] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (aborrero)
[11:26:07] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (aborrero) Sent a ping to @Marostegui regarding clouddb[1013-1014,1021] Also @Andrew regarding cloudservices host, but I think the host can be taken...
[11:29:04] topranks: BTW... I've noticed that BGP / BFD isn't super happy on dns1001
[11:29:19] hmm ok let me have a look
[11:29:27] when you say "isn't super happy"..... ?
[11:29:34] Mar 07 11:19:36 dns1001 bird[724]: bgp1: Received: Unknown error 6.9: 060a
[11:29:34] Mar 07 11:20:12 dns1001 bird[724]: bfd1: Bad packet from 208.80.154.196 - unknown session id (1237539074)
[11:29:39] happening since March 3rd at 15:14
[11:31:27] Indeed yep, session to CR1 is down
[11:31:33] https://www.irccloud.com/pastebin/elYzxQQl/
[11:32:49] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (Marostegui) @aborrero regarding clouddb* hosts, it is up to your team but I think it would be nice if you could depool them. Better user experience f...
[11:33:48] Oddly the CR shows the BFD session as being up.
[11:35:19] nothing outstanding on SAL involving dns1001 besides being reimaged to bullseye on March 2nd
[11:35:33] but the other dns servers are happy and also running bullseye
[11:37:14] yeah it's odd. one unfortunate element of the config we have here is we have a single "bfd" protocol defined for bird
[11:37:21] it's fine, works ok
[11:37:43] but means if I issue a "restart" it'll restart both the non-working one and the one (to CR2) that is up
[11:37:53] which will likely be disruptive, if only brief :(
[11:38:40] I'm looking at such options as I don't see an issue, CR reports the session as up which means BFD packets being sent and received
[11:39:27] hmm I can restart bird on dns1001 and see what happens
[11:40:01] let me restart bfd protocol only, it should be less disruptive than a full systemd unit restart
[11:40:30] dns1001 is sending BFD packets to CR1, and has status 'up' in them...
[11:40:30] https://phabricator.wikimedia.org/P45194
[11:41:30] ok BGP is back up now
[11:41:43] I restarted BFD (both) and BGP (to CR1)
[11:42:43] For ref how to do this from 'birdc' console: https://phabricator.wikimedia.org/P45195
[11:43:46] cheers
[11:44:14] so I guess that sukhe failed to mention stopping bird.service on dns1001 as well
[11:44:24] (on the eqiad row A task)
[11:44:51] we take care of that before the maint. window
[11:55:13] vgutierrez: and https://gerrit.wikimedia.org/r/c/operations/puppet/+/894654 too, which I will merge soon
[11:55:45] topranks: these seem similar to the errors on doh1001 if you remember, we discussed them
[11:55:59] nothing much came out of that too because we couldn't pinpoint what was wrong
[11:59:01] sukhe: this one looks a little different to those occasions I think
[11:59:04] similar too
[11:59:16] but on those occasions the CR reported the session down also
[11:59:41] whereas here the CR reported it up, but bird saying it was down, which is a little strange
[11:59:58] It's supposed to be "bi-directional", so either both should be down or both up
[12:00:19] it was designed to avoid a scenario where one side thought things were ok and the other not
[12:00:29] so BFD was down but BGP was up?
[12:01:10] No, BGP was down, because Bird saw the BFD as being down, and thus rejected the CR's BGP packets
[12:01:15] But oddly the CR saw the BFD as up
[12:01:34] And bird was sending BFD packets saying "my session is up", definitely seems like a bug on the Bird side
[12:01:50] o_O
[12:02:05] Traffic, SRE, Wikidata, wdwb-tech: Wikidata seems to still be utilizing insecure HTTP URIs - https://phabricator.wikimedia.org/T331356 (Bugreporter) See also: {T226453} {T153563}
[12:04:26] sukhe: I think for now we should just keep an eye on it, see if it happens again, if it does we need to gather more data when it happens
[12:04:43] ok yeah
[12:05:52] thanks for filing the task, nevertheless
[12:06:21] yeah that's just an improvement we could do regardless, could be useful
[12:06:34] I'll see what Arzhel thinks when he is back next week
[12:06:49] sukhe: you also answered the question I had for you I think, you will merge the patch to remove dns1001 before the network maintenance later?
[12:07:05] topranks: yes! I will merge it shortly and have everything ready
[12:07:13] I prepped the patch yesterday but plan to merge it a bit closer to the event
[12:07:33] ok great! let me know when it's done or if there are any problems, feel free to do it as close to the time as you want :)
[12:07:39] thanks
[12:47:37] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (BTullis)
[12:55:17] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (fnegri) @Marostegui @aborrero the patch above should depool clouddb1013 and clouddb1014. I don't think clouddb1021 can be depooled easily as it look...
[12:57:43] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (BTullis)
[12:58:03] topranks: all done from Traffic :)
[12:58:15] sukhe: great thanks for confirming !
[13:46:33] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (cmooney)
[13:52:47] Traffic, MediaWiki-File-management, SRE, MW-1.40-notes (1.40.0-wmf.27; 2023-03-13), and 2 others: Remove IEContentAnalyzer - https://phabricator.wikimedia.org/T309787 (Jdforrester-WMF) In progress→Resolved
[13:56:54] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (BTullis)
[13:58:38] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (jbond)
[13:59:13] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (jbond)
[13:59:49] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (MatthewVernon)
[14:10:02] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=f4ffc353-a529-4620-994f-ae7b737f3c7a) set by cmooney@cumin1001 for 2:00:00 on 238 ho...
[14:17:13] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 10 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=0a07bba2-0f50-4eec-9718-0c768add34f3) set by cmooney@cumin1001 for 2:00:00 on 1 host...
[14:42:06] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 10 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (jbond)
[14:50:22] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 10 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (cmooney) Happy to say the upgrade went as expected, no issues encountered. All devices now back online running 21.4R3-S1.5.
[14:52:35] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 10 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (Andrew) the following hosts paged during this maintenance: ` NodeDown wmcs cloudvirt1023:9100 (node eqiad) NodeDown wmcs cloudvirt1024:9100 (node e...
[14:54:41] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 10 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (MoritzMuehlenhoff)
[14:59:00] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 10 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (BTullis)
[15:07:03] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum1001.eqiad.wmnet with OS bullseye
[15:10:48] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (cmooney) >>! In T329073#8672931, @Andrew wrote: > the following hosts paged during this maintenance: > > > ` > NodeDown wmcs cloudvirt1023:9100 (no...
[15:13:26] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (MoritzMuehlenhoff)
[15:19:27] netops, Infrastructure-Foundations, SRE: eqiad/codfw virtual-chassis upgrades - https://phabricator.wikimedia.org/T327248 (cmooney)
[15:25:37] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[15:26:34] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir5001.eqsin.wmnet with OS bullseye
[15:27:54] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[15:31:08] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum1001.eqiad.wmnet with OS bullseye completed: - durum1001 (**PASS**) - Downtimed on Icinga...
[15:31:19] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (BTullis)
[15:35:16] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (Papaul)
[15:37:07] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum1002.eqiad.wmnet with OS bullseye
[15:38:42] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (Papaul)
[15:50:44] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (Papaul)
[15:51:58] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (Papaul) @cmooney I update the table with lengths between all the racks.
[16:04:25] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum1002.eqiad.wmnet with OS bullseye completed: - durum1002 (**PASS**) - Downtimed on Icinga...
[16:06:40] Traffic, DBA, Data-Engineering-Planning, Data-Persistence, and 11 others: eqiad row A switches upgrade - https://phabricator.wikimedia.org/T329073 (colewhite)
[16:07:00] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (Papaul)
[16:22:04] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by brett@cumin2002 for host ncredir5001.eqsin.wmnet with OS bullseye completed: - ncredir5001 (**PASS**) - Downtimed on Ic...
[16:24:21] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[16:24:26] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[16:25:20] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum2001.codfw.wmnet with OS bullseye
[16:28:00] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[16:52:31] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum2001.codfw.wmnet with OS bullseye completed: - durum2001 (**PASS**) - Downtimed on Icinga...
[16:53:27] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[16:53:58] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum2002.codfw.wmnet with OS bullseye
[16:57:12] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum3001.esams.wmnet with OS bullseye
[17:22:15] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir5002.eqsin.wmnet with OS bullseye
[17:23:54] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum2002.codfw.wmnet with OS bullseye completed: - durum2002 (**PASS**) - Downtimed on Icinga...
[17:31:22] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum3001.esams.wmnet with OS bullseye completed: - durum3001 (**PASS**) - Downtimed on Icinga...
[17:40:07] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[17:40:22] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[17:40:37] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum3002.esams.wmnet with OS bullseye
[18:16:26] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by brett@cumin2002 for host ncredir5002.eqsin.wmnet with OS bullseye completed: - ncredir5002 (**PASS**) - Downtimed on Ic...
[18:17:34] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[18:18:05] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum3002.esams.wmnet with OS bullseye completed: - durum3002 (**PASS**) - Downtimed on Icinga...
[18:19:24] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum5001.eqsin.wmnet with OS bullseye
[18:20:36] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum6001.drmrs.wmnet with OS bullseye
[18:39:48] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir6001.drmrs.wmnet with OS bullseye
[18:48:28] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum4001.ulsfo.wmnet with OS bullseye completed: - durum4001 (**PASS**) - Downtimed on Icinga...
[18:51:38] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[18:56:26] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum6001.drmrs.wmnet with OS bullseye executed with errors: - durum6001 (**FAIL**) - Downtime...
[18:59:57] netops, Infrastructure-Foundations, SRE, ops-codfw: Run 2x1G links from asw-b1-codfw to cloudsw1-b1-codfw - https://phabricator.wikimedia.org/T331470 (cmooney) p:Triage→Low
[19:00:24] netops, Infrastructure-Foundations, SRE, ops-codfw: Run 2x1G links from asw-b1-codfw to cloudsw1-b1-codfw - https://phabricator.wikimedia.org/T331470 (cmooney)
[19:00:32] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw1-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (cmooney)
[19:01:46] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum5001.eqsin.wmnet with OS bullseye completed: - durum5001 (**PASS**) - Downtimed on Icinga...
[19:03:11] netops, Infrastructure-Foundations, SRE, ops-codfw: Run 2x1G links from asw-b1-codfw to cloudsw1-b1-codfw - https://phabricator.wikimedia.org/T331470 (cmooney)
[19:03:28] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[19:03:38] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[19:04:20] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum6001.drmrs.wmnet with OS bullseye
[19:06:29] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum4002.ulsfo.wmnet with OS bullseye
[19:06:41] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum5002.eqsin.wmnet with OS bullseye
[19:21:32] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by brett@cumin2002 for host ncredir6001.drmrs.wmnet with OS bullseye executed with errors: - ncredir6001 (**FAIL**) - Down...
[19:30:25] netops, Infrastructure-Foundations, SRE, ops-codfw: Run 2x1G links from asw-b1-codfw to cloudsw1-b1-codfw - https://phabricator.wikimedia.org/T331470 (Peachey88)
[19:36:03] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum4002.ulsfo.wmnet with OS bullseye completed: - durum4002 (**PASS**) - Downtimed on Icinga...
[19:36:16] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum6001.drmrs.wmnet with OS bullseye executed with errors: - durum6001 (**FAIL**) - Downtime...
[19:38:40] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[19:40:49] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir6002.drmrs.wmnet with OS bullseye
[19:51:37] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum5002.eqsin.wmnet with OS bullseye completed: - durum5002 (**PASS**) - Downtimed on Icinga...
[20:20:09] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by brett@cumin2002 for host ncredir6002.drmrs.wmnet with OS bullseye executed with errors: - ncredir6002 (**FAIL**) - Down...
[20:22:43] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[20:35:45] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir3001.esams.wmnet with OS bullseye
[21:03:23] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir4001.ulsfo.wmnet with OS bullseye
[21:15:58] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by brett@cumin2002 for host ncredir3001.esams.wmnet with OS bullseye completed: - ncredir3001 (**PASS**) - Downtimed on Ic...
[21:17:20] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[21:17:45] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir3002.esams.wmnet with OS bullseye
[21:24:37] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ssingh)
[21:27:33] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by sukhe@cumin2002 for host durum6002.drmrs.wmnet with OS bullseye
[21:37:38] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by brett@cumin2002 for host ncredir4001.ulsfo.wmnet with OS bullseye completed: - ncredir4001 (**PASS**) - Downtimed on Ic...
[21:38:28] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[21:39:13] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir4002.ulsfo.wmnet with OS bullseye
[21:54:45] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by brett@cumin2002 for host ncredir3002.esams.wmnet with OS bullseye completed: - ncredir3002 (**PASS**) - Downtimed on Ic...
[21:55:27] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[21:56:34] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir2001.codfw.wmnet with OS bullseye
[21:59:25] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by sukhe@cumin2002 for host durum6002.drmrs.wmnet with OS bullseye executed with errors: - durum6002 (**FAIL**) - Downtime...
[22:13:21] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by brett@cumin2002 for host ncredir4002.ulsfo.wmnet with OS bullseye completed: - ncredir4002 (**PASS**) - Downtimed on Ic...
[22:15:09] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[22:15:16] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir1001.eqiad.wmnet with OS bullseye
[22:21:59] netops, Infrastructure-Foundations, SRE, ops-codfw: Run 2x1G links from asw-b1-codfw to cloudsw1-b1-codfw - https://phabricator.wikimedia.org/T331470 (Papaul) Open→Resolved a:Papaul @Jhancock.wm thank you we can resolve this task
[22:22:06] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw1-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (Papaul)
[22:23:30] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by brett@cumin2002 for host ncredir2001.codfw.wmnet with OS bullseye completed: - ncredir2001 (**PASS**) - Downtimed on Ic...
[22:26:11] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[22:26:46] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir2002.codfw.wmnet with OS bullseye
[22:44:33] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by brett@cumin2002 for host ncredir1001.eqiad.wmnet with OS bullseye completed: - ncredir1001 (**PASS**) - Downtimed on Ic...
[22:54:45] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage started by brett@cumin2002 for host ncredir2002.codfw.wmnet with OS bullseye completed: - ncredir2002 (**PASS**) - Downtimed on Ic...
[23:31:33] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[23:32:16] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (BCornwall)
[23:32:56] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bullseye - https://phabricator.wikimedia.org/T321309 (ops-monitoring-bot) Cookbook cookbooks.sre.ganeti.reimage was started by brett@cumin2002 for host ncredir1002.eqiad.wmnet with OS bullseye