[07:59:57] netops, Infrastructure-Foundations, SRE: eqiad/codfw virtual-chassis upgrades - https://phabricator.wikimedia.org/T327248 (ayounsi)
[08:00:05] netops, Infrastructure-Foundations, SRE: Upgrade network devices to Junos 20+ - https://phabricator.wikimedia.org/T316539 (ayounsi)
[08:04:28] netops, Infrastructure-Foundations: Use mgmt_junos on all network devices - https://phabricator.wikimedia.org/T327862 (ayounsi) p:Triage→Low
[08:04:43] netops, Infrastructure-Foundations: Use mgmt_junos on all network devices - https://phabricator.wikimedia.org/T327862 (ayounsi)
[08:04:49] netops, Infrastructure-Foundations, SRE: Upgrade network devices to Junos 20+ - https://phabricator.wikimedia.org/T316539 (ayounsi)
[08:34:47] netops, Infrastructure-Foundations, SRE: eqiad/codfw virtual-chassis upgrades - https://phabricator.wikimedia.org/T327248 (ayounsi)
[08:51:22] netops, Infrastructure-Foundations, SRE: Use mgmt_junos on all network devices - https://phabricator.wikimedia.org/T327862 (Peachey88)
[11:20:58] currently the install servers are using public IP addresses. I think this is mostly due to the history of this role (back in the past it also hosted partly the apt repo, but these days it's limited to TFTP,DHCP/Squid/serving boot images via Nginx)
[11:21:24] is anyone aware of some blocker or need for a public IP, otherwise I'd use the bullseye migration to move these to private IPs
[11:24:13] moritzm: their public IPs are currently used in some config for allowing the cloudcumin hosts to connect to cloud VMs
[11:24:19] how would squid work with a private IP?
[11:27:11] ah, ok. I didn't know about the cloudcumin setup, then we'll stick with public IPs
[11:32:22] moritzm: to be clear, that's not a blocker, just that if squid changes the IP seen by cloud I need to update some hiera
[11:32:26] nothing complex
[11:32:38] the question is how would squid work with private IPs?
[11:33:04] but if only squid requires public IP we could move it to a different host
[11:33:17] couple of VMs with public IPs would do it I guess
[11:34:40] yeah squid would need to be moved to something else with public IPs in that case
[11:34:57] and I guess if the rationale for moving install hosts to private is saving IPs then that doesn't change things
[11:36:47] Squid allows hierarchy of proxies, so the edge proxies could pass via one central one, but this involves complexity and doesn't seem appealing either
[11:37:49] I'll simply continue to use a public IP for these
[11:38:37] yeah, probably not worth the added complexity, if there's no major drawback to leaving them on the publics
[11:40:18] yeah, we could split squid to a dedicated VM for better separation of duties
[11:40:31] but squid is like NAT in that case, it needs to have a public IP
[11:41:32] for the list of servers that might be able to move to private IPs: https://phabricator.wikimedia.org/T317177 (see sub tasks)
[11:41:39] APT is indeed one of them
[12:33:39] jbond: what do you think of not using "profile::contact" for individuals?
[12:34:15] looking at puppet it's pretty much only me who uses it :)
[12:34:42] https://www.irccloud.com/pastebin/hUbnN4fG/
[12:35:46] XioNoX: yes im fine with dropping that
[13:30:41] jbond, volans: https://gerrit.wikimedia.org/r/c/operations/puppet/+/883565
[13:55:00] when trying to create install4002.wikimedia.org, I'm getting a "Pynetbox.core.query.AllocationError: The requested allocation could not be fulfilled"
[13:58:36] public1-ulsfo has no free addresses
[14:00:50] indeed, either we can use the one reserved for infra
[14:00:55] or decom bast4003 first
[14:01:18] my preference goes to option #2 :)
[14:01:25] https://netbox.wikimedia.org/ipam/prefixes/13/ip-addresses/
[14:02:43] ok, fair enough. I'd been planning to drop the old bastions in a few days anyway
[14:03:32] I just didn't expect we were _that_ short of IP addresses :-)
[14:03:52] yeah it got built quite small
[14:03:59] and it's a pain to resize
[14:04:13] next iteration of network design will solve that
[14:04:14] like in drmrs
[14:04:16] and I see why you created https://phabricator.wikimedia.org/T317177 :-)
[14:07:56] hahah
[14:08:03] it's for the DCs, so less of an issue
[14:08:13] still good hygiene
[14:59:33] netops, Cloud-Services, Infrastructure-Foundations, SRE: Configure cloudsw-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (cmooney) Open→In progress p:Triage→Medium
[15:38:28] netops, Cloud-VPS, Infrastructure-Foundations, SRE, cloud-services-team: Upgrade cloudsw1-c8-eqiad and cloudsw1-d5-eqiad to Junos 20+ - https://phabricator.wikimedia.org/T316544 (cmooney) Folks I was considering doing these upgrades on the following dates: cloudsw1-c8-eqiad - Monday February...
[15:41:13] netops, Infrastructure-Foundations: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (ayounsi) p:Triage→High
[15:41:27] netops, Infrastructure-Foundations: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (ayounsi)
[15:41:33] netops, Infrastructure-Foundations, SRE: eqiad/codfw virtual-chassis upgrades - https://phabricator.wikimedia.org/T327248 (ayounsi)
[15:45:54] netops, Cloud-Services, Infrastructure-Foundations, SRE: Configure cloudsw-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (aborrero) The plan outlined in the task description LGTM.
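Editor's sketch for the 13:55 to 14:01 exchange above: a minimal illustration, not taken from the log, of why an allocation request fails once every host address in a small prefix is taken, and why decommissioning one host frees a slot. It uses only Python's standard `ipaddress` module and does not call Netbox or pynetbox. The prefix `198.51.100.0/29` (a documentation range), the function names `free_hosts` and `allocate`, and the `RuntimeError` are all assumptions made for the example.

```python
import ipaddress


def free_hosts(prefix, used):
    """Return the host addresses in `prefix` that are not in `used`."""
    network = ipaddress.ip_network(prefix)
    taken = {ipaddress.ip_address(addr) for addr in used}
    return [host for host in network.hosts() if host not in taken]


def allocate(prefix, used):
    """Return the first free host address, or raise if the prefix is full.

    Hypothetical stand-in for an IPAM allocation request; the real error
    quoted in the log comes from pynetbox, which is not used here.
    """
    free = free_hosts(prefix, used)
    if not free:
        raise RuntimeError(f"no free addresses in {prefix}")
    return free[0]


if __name__ == "__main__":
    prefix = "198.51.100.0/29"  # hypothetical small public prefix
    used = [f"198.51.100.{i}" for i in range(1, 7)]  # every host taken
    try:
        allocate(prefix, used)
    except RuntimeError as err:
        print(err)
    # Decommissioning one host frees its address for the new one.
    print(allocate(prefix, used[:-1]))
```

The same check scales to any prefix size; the point is only that a prefix "built quite small" leaves no headroom until an existing assignment is released.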
[15:46:13] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (aborrero)
[15:46:53] netops, Data-Engineering, Data-Persistence, Discovery-Search, and 9 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (ayounsi)
[15:53:07] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (cmooney)
[15:55:59] netops, DBA, Data-Engineering, Data-Persistence, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (Marostegui) I'll check our db-related hosts and I'll get back to you tomorrow
[16:02:50] netops, Cloud-VPS, Infrastructure-Foundations, SRE, cloud-services-team: Upgrade cloudsw1-c8-eqiad and cloudsw1-d5-eqiad to Junos 20+ - https://phabricator.wikimedia.org/T316544 (fnegri) I think those dates are fine, cc @dcaro -- let's discuss the best way to reduce impact on Ceph (downtime,...
[16:03:00] netops, Infrastructure-Foundations, SRE, fundraising-tech-ops: Upgrade fasw to Junos 21 - https://phabricator.wikimedia.org/T316542 (Papaul)
[16:03:08] netops, Infrastructure-Foundations, SRE: Upgrade network devices to Junos 20+ - https://phabricator.wikimedia.org/T316539 (Papaul)
[16:03:42] netops, Infrastructure-Foundations, SRE, fundraising-tech-ops: Upgrade fasw to Junos 21 - https://phabricator.wikimedia.org/T316542 (Papaul) Open→Resolved This is complete.
[16:06:50] netops, Infrastructure-Foundations, SRE, fundraising-tech-ops: Set consistent MTUs - https://phabricator.wikimedia.org/T315838 (ayounsi) Open→Resolved All done!
[16:32:52] netops, Infrastructure-Foundations, SRE: Is Vlan 2122 cloud-support1-b-codfw required? - https://phabricator.wikimedia.org/T327930 (cmooney) p:Triage→Low
[17:07:18] netops, Infrastructure-Foundations, SRE: Is Vlan 2122 cloud-support1-b-codfw required? - https://phabricator.wikimedia.org/T327930 (aborrero) I guess this was set up to mirror the eqiad setting. Since this VLAN has no room in the new network model (described [[ https://wikitech.wikimedia.org/wiki/Wiki...
[17:07:58] netops, Infrastructure-Foundations, SRE: Is Vlan 2122 cloud-support1-b-codfw required? - https://phabricator.wikimedia.org/T327930 (aborrero)
[17:08:22] netops, Infrastructure-Foundations, SRE: Automate EVPN switch underlay BGP neighbor peerings - https://phabricator.wikimedia.org/T327934 (cmooney) p:Triage→Medium
[17:09:29] netops, Cloud-VPS, Infrastructure-Foundations, SRE, cloud-services-team: Upgrade cloudsw1-c8-eqiad and cloudsw1-d5-eqiad to Junos 20+ - https://phabricator.wikimedia.org/T316544 (aborrero) >>! In T316544#8557796, @cmooney wrote: > Folks I was considering doing these upgrades on the following...
[17:10:37] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (aborrero)
[17:55:07] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney) p:Triage→Medium
[17:56:05] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)
[17:58:40] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)
[18:00:26] netops, Infrastructure-Foundations, SRE: Is Vlan 2122 cloud-support1-b-codfw required? - https://phabricator.wikimedia.org/T327930 (cmooney) Thanks for the feedback @aborrero. I'll plan on getting it decommissioned.
[18:07:06] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)
[18:10:00] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)
[18:42:15] netops, DBA, Data-Engineering, Data-Persistence, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (RKemper)
[18:45:18] netops, DBA, Data-Engineering, Data-Persistence, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (RKemper)
[18:50:41] netops, DBA, Data-Engineering, Data-Persistence, and 10 others: codfw row A switches upgrade - https://phabricator.wikimedia.org/T327925 (LSobanski)
[18:53:02] \o I'm taking a look at Search team's hosts wrt the codfw row A switches: https://phabricator.wikimedia.org/T327925
[18:54:16] For the depool / repool actions needed, are those intended to be the actions taken by those handling the upgrade? i.e. for our hosts we'll want to handle getting the hosts depooled / banned from clusters in the days prior to the actual switch upgrade, so I left those table entries as `None`
[18:54:57] does that make sense?
[19:02:39] netops, Cloud-VPS, Infrastructure-Foundations, SRE, cloud-services-team: Upgrade cloudsw1-c8-eqiad and cloudsw1-d5-eqiad to Junos 20+ - https://phabricator.wikimedia.org/T316544 (cmooney) >>! In T316544#8558224, @aborrero wrote: >>>! In T316544#8557796, @cmooney wrote: >> Folks I was consider...
[19:02:50] netops, Cloud-VPS, Infrastructure-Foundations, SRE, cloud-services-team: Upgrade cloudsw1-c8-eqiad and cloudsw1-d5-eqiad to Junos 20+ - https://phabricator.wikimedia.org/T316544 (cmooney)
[19:04:04] ryankemper: hey, I reviewed this earlier with Arzhel but didn't spot this potential confusion
[19:04:48] I think basically we are happy to take whatever depool actions are needed, if the action is straightforward to perform and the relevant team are happy for us
[19:05:15] But we understand that for some things teams will want to do it themselves.
[19:05:32] I think we wanted to have the name of the person who would do it there - but it seems a little unclear to me
[19:05:52] I'll touch base with Arzhel tomorrow and we'll decide what's best and re-word the instructions
[19:06:13] ack, sounds good! I'll circle back when that's been ironed out
[19:06:23] yeah - could be what you have is perfect
[19:06:41] ryankemper: I'll update you either way
[22:20:30] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (Peachey88)
[22:26:33] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)