[06:27:55] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (ayounsi) Indeed, looks about right :) For Puppet, if we can change the Hiera merge strategy to `hash` for `profile::bird::adve...
[07:12:20] netops, Cloud-VPS, Infrastructure-Foundations, SRE, and 3 others: Upgrade cloudsw1-c8-eqiad and cloudsw1-d5-eqiad to Junos 20+ - https://phabricator.wikimedia.org/T316544 (dcaro)
[07:27:31] netops, Infrastructure-Foundations, SRE: Remove static routes for anycast prefixes - https://phabricator.wikimedia.org/T347494 (ayounsi) Open→Resolved All done.
[07:36:03] Traffic, SRE, Patch-For-Review: Simplify maintenance of DNS/NTP hosts to reduce toil around reboots, reimages, and other work - https://phabricator.wikimedia.org/T347054 (ayounsi) I was wondering what to do for all the appliances that have ntp.site.wikimedia.org configured. To me the best here is to...
[09:12:31] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (jbond) Proposal looks good to me, minor nit would be to rename `ACAST_PS_ADVERTISE` to remove references to anycast to avoid con...
[09:59:23] Traffic, MW-on-K8s, SRE, serviceops, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536 (Lucas_Werkmeister_WMDE)
[10:15:30] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (cmooney) > Otherwise, it should be fairly straightforward: we add the VIP the same way we do for the anycast IPs, making sure to...
[10:40:10] netops, Infrastructure-Foundations, SRE: Firewall filter blocking traceroute in underlay QFX5120 EVPN - https://phabricator.wikimedia.org/T348120 (cmooney) p:Triage→Low
[10:51:29] Traffic, MW-on-K8s, SRE, serviceops, Release-Engineering-Team (Seen): Move 25% of mediawiki external requests to mw on k8s - https://phabricator.wikimedia.org/T348122 (Clement_Goubert)
[10:52:50] Traffic, MW-on-K8s, SRE, serviceops, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536 (Clement_Goubert)
[11:00:46] netops, Infrastructure-Foundations, SRE, ops-codfw: Server moves in codfw to support switch numbering scheme - https://phabricator.wikimedia.org/T348125 (cmooney) p:Triage→Medium
[11:08:27] netops, Infrastructure-Foundations, SRE, ops-codfw: Codfw row A-B server moves - port-block constraint / numbering - https://phabricator.wikimedia.org/T348125 (cmooney)
[11:26:00] netops, Infrastructure-Foundations, SRE: Create automation to move servers in Netbox from old to new switch - https://phabricator.wikimedia.org/T348129 (cmooney) p:Triage→Medium
[11:27:01] netops, Infrastructure-Foundations, SRE, ops-codfw: Codfw row A-B migration - non-standard device moves - https://phabricator.wikimedia.org/T348128 (cmooney)
[11:38:09] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (ayounsi) `ACAST_PS_ADVERTISE` is hardcoded in [[ https://github.com/unixsurfer/anycast_healthchecker | anycast_healthchecker ]]...
[12:11:26] netops, Infrastructure-Foundations, SRE: Firewall filter blocking traceroute in underlay QFX5120 EVPN - https://phabricator.wikimedia.org/T348120 (ayounsi) Nice rabbit hole!
I found this: https://www.reddit.com/r/Juniper/comments/g12qxh/the_right_way_to_allow_traceroute_in_re_filter/ So it's possible...
[12:47:43] netops, Infrastructure-Foundations, SRE: cr2-esams:FPC0 Parity error - https://phabricator.wikimedia.org/T318783 (cmooney) Open→Resolved I am going to close this task, the FPC issue was addressed through card replacement (although we decom'd router in the meantime). Despite my best efforts i...
[12:57:58] HTTPS, SRE, Traffic-Icebox, Upstream: Support ECH on Wikimedia servers - https://phabricator.wikimedia.org/T205378 (Aklapper) @DennisJJackson Hi and welcome to Phabricator! What //in this ticket// led you to asking for "retriage" (and what does that mean)?
[13:08:35] netops, Cloud-VPS, Infrastructure-Foundations, SRE: Change cloud-instance-transport vlan subnets from /30 to /29 - https://phabricator.wikimedia.org/T348140 (cmooney) p:Triage→Low
[13:08:59] netops, Cloud-VPS, Infrastructure-Foundations, SRE: Change cloud-instance-transport vlan subnets from /30 to /29 - https://phabricator.wikimedia.org/T348140 (cmooney)
[13:23:44] HTTPS, SRE, Traffic-Icebox, Upstream: Support ECH on Wikimedia servers - https://phabricator.wikimedia.org/T205378 (DennisJJackson) @Aklapper - It looks like this issue was originally raised several years ago and put in the icebox. I'm flagging that the situation around standardization and deploy...
[13:28:23] Traffic, SRE, Patch-For-Review: Repackage purged for bullseye and bookworm - https://phabricator.wikimedia.org/T347837 (Fabfur)
[13:34:12] netops, Cloud-VPS, Infrastructure-Foundations, SRE, Patch-For-Review: Change cloud-instance-transport vlan subnets from /30 to /29 - https://phabricator.wikimedia.org/T348140 (cmooney)
[13:38:06] HTTPS, SRE, Traffic-Icebox, Upstream: Support ECH on Wikimedia servers - https://phabricator.wikimedia.org/T205378 (ssingh) Hi @DennisJJackson: Thanks for the question.
We do plan to work on ECH and enable it for our sites and have had some discussions internally. There is no timeline yet as such...
[13:40:50] netops, Infrastructure-Foundations, SRE: Create automation to move servers in Netbox from old to new switch - https://phabricator.wikimedia.org/T348129 (Papaul) @cmooney this should be a complication if we did have a mixed of 1G and 10G servers within the same rack which is not the case. In all exist...
[13:43:10] Traffic, SRE, Patch-For-Review: Simplify maintenance of DNS/NTP hosts to reduce toil around reboots, reimages, and other work - https://phabricator.wikimedia.org/T347054 (ssingh) >>! In T347054#9223568, @ayounsi wrote: > I was wondering what to do for all the appliances that have ntp.site.wikimedia.o...
[14:00:41] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (ssingh) Thanks everyone for the discussion and feedback above! So it seems like two main points have come up above: 1. We can c...
[14:01:03] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (jbond) >ACAST_PS_ADVERTISE is hardcoded in anycast_healthchecker (the tool we use to monitor services). in that case agree its t...
[14:03:14] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (ssingh) >>! In T348041#9224990, @jbond wrote: >>ACAST_PS_ADVERTISE is hardcoded in anycast_healthchecker (the tool we use to mon...
[14:25:55] Traffic, SRE, Patch-For-Review: Deploy new purged version with UDS feature - https://phabricator.wikimedia.org/T347837 (Fabfur)
[14:36:47] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (ayounsi) Ah right! My bad. Unrelated and maybe a scope creep, but we could also start by advertising a unicast v6 IP to validat...
[14:59:21] netops, Infrastructure-Foundations, SRE, ops-codfw: Codfw row A-B server moves - port-block constraint / numbering - https://phabricator.wikimedia.org/T348125 (cmooney) Open→Resolved @papaul answered in T348129#9224878, seems like we're in a good place given previous rack assignment as '1...
[14:59:29] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)
[15:01:43] netops, Infrastructure-Foundations, SRE: Create automation to move servers in Netbox from old to new switch - https://phabricator.wikimedia.org/T348129 (cmooney) >>! In T348129#9224878, @Papaul wrote: > @cmooney this should be a complication if we did have a mixed of 1G and 10G servers within the sam...
[15:12:48] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (cmooney) >>! In T348041#9222035, @ssingh wrote: > We can and probably should have a backup static routes for each of `ns[01]` bu...
[15:23:15] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (cmooney) >>! In T348041#9222035, @ssingh wrote: > We can and probably should have a backup static routes for each of `ns[01]` bu...
[15:25:11] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (ayounsi) Oops, I missed some of the comments. * I'm in favor of ditching the statics * Changing the Hiera merge strategy seems...
[15:38:11] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate atlas-codfw from asw-a1-codfw to lsw1-a1-codfw - https://phabricator.wikimedia.org/T348159 (cmooney) p:Triage→Medium
[15:38:31] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate atlas-codfw from asw-a1-codfw to lsw1-a1-codfw - https://phabricator.wikimedia.org/T348159 (cmooney)
[15:38:43] netops, Infrastructure-Foundations, SRE, ops-codfw: Codfw row A-B migration - non-standard device moves - https://phabricator.wikimedia.org/T348128 (cmooney)
[15:39:39] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (ssingh) >>! In T348041#9225321, @cmooney wrote: >>>! In T348041#9222035, @ssingh wrote: >> We can and probably should have a bac...
[15:41:20] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (ssingh) >>! In T348041#9225405, @ayounsi wrote: > Oops, I missed some of the comments. > > * I'm in favor of ditching the stati...
[15:45:51] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (jbond) >>! In T348041#9225478, @ssingh wrote: >>>! In T348041#9225405, @ayounsi wrote: >> * Changing the Hiera merge strategy s...
[15:46:56] Traffic, netops, Infrastructure-Foundations, SRE: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (ssingh) For posterity: - no static routes - merge strategy Arzhel mentioned above - I am going to rename `ACAST_PS_ADVERTISE`...
[15:59:22] netops, Infrastructure-Foundations, SRE: Create automation to move servers in Netbox from old to new switch - https://phabricator.wikimedia.org/T348129 (Papaul) I am thinking about something to consider when going servers refresh or new servers
[16:09:43] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate mr1-codfw from asw-a1-codfw to lsw1-a1-codfw - https://phabricator.wikimedia.org/T348164 (cmooney) p:Triage→Medium
[16:11:13] netops, Infrastructure-Foundations, SRE, ops-codfw: Codfw row A-B migration - non-standard device moves - https://phabricator.wikimedia.org/T348128 (cmooney)
[16:11:22] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate mr1-codfw from asw-a1-codfw to lsw1-a1-codfw - https://phabricator.wikimedia.org/T348164 (cmooney)
[16:22:55] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate atlas-codfw from asw-a1-codfw to lsw1-a1-codfw - https://phabricator.wikimedia.org/T348159 (ayounsi) Yeah, that's perfect. We can revisit the day it dies and needs to be migrated to a VM.
[16:37:00] netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate atlas-codfw from asw-a1-codfw to lsw1-a1-codfw - https://phabricator.wikimedia.org/T348159 (cmooney)
[17:00:43] Traffic, SRE, Patch-For-Review: Deploy new purged version with UDS feature - https://phabricator.wikimedia.org/T347837 (Fabfur)
[17:01:07] Traffic, SRE, Patch-For-Review: Deploy new purged version with UDS feature - https://phabricator.wikimedia.org/T347837 (Fabfur) Open→Resolved
[17:24:39] Traffic, SRE: Rename ACAST_PS_ADVERTISE in bird and anycast-healthchecker to BIRD_IP_ADVERTISE - https://phabricator.wikimedia.org/T348174 (ssingh)
[17:24:51] Traffic, SRE: Rename ACAST_PS_ADVERTISE in bird and anycast-healthchecker to BIRD_IP_ADVERTISE - https://phabricator.wikimedia.org/T348174 (ssingh)
[17:25:00] Traffic, netops, Infrastructure-Foundations, SRE, Patch-For-Review: Remove static routes for ns[01] and replace their announcements with bird - https://phabricator.wikimedia.org/T348041 (ssingh)
[18:01:00] Traffic, netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate lvs2011 and lvs2012 to new top-of-rack switches - https://phabricator.wikimedia.org/T348178 (cmooney) p:Triage→Medium
[18:01:49] Traffic, netops, Infrastructure-Foundations, SRE, ops-codfw: Migrate lvs2011 and lvs2012 to new top-of-rack switches - https://phabricator.wikimedia.org/T348178 (cmooney)
[18:01:57] netops, Infrastructure-Foundations, SRE, ops-codfw: Codfw row A-B migration - non-standard device moves - https://phabricator.wikimedia.org/T348128 (cmooney)
[19:07:02] Traffic, Data Products, SRE: Data Quality - requestctl not getting set - https://phabricator.wikimedia.org/T342577 (VirginiaPoundstone) @Milimetric What is the status on this task?