[07:59:34] netops, fundraising-tech-ops, Infrastructure-Foundations, netbox: Manage FR-tech 1:1 NAT rules using Netbox - https://phabricator.wikimedia.org/T429756 (ayounsi) NEW p:Triage→Low
[08:29:53] Traffic, Liberica, Prod-Kubernetes, Kubernetes, ServiceOps new (Next quarter): Add missing wikikube workers to conftool-data - https://phabricator.wikimedia.org/T420729#12039496 (MLechvien-WMF) p:Triage→Medium
[11:01:54] Traffic, Lift-Wing, ServiceOps new, ServiceOps-SharedInfra, Machine-Learning-Team (Q4 FY2025-26): Host Qwen 3.6-27B as an inference service - https://phabricator.wikimedia.org/T425680#12040084 (gkyziridis) === Update === After the vllm image is updated and tested in the policy-violation model...
[11:08:32] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-06-05 - 2026-06-26), Kubernetes: Calico IPv4/IPv6 block exhaustion on dse-k8s cluster, blocking new node provisioning - https://phabricator.wikimedia.org/T429773 (BTullis) NEW
[11:14:08] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-06-05 - 2026-06-26), Kubernetes: Calico IPv4/IPv6 block exhaustion on dse-k8s cluster, blocking new node provisioning - https://phabricator.wikimedia.org/T429773#12040123 (BTullis) p:Triage→High
[11:21:33] Traffic, Bot detection and mitigation (WE4.10 hCaptcha), Documentation, Product Safety and Integrity (Sprint Rose (Jun 15 - Jun 26)): hcaptcha proxy: update wikitech page - https://phabricator.wikimedia.org/T411131#12040144 (kostajh)
[11:25:17] Traffic, Documentation: hcaptcha proxy: update wikitech page - https://phabricator.wikimedia.org/T411131#12040156 (kostajh) PSI side is done, @ssingh please have a look from the SRE point of view.
[11:26:20] Traffic, hCaptcha, Product Safety and Integrity: Draft hCaptcha SLOs, document SLIs - https://phabricator.wikimedia.org/T411256#12040165 (kostajh)
[11:28:30] netops, Cloud-VPS, Infrastructure-Foundations, tools-infrastructure-team, cloud-services-team (FY2025/2026-Q3-Q4): Establish a blackbox network probe vantage point into cloud realm - https://phabricator.wikimedia.org/T429451#12040218 (fgiunchedi) Ack, thank you both! I've added a meeting note...
[13:13:56] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-06-05 - 2026-06-26), Kubernetes: Calico IPv4/IPv6 block exhaustion on dse-k8s cluster, blocking new node provisioning - https://phabricator.wikimedia.org/T429773#12040643 (JMeybohm) /link {T375845}
[13:24:04] netops, Infrastructure-Foundations, SRE: cr2-esams rpd failure after enabling bgp 'graceful-shutdown' (June 2026) - https://phabricator.wikimedia.org/T429386#12040676 (Papaul) @cmooney thank you for the update
[13:28:19] netops, Traffic, Cloud-VPS, Data-Platform-SRE, and 3 others: Plan to make clouddumps more resilient and easier to operate - https://phabricator.wikimedia.org/T411248#12040707 (fgiunchedi) @brouberol @BTullis please let me know what you think of the load balancer plan and implementation in the des...
[13:32:49] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-06-05 - 2026-06-26), Kubernetes: Calico IPv4/IPv6 block exhaustion on dse-k8s cluster, blocking new node provisioning - https://phabricator.wikimedia.org/T429773#12040751 (BTullis)
[14:11:45] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-06-05 - 2026-06-26), Kubernetes: Calico IPv4/IPv6 block exhaustion on dse-k8s cluster, blocking new node provisioning - https://phabricator.wikimedia.org/T429773#12040868 (BTullis) Thanks @JMeybohm - So I see now from a bit more digging...
[14:14:30] hello traffic team, i need to revert all these 5 changes, is it safe for me to do so or would someone prefer to chime in and do it instead?
[14:14:30] https://gerrit.wikimedia.org/r/c/operations/puppet/+/1283745/3
[14:14:30] the reason is: we don't need this anymore as we managed to make things work without exposing an additional port
[14:29:55] netops, Cloud-VPS, Infrastructure-Foundations, tools-infrastructure-team, cloud-services-team (FY2025/2026-Q3-Q4): Establish a blackbox network probe vantage point into cloud realm - https://phabricator.wikimedia.org/T429451#12040954 (LSobanski) p:Triage→Medium
[14:37:05] Traffic, collaboration-services, GitLab, Release-Engineering-Team (Radar): gitlab behind CDN: serve gitlab.wm.o via text-lb instead of dedicated IPs? - https://phabricator.wikimedia.org/T428903#12041004 (ABran-WMF) In progress→Resolved a:ABran-WMF This is essentially resolved, the...
[15:30:58] Traffic, Data-Engineering: Tune refine webrequest data loss threshold to avoid noisy irrelevant alerts. - https://phabricator.wikimedia.org/T429809 (Ottomata) NEW
[15:42:26] netops, Data-Persistence, DC-Ops, Infrastructure-Foundations, and 2 others: codfw: rack A6 maintenance - https://phabricator.wikimedia.org/T429812 (ayounsi) NEW p:Triage→High
[15:42:48] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw: pod AB switches upgrade (2026) - https://phabricator.wikimedia.org/T426197#12041420 (ayounsi)
[15:52:34] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-06-05 - 2026-06-26), Kubernetes: Calico IPv4/IPv6 block exhaustion on dse-k8s cluster, blocking new node provisioning - https://phabricator.wikimedia.org/T429773#12041474 (BTullis) a:BTullis
[15:54:18] netops, Data-Persistence, DC-Ops, Infrastructure-Foundations, and 2 others: codfw: rack A6 maintenance - https://phabricator.wikimedia.org/T429812#12041495 (jcrespo) ms-backup2003 -> I will take care of stopping vital network services before the window.
[16:15:23] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw: pod AB switches upgrade (2026) - https://phabricator.wikimedia.org/T426197#12041649 (ayounsi)
[16:25:25] netops, Data-Persistence, DC-Ops, Infrastructure-Foundations, and 2 others: codfw: rack A6 maintenance - https://phabricator.wikimedia.org/T429812#12041701 (Eevans) No action is needed for the aqs nodes themselves. I //think// that the nightly ETL jobs that load data happen at ~2am UTC, which s...
[17:13:57] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-06-05 - 2026-06-26), Kubernetes: Calico IPv4/IPv6 block exhaustion on dse-k8s cluster, blocking new node provisioning - https://phabricator.wikimedia.org/T429773#12041865 (bking) Forgive the drive-by comment, but what do y'all think abou...
[18:22:17] Traffic, DNS, SRE: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042062 (Asaf) In progress→Open Thank you for looking into this -- we now have the updated request from our vendor. It is: Please update the DNS configuration for the new ALB and remove the old ALB...
[18:43:30] Traffic, DNS, SRE: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042124 (Asaf)
[18:43:33] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw: pod AB switches upgrade (2026) - https://phabricator.wikimedia.org/T426197#12042127 (Papaul)
[19:02:51] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042186 (BCornwall) @asaf Would you like to remove the stage ALB records as well?
[19:08:20] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042204 (BCornwall) @Asaf CNAME records are not allowed at the apex.
[19:12:19] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042216 (BCornwall) @Asaf one last question: Would you like all the associated ACM records removed? i.e. are you utilizing those certs for any other AWS services (e.g. CloudFront)?
[19:26:08] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042251 (Asaf) 1. No, please keep the stage ALB records for now. 2. Since learn.wiki was previously pointing to the old ALB, could you please update the existing apex record in the sam...
[20:20:24] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042494 (Asaf) Tagging @BCornwall , as this is time-sensitive before the AWS verification window closes and we'd have to start over.
[20:36:31] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042591 (BCornwall) @Asaf: IIRC the record names generated are deterministic so even if you were to delete/re-add them the requested records would be the same. Unfortunately, CNAMES c...
[20:42:59] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042603 (BCornwall) I also see that some IPs were hardcoded at the apex. This is not correct for AWS as the IPs for the ALBs could change at any time: DNS deployments pointing to ALBs...
[21:01:23] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042709 (Ahsan-arbisoft) @BCornwall 166.117.77.114 76.223.6.7 Please use these IPs for the apex domain, as they are the static IPs provided by AWS Global Accelerator and will not ch...
[21:11:44] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042732 (BCornwall) Are there any ACM validation records that need to be added?
[21:12:56] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042735 (Ahsan-arbisoft) NO
[21:35:21] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042833 (Ahsan-arbisoft) @BCornwall Have you also pointed `*.learn.wiki` to the CNAME `a40059d1ee67a3468.awsglobalaccelerator.com`?
[21:48:02] Traffic, DNS, SRE, Patch-For-Review: new CNAME record for WikiLearn - https://phabricator.wikimedia.org/T429628#12042856 (Ahsan-arbisoft) If `*.learn.wiki` cannot be configured due to similar apex-related limitations, could you please add the following domains individually and point them to th...
[21:59:34] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-06-05 - 2026-06-26), Kubernetes: Calico IPv4/IPv6 block exhaustion on dse-k8s cluster, blocking new node provisioning - https://phabricator.wikimedia.org/T429773#12042867 (cmooney) >>! In T429773#12041865, @bking wrote: > Forgive the dri...
[22:27:44] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-06-05 - 2026-06-26), Kubernetes: Calico IPv4/IPv6 block exhaustion on dse-k8s cluster, blocking new node provisioning - https://phabricator.wikimedia.org/T429773#12042984 (cmooney) > The IPv6 pool 2620:0:861:302::/64 is in the same state...
[22:43:57] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-06-05 - 2026-06-26), Kubernetes: Calico IPv4/IPv6 block exhaustion on dse-k8s cluster, blocking new node provisioning - https://phabricator.wikimedia.org/T429773#12043055 (BTullis) >>! In T429773#12042984, @cmooney wrote: >> The IPv6 poo...
[23:01:22] netops, Infrastructure-Foundations, Data-Platform-SRE (2026-06-05 - 2026-06-26), Kubernetes: Calico IPv4/IPv6 block exhaustion on dse-k8s cluster, blocking new node provisioning - https://phabricator.wikimedia.org/T429773#12043077 (BTullis) >>! In T429773#12042867, @cmooney wrote: > A new /20 sti...