[06:59:38] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: Link down between cr3-ulsfo and cr4-ulsfo - https://phabricator.wikimedia.org/T390731#10798932 (ayounsi) Resolved→Open Unfortunately we're not out of the woods yet... `cr3-ulsfo> show interfaces et-0/0/0 media` still shows lo...
[07:51:54] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: Link down between cr3-ulsfo and cr4-ulsfo - https://phabricator.wikimedia.org/T390731#10799037 (ayounsi) Open→Resolved After chatting with Cathal, we decided to leave it as is, as moving ports requires intrusive changes (P...
[08:03:39] netops, DC-Ops, Infrastructure-Foundations, ops-codfw: codfw: setup MPC10E-10C and SCBE3 - https://phabricator.wikimedia.org/T393552 (ayounsi) NEW
[08:03:49] netops, DC-Ops, Infrastructure-Foundations, ops-codfw: codfw: setup MPC10E-10C and SCBE3 - https://phabricator.wikimedia.org/T393552#10799104 (ayounsi)
[08:04:34] netops, Infrastructure-Foundations, SRE: Upgrade core routers to Junos 23.4R2 - https://phabricator.wikimedia.org/T364092#10799106 (ayounsi)
[08:05:20] netops, Infrastructure-Foundations, SRE: Upgrade core routers to Junos 23.4R2 - https://phabricator.wikimedia.org/T364092#10799110 (ayounsi)
[08:21:01] Traffic, Content-Transform-Team, RESTBase, RESTBase Sunsetting, serviceops: Block external traffic to RESTBase /page/data-parsoid endpoint and investigate internal usage - https://phabricator.wikimedia.org/T393557 (MSantos) NEW
[09:44:37] Traffic, conftool: FY 24/25 WE 4.3.11 Define a policy for maintenance of requestctl rules - https://phabricator.wikimedia.org/T393381#10799453 (Joe) p:Triage→High I've decided to implement a command in requestctl to enforce the above rules. It's part of a larger MR that will cause a schema change...
[09:53:30] o/ I'd like to roll out a change to expose more wikis to PCS/mobileapps without restbase in the chain. This doesn't need a slow rollout or anything as we have a fairly good idea of what's going to happen, but I just wanted to give a heads-up https://gerrit.wikimedia.org/r/c/operations/puppet/+/1138692
[12:09:09] FIRING: [8x] LVSHighCPU: The host lvs5005:9100 has at least its CPU 0 saturated - https://bit.ly/wmf-lvscpu - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs5005 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighCPU
[12:13:05] ack hnowlan
[12:14:09] RESOLVED: [8x] LVSHighCPU: The host lvs5005:9100 has at least its CPU 0 saturated - https://bit.ly/wmf-lvscpu - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs5005 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighCPU
[14:23:31] Traffic: Improving the time it takes to run authdns-update - https://phabricator.wikimedia.org/T393602 (ssingh) NEW
[14:23:35] Traffic: Improving the time it takes to run authdns-update - https://phabricator.wikimedia.org/T393602#10800591 (ssingh) p:Triage→Medium
[14:24:38] Traffic, SRE, Patch-For-Review: Create provisioning and post-provisioning checks for Traffic hosts to confirm validity of varying hardware configurations - https://phabricator.wikimedia.org/T378724#10800593 (CDobbins) a:CDobbins→ssingh
[15:11:51] Traffic: Improving the time it takes to run authdns-update - https://phabricator.wikimedia.org/T393602#10800833 (ssingh) So we have trimmed it down even with a simple `git maintenance run`: ` real 1m2.831s user 0m0.703s sys 0m0.263s ` This is definitely some progress but we should continue looking.
[19:00:07] Traffic, DC-Ops, ops-esams, SRE, Patch-For-Review: lvs3009 NIC HW issue (Broadcom, eno12399np0) - https://phabricator.wikimedia.org/T393616#10802011 (RobH) a:RobH I'll open a case with Dell, which will inevitably require the firmware on the NIC, mainboard, and idrac be updated before they...
[19:13:09] Traffic, DC-Ops, ops-esams, SRE, Patch-For-Review: lvs3009 NIC HW issue (Broadcom, eno12399np0) - https://phabricator.wikimedia.org/T393616#10802052 (RobH) I can see it seems to have randomly fired a few times: ` Mon Mar 17 2025 13:32:01 A fatal error was detected on a component at bus 4 de...
[19:21:19] Traffic, DC-Ops, ops-esams, SRE, Patch-For-Review: lvs3009 NIC HW issue (Broadcom, eno12399np0) - https://phabricator.wikimedia.org/T393616#10802084 (RobH) Support request confirmed as 'after hours english support' so I had to fill out my contact details a second time and request the upload u...
[20:24:02] Traffic, DC-Ops, ops-esams, SRE, Patch-For-Review: lvs3009 NIC HW issue (Broadcom, eno12399np0) - https://phabricator.wikimedia.org/T393616#10802334 (ssingh) >>! In T393616#10802011, @RobH wrote: > I'll open a case with Dell, which will inevitably require the firmware on the NIC, mainboard, a...
[20:49:56] Traffic, DC-Ops, ops-esams, SRE: lvs3009 NIC HW issue (Broadcom, eno12399np0) - https://phabricator.wikimedia.org/T393616#10802443 (ssingh) The host has been depooled so you can reboot or shut it down without checking with us. Thanks for the quick response Rob!
[23:11:57] Traffic: Update libvmod-netmapper to 1.10 - https://phabricator.wikimedia.org/T392533#10802814 (BCornwall)