[06:30:51] Traffic, API Platform, MediaWiki-User-login-and-signup: Login with `action=login` and bot password does not create a JWT session cookie - https://phabricator.wikimedia.org/T415007 (Joe) NEW
[06:33:04] Traffic, SRE, Patch-For-Review: Offer AuthDNS service over IPv6 - https://phabricator.wikimedia.org/T81605#11535605 (ayounsi) A maybe safer alternative is to first enable IPv6 BGP peering between the network and all the dnsbox with `profile::bird::do_ipv6: true` (and the Homer patches). BGP over v6 w...
[08:04:38] Traffic, Commons: HTTP 429 error on original image requests on Commons (iOS app by default hiding the Referrer header) - https://phabricator.wikimedia.org/T413570#11535724 (Nylki) Hi @Jonesey95 ! Do you have a link to the commit or ticket (if it's public)? I am unfortunately still experiencing the descr...
[09:33:17] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11535957 (cmooney) Yeah I was worried we'd see the same pattern as the graph in the task description...
[09:57:12] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Servers exposing incorrect LLDP info - https://phabricator.wikimedia.org/T250367#11536035 (ayounsi) With that cookbook change merged, new Dell servers (or any that we use the provision cookbook on) will have their LLDP setting changed. W...
[09:57:29] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Servers exposing incorrect LLDP info - https://phabricator.wikimedia.org/T250367#11536036 (ayounsi) Open→Stalled a:ayounsi→None
[09:57:30] netops, homer, Infrastructure-Foundations, SRE, Patch-For-Review: Homer: Netbox driven switch interfaces - https://phabricator.wikimedia.org/T250429#11536039 (ayounsi)
[10:05:14] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), and 2 others: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11536050 (MoritzMuehlenhoff) I suggest we first move to the latest 6.12 backport to rule that this isn't a...
[10:24:25] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), and 2 others: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11536148 (MoritzMuehlenhoff) >>! In T414460#11536050, @MoritzMuehlenhoff wrote: > I suggest we first move...
[10:31:05] Traffic, SRE, Patch-For-Review: Offer AuthDNS service over IPv6 - https://phabricator.wikimedia.org/T81605#11536203 (cmooney) >>! In T81605#11535605, @ayounsi wrote: > A maybe safer alternative is to first enable IPv6 BGP peering between the network and all the dnsbox with `profile::bird::do_ipv6: tr...
[11:06:47] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11536381 (BTullis) >>! In T414460#11536148, @MoritzMuehlenhoff wrote: >>>! In T414460#11536050, @Mor...
[11:15:41] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11536410 (ops-monitoring-bot) Roll-reboot of nodes in dse-eqiad cluster started by btullis: * dse-k8...
[11:36:24] Traffic, Prod-Kubernetes, ServiceOps new, Kubernetes, Patch-For-Review: Handling inbound IPIP traffic on low traffic LVS k8s based realservers - https://phabricator.wikimedia.org/T352956#11536698 (JMeybohm) Summarizing the current state and our recent discussion about this: - All calico (Pod...
[11:44:54] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11536863 (ops-monitoring-bot) Roll-reboot of nodes in dse-eqiad cluster started by btullis: * dse-k8...
[13:04:21] Traffic, MW-Interfaces-Team, ServiceOps new, ServiceOps-SharedInfra: map the /api/ prefix to /w/rest.php - https://phabricator.wikimedia.org/T364400#11537181 (MLechvien-WMF) @BPirkle do you have an update on the plans for this?
[14:16:33] Traffic, API Platform, MediaWiki-User-login-and-signup, MediaWiki-Platform-Team (Q3 Kanban Board): Login with `action=login` and bot password does not create a JWT session cookie - https://phabricator.wikimedia.org/T415007#11537524 (Tgr) FWIW we decided not to support JWT sessions for bot passwor...
[15:10:44] Traffic, Commons: HTTP 429 error on original image requests on Commons (iOS app by default hiding the Referrer header) - https://phabricator.wikimedia.org/T413570#11537701 (Jonesey95) I was not sent a link to the commit. I had written an e-mail to noc@wikimedia.org, the address shown in the error message...
[15:16:03] Traffic, SRE, Patch-For-Review: Offer AuthDNS service over IPv6 - https://phabricator.wikimedia.org/T81605#11537743 (ssingh) Yeah, unless we update our own zone files but more specifically, Markmonitor, nothing really changes so we can just go ahead with the approach @cmooney suggested and enable it...
[15:19:04] netops, Infrastructure-Foundations, SRE: InboundInterfaceErrors alerts firing for Nokia switches on v25.10.1 - https://phabricator.wikimedia.org/T412733#11537756 (VRiley-WMF)
[15:19:18] puppet disabled on A:cp for rolling out CSP change. please don't enable without checking with me, thanks :>
[15:20:38] urgh
[15:20:41] sukhe: do you have an ETA?
[15:20:50] vgutierrez: I haven't started, so I can stop
[15:20:53] change also not merged
[15:20:57] do you want to do something first?
[15:21:03] I'd like to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/1228568 first
[15:21:06] ok
[15:21:09] go ahead, let me re-enable
[15:21:23] scope is A:cp-upload_ulsfo
[15:21:30] give me a sec, re-enabling
[15:23:24] sure
[15:23:28] Traffic, Data-Engineering, Infrastructure-Foundations: Export development_network_probe data to Puppet servers for CDN deployment - https://phabricator.wikimedia.org/T402512#11537796 (BTullis) >>! In T402512#11529638, @elukey wrote: > Next steps: > > * DP to create the Airflow SRE instance. That's...
[15:25:13] ok.. puppet is enabled in A:cp-upload_ulsfo.. I'll proceed
[15:25:16] vgutierrez: all yours
[15:27:43] Traffic, Data-Persistence, MediaViewer, SRE-swift-storage, Thumbor: Updated measurement of request frequency of thumbnail sizes - https://phabricator.wikimedia.org/T415080 (MatthewVernon) NEW
[15:28:37] sukhe: thx, running puppet as we speak
[15:28:50] Traffic, Data-Persistence, MediaViewer, SRE-swift-storage, Thumbor: Updated measurement of request frequency of thumbnail sizes - https://phabricator.wikimedia.org/T415080#11537841 (MatthewVernon) Open→Resolved p:Triage→High The query used (the same as last time, modulo da...
[15:37:57] vgutierrez: let me know when you are done
[15:38:08] sukhe: I am
[15:38:17] ok, I am moving ahead
[15:41:47] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11537913 (ops-monitoring-bot) Roll-reboot of nodes in dse-eqiad cluster started by btullis: * dse-k8...
[15:49:23] Traffic, Data-Persistence, MediaViewer, SRE-swift-storage, Thumbor: Updated measurement of request frequency of thumbnail sizes - https://phabricator.wikimedia.org/T415080#11537953 (Ladsgroup) I've collected and analyzed requests to non-standard thumbnail sizes: 2/3rd are medium browser s...
[16:02:54] puppet re-enabled everywhere
[16:32:34] Traffic, DC-Ops, ops-eqsin, SRE: cp5022 is unreachable - https://phabricator.wikimedia.org/T414411#11538196 (RobH) p:Medium→High They've connected a crash cart and the host is hard down. Seems we have a bad mainboard or a bad PSU controller. I'm typing up directions for Jin@ DreamIIC fo...
[18:13:46] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11538714 (ops-monitoring-bot) Roll-reboot of nodes in dse-eqiad cluster started by btullis: * dse-k8...
[18:43:38] netops, Infrastructure-Foundations, SRE, Data-Platform-SRE (2026.01.05 - 2026.01.23), Essential-Work: Socket leaking on some dse-k8s row C & D hosts - https://phabricator.wikimedia.org/T414460#11538831 (ops-monitoring-bot) Roll-reboot of nodes in dse-eqiad cluster started by btullis: * dse-k8...
[19:24:31] Traffic, Beta-Cluster-Infrastructure: HAProxy failing to start on deployment-cache-text08 and deployment-cache-upload08 - https://phabricator.wikimedia.org/T415113 (bd808) NEW
[19:24:43] Traffic, Beta-Cluster-Infrastructure: HAProxy failing to start on deployment-cache-text08 and deployment-cache-upload08 - https://phabricator.wikimedia.org/T415113#11538979 (bd808) p:Triage→High
[19:54:37] Traffic, Beta-Cluster-Infrastructure: HAProxy failing to start on deployment-cache-text08 and deployment-cache-upload08 - https://phabricator.wikimedia.org/T415113#11539051 (ssingh) ` Jan 20 16:51:40 deployment-cache-text08 systemd[1]: haproxy.service: Control process exited, code=exited, status=1/FAILUR...
[19:55:25] Traffic, Beta-Cluster-Infrastructure: HAProxy failing to start on deployment-cache-text08 and deployment-cache-upload08 - https://phabricator.wikimedia.org/T415113#11539054 (ssingh) ` sukhe@deployment-cache-text08:~$ sudo haproxy -f /etc/haproxy/conf.d/tls.cfg [NOTICE] (17847) : haproxy version is 2.8...
[20:07:54] Traffic, SRE, Patch-For-Review: Offer AuthDNS service over IPv6 - https://phabricator.wikimedia.org/T81605#11539224 (ssingh) There is another layer of complexity we need to be aware of. Essentially, `authdns_addrs` in `hieradata/common.yaml` has the list of v4 authdns IP records, and now will have th...
[20:19:19] Traffic, Beta-Cluster-Infrastructure: HAProxy failing to start on deployment-cache-text08 and deployment-cache-upload08 - https://phabricator.wikimedia.org/T415113#11539242 (ssingh) @SLyngshede-WMF: See if you can find time to look into this when you come online, or I will tomorrow. Thanks!
[20:25:30] Traffic, Beta-Cluster-Infrastructure: HAProxy failing to start on deployment-cache-text08 and deployment-cache-upload08 - https://phabricator.wikimedia.org/T415113#11539251 (bd808) Running the config check with all of the config files gives a different error: `lang=shell-session,counterexample bd808@depl...
[20:28:33] Traffic, Beta-Cluster-Infrastructure: HAProxy failing to start on deployment-cache-text08 and deployment-cache-upload08 - https://phabricator.wikimedia.org/T415113#11539257 (bd808) >>! In T415113#11539251, @bd808 wrote: > I wonder if that `lua.check_traffic_class` method is coming from a private location...
[20:29:18] Traffic, Beta-Cluster-Infrastructure: HAProxy failing to start on deployment-cache-text08 and deployment-cache-upload08 because of missing `traffic_class.lua` library - https://phabricator.wikimedia.org/T415113#11539262 (bd808)
[20:40:37] Traffic, Beta-Cluster-Infrastructure, Patch-For-Review: HAProxy failing to start on deployment-cache-text08 and deployment-cache-upload08 because of missing `traffic_class.lua` library - https://phabricator.wikimedia.org/T415113#11539294 (bd808) `lang=shell-session bd808@deployment-cache-text08.deplo...
[20:44:25] Traffic, Beta-Cluster-Infrastructure, Patch-For-Review: HAProxy failing to start on deployment-cache-text08 and deployment-cache-upload08 because of missing `traffic_class.lua` library - https://phabricator.wikimedia.org/T415113#11539309 (bd808) Open→In progress a:bd808 Cherry-pick has th...
[21:01:03] Traffic, Beta-Cluster-Infrastructure, Patch-For-Review: HAProxy failing to start on deployment-cache-text08 and deployment-cache-upload08 because of missing `traffic_class.lua` library - https://phabricator.wikimedia.org/T415113#11539426 (bd808)
[21:01:05] Traffic, Beta-Cluster-Infrastructure: Puppet agent failure detected on instance deployment-cache-text08 in project deployment-prep - https://phabricator.wikimedia.org/T415115#11539430 (bd808) →Duplicate dup:T415113
[23:41:17] Traffic, Huggle: Huggle is getting rate-limited when working on multiple wikis in parallel - https://phabricator.wikimedia.org/T415141#11539816 (Reedy)