[06:27:34] Traffic, conftool: [EPIC] FY 24/25 WE 4.3.4 Improve our existing tooling to allow quicker reaction times to ongoing attacks. - https://phabricator.wikimedia.org/T369480#9959649 (Joe)
[06:28:03] Traffic, conftool: [EPIC] FY 24/25 WE 4.3.4 Improve our existing tooling to allow quicker reaction times to ongoing attacks. - https://phabricator.wikimedia.org/T369480#9959651 (Joe)
[06:28:04] Traffic, conftool, Patch-For-Review, Sustainability (Incident Followup): requestctl can't act on cache hits - https://phabricator.wikimedia.org/T317794#9959652 (Joe)
[07:20:04] netops, Infrastructure-Foundations: BGP status (instance cr2-eqord) - April 2024 - Equinix peering AS15830 - https://phabricator.wikimedia.org/T363895#9959755 (ayounsi) Open→Resolved a:ayounsi
[10:51:40] FIRING: [10x] VarnishHighThreadCount: Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[10:54:04] Traffic, MW-on-K8s, serviceops, SRE, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536#9960396 (Lucas_Werkmeister_WMDE) {T355292} should probably be a subtask of this (or maybe a subtask of T321899)? At least I’ve been told th...
[10:56:23] Traffic, MoveComms-Support, MW-on-K8s, serviceops, and 2 others: Move 100% of external traffic to Kubernetes - https://phabricator.wikimedia.org/T362323#9960410 (Clement_Goubert)
[10:56:25] Traffic, MW-on-K8s, serviceops, SRE, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536#9960411 (Clement_Goubert)
[10:56:40] FIRING: [16x] VarnishHighThreadCount: Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[11:06:40] FIRING: [20x] VarnishHighThreadCount: Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[11:09:24] Traffic, MoveComms-Support, MW-on-K8s, serviceops, and 2 others: Move 100% of external traffic to Kubernetes - https://phabricator.wikimedia.org/T362323#9960448 (Clement_Goubert)
[11:11:40] FIRING: [23x] VarnishHighThreadCount: Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[11:16:34] Traffic, MoveComms-Support, MW-on-K8s, serviceops, and 2 others: Move 100% of external traffic to Kubernetes - https://phabricator.wikimedia.org/T362323#9960456 (Clement_Goubert)
[11:21:19] Traffic, MoveComms-Support, MW-on-K8s, serviceops, and 2 others: Move 100% of external traffic to Kubernetes - https://phabricator.wikimedia.org/T362323#9960472 (Clement_Goubert) Open→Resolved The work this task tracked is now completed. Remaining migrations {T352650}, {T355292}, {T35...
[11:26:40] FIRING: [16x] VarnishHighThreadCount: Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[11:29:07] Traffic, MW-on-K8s, serviceops, SRE, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536#9960486 (Clement_Goubert) Open→In progress
[11:31:40] FIRING: [15x] VarnishHighThreadCount: Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[11:34:26] Traffic, MW-on-K8s, serviceops, SRE, and 2 others: Migrate internal traffic to k8s - https://phabricator.wikimedia.org/T333120#9960507 (Clement_Goubert) In progress→Resolved All internal traffic has been migrated.
[11:36:02] Traffic, MW-on-K8s, serviceops, SRE, Release-Engineering-Team (Seen): Deploy mediawiki kubernetes services - https://phabricator.wikimedia.org/T321786#9960518 (Clement_Goubert)
[11:36:40] FIRING: [23x] VarnishHighThreadCount: Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[11:37:52] Traffic, MW-on-K8s, serviceops, SRE, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536#9960524 (Clement_Goubert) >>! In T290536#9960396, @Lucas_Werkmeister_WMDE wrote: > {T355292} should probably be a subtask of this (or maybe...
[11:41:40] Traffic, MW-on-K8s, serviceops, SRE, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536#9960525 (Clement_Goubert)
[11:50:41] sukhe: I sent ~30 peering requests to IX.BR networks earlier today. Based on our netflow data, and the weekly peering email. It's not that much compared to the huge number of networks there, but a start. 2 have already replied.
[11:56:40] RESOLVED: [8x] VarnishHighThreadCount: Varnish's thread count on cp1100:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[12:00:28] Traffic, MoveComms-Support, MW-on-K8s, serviceops, and 2 others: Move 100% of external traffic to Kubernetes - https://phabricator.wikimedia.org/T362323#9960571 (Lucas_Werkmeister_WMDE) /me shakes fist at Phorge for not letting me award this task another token 🪙🪙🪙🪙🪙
[12:01:24] XioNoX: nice and thanks! (BR goes live this week :)
[12:16:56] \o/
[13:25:33] netops, Infrastructure-Foundations, SRE: Move asw-c-codfw and asw-d-codfw CR uplinks to Spine switches - https://phabricator.wikimedia.org/T366941#9960830 (Papaul) @cmooney the 18th works for me thanks.
[13:26:58] Traffic, SRE, Patch-For-Review: Migrate DNS depooling of sites from operations/dns (git) to confctl - https://phabricator.wikimedia.org/T369366#9960835 (ssingh)
[13:27:08] netops, Infrastructure-Foundations, SRE: Upgrade Management routers to 22.4R3-S2 - https://phabricator.wikimedia.org/T369504 (cmooney) NEW p:Triage→Medium
[13:44:29] Traffic, MW-Interfaces-Team, serviceops, Patch-For-Review: map the /api/ prefix to /w/rest.php - https://phabricator.wikimedia.org/T364400#9960984 (akosiaris) >>! In T364400#9926454, @Joe wrote: >>>! In T364400#9780622, @BBlack wrote: >>>>! In T364400#9779996, @hnowlan wrote: >>> Could we impleme...
[13:59:07] HTTPS, Traffic, MediaWiki-Action-API, MediaWiki-REST-API, and 3 others: Proposal: fail explicitly and revoke relevant API keys over plain-text HTTP connection for all Wikimedia APIs - https://phabricator.wikimedia.org/T368344#9961059 (pmiazga)
[14:01:20] HTTPS, Traffic, MediaWiki-Action-API, MediaWiki-REST-API, and 3 others: Proposal: fail explicitly and revoke relevant API keys over plain-text HTTP connection for all Wikimedia APIs - https://phabricator.wikimedia.org/T368344#9961062 (pmiazga) Tagging #mw-interfaces-team as they are API owners.
[16:02:03] netops, Infrastructure-Foundations, SRE: Move asw-c-codfw and asw-d-codfw CR uplinks to Spine switches - https://phabricator.wikimedia.org/T366941#9961749 (cmooney)
[16:06:47] netops, Infrastructure-Foundations, SRE: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - https://phabricator.wikimedia.org/T348977#9961762 (cmooney)
[16:07:19] netops, Data-Persistence, DBA, Infrastructure-Foundations, and 2 others: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 -lsw1-f3-eqiad - https://phabricator.wikimedia.org/T365998#9961779 (cmooney)
[17:34:45] Traffic, DNS, fundraising-tech-ops, SRE, Patch-For-Review: Cleanup unused DNS subdomains - https://phabricator.wikimedia.org/T367012#9962213 (Dzahn) Thanks @AKanji-WMF Are you still using http://mandrillapp.com/ / MailChimp for fundraising emails with benefactors.wikimedia.org ?
[20:59:21] Traffic, SRE: Regression: Reading spam blacklists of all projects suddenly returns status 429 on fifth consecutive read - https://phabricator.wikimedia.org/T369414#9963012 (bd808) I expect that the `?action=raw` query string is what is causing you to run into a rate limit. I think you will have a better...
[22:20:12] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[22:25:12] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[22:35:12] FIRING: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[22:38:32] FIRING: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[22:40:12] RESOLVED: [2x] SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[22:43:32] RESOLVED: SLOMetricAbsent: - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[22:54:41] FIRING: VarnishPrometheusExporterDown: Varnish Exporter on instance cp3073:9331 is unreachable - https://wikitech.wikimedia.org/wiki/Prometheus#Prometheus_job_unavailable - https://grafana.wikimedia.org/d/000000304/varnish-dc-stats?viewPanel=17 - https://alerts.wikimedia.org/?q=alertname%3DVarnishPrometheusExporterDown
[22:55:35] ^Looks like f.abfur is rebooting