[00:17:57] Traffic, MobileFrontend, MediaWiki-Platform-Team (Radar), Patch-For-Review: MobileFrontend should declare "X-Subdomain" variance via "Vary" response header - https://phabricator.wikimedia.org/T390929#11037980 (Krinkle) There aren't many hooks on load.php. That's largely a strength in that over 15...
[02:54:43] FIRING: [2x] HaproxyKafkaExporterDown: HaproxyKafka on cp3070 is down - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaExporterDown - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaExporterDown
[02:59:43] FIRING: [5x] HaproxyKafkaExporterDown: HaproxyKafka on cp3067 is down - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaExporterDown - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaExporterDown
[03:04:43] FIRING: [7x] HaproxyKafkaExporterDown: HaproxyKafka on cp3067 is down - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaExporterDown - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaExporterDown
[03:09:43] FIRING: [7x] HaproxyKafkaExporterDown: HaproxyKafka on cp3067 is down - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaExporterDown - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaExporterDown
[03:54:43] FIRING: HaproxyKafkaExporterDown: HaproxyKafka on cp3069 is down - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaExporterDown - https://grafana.wikimedia.org/d/d3e4e37c-c1d9-47af-9aad-a08dae2b3fd5/haproxykafka?orgId=1&var-site=esams&var-instance=cp3069 - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaExporterDown
[04:29:43] RESOLVED: [2x] HaproxyKafkaExporterDown: HaproxyKafka on cp3069 is down - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaExporterDown - https://grafana.wikimedia.org/d/d3e4e37c-c1d9-47af-9aad-a08dae2b3fd5/haproxykafka?orgId=1&var-site=esams&var-instance=cp3069 - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaExporterDown
[04:35:43] FIRING: HaproxyKafkaExporterDown: HaproxyKafka on cp3069 is down - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaExporterDown - https://grafana.wikimedia.org/d/d3e4e37c-c1d9-47af-9aad-a08dae2b3fd5/haproxykafka?orgId=1&var-site=esams&var-instance=cp3069 - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaExporterDown
[04:45:43] RESOLVED: HaproxyKafkaExporterDown: HaproxyKafka on cp3069 is down - https://wikitech.wikimedia.org/wiki/HAProxyKafka#HaproxyKafkaExporterDown - https://grafana.wikimedia.org/d/d3e4e37c-c1d9-47af-9aad-a08dae2b3fd5/haproxykafka?orgId=1&var-site=esams&var-instance=cp3069 - https://alerts.wikimedia.org/?q=alertname%3DHaproxyKafkaExporterDown
[07:01:23] Traffic, Hiddenparma, SRE: Browser behaviour detection at the edge - https://phabricator.wikimedia.org/T400270#11038157 (Joe) p:Triage→High a:Joe→None
[07:02:09] Traffic, Hiddenparma, SRE: Better mapping of requests coming from datacenters/clouds - https://phabricator.wikimedia.org/T400120#11038159 (Joe) p:Triage→Medium a:Joe→None
[08:57:28] Traffic, DC-Ops, ops-codfw, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11038330 (elukey) The ticket to Dell seems not going in the right direction, but we have some direct contact with them so I hope for some good follow ups this week. O...
[10:22:35] Traffic, DC-Ops, ops-codfw, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11038584 (elukey) After reviewing the code with some fresh eyes/brain I realized that for UEFI we have already started to use the BIOS settings for PXE, but we haven't...
[10:29:36] netops, Infrastructure-Foundations, SRE: Homer: PyEz "ignore_warnings" does not work for port-block speed change warning - https://phabricator.wikimedia.org/T400261#11038624 (cmooney) Open→Resolved a:cmooney
[11:38:41] Traffic, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Wikimedia-Site-requests, Patch-For-Review: ESI test string is still shipped by CentralNotice - https://phabricator.wikimedia.org/T400472#11038813 (R4356th) >>! In T400472#11034815, @ssingh wrote: > Thanks for reporting. At leas...
[11:52:40] FIRING: VarnishHighThreadCount: Varnish's thread count on cp5032:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldown?viewPanel=99&var-site=eqsin&var-instance=cp5032 - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[11:55:09] FIRING: LVSHighRX: Excessive RX traffic on lvs5005:9100 (ens1f0np0) - https://bit.ly/wmf-lvsrx - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs5005 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighRX
[11:57:40] FIRING: [12x] VarnishHighThreadCount: Varnish's thread count on cp5025:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[12:00:09] RESOLVED: LVSHighRX: Excessive RX traffic on lvs5005:9100 (ens1f0np0) - https://bit.ly/wmf-lvsrx - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs5005 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighRX
[12:02:40] FIRING: [14x] VarnishHighThreadCount: Varnish's thread count on cp5025:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[12:17:40] FIRING: [10x] VarnishHighThreadCount: Varnish's thread count on cp5025:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[12:22:40] RESOLVED: [8x] VarnishHighThreadCount: Varnish's thread count on cp5025:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[12:43:20] Traffic, Fundraising-Backlog, MediaWiki-extensions-CentralNotice, Wikimedia-Site-requests, Patch-For-Review: ESI test string is still shipped by CentralNotice - https://phabricator.wikimedia.org/T400472#11038997 (R4356th) a:R4356th
[13:48:45] Traffic, DC-Ops, ops-codfw, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11039317 (elukey) I fixed the above problems (see WIP patch), but then new ones popped up: ` {'error': {'@Message.ExtendedInfo': [{'Message': 'Invalid Attribute was '...
[14:20:42] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: Link errors: ssw1-d1-codfw <-> ssw1-f1-codfw - https://phabricator.wikimedia.org/T400253#11039411 (Jhancock.wm) Open→Resolved a:Jhancock.wm looks like errors ceased after cleaning. no increments since friday.
[14:38:00] FIRING: PurgedHighBacklogQueue: Large backlog queue for purged on cp5027:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=eqsin%20prometheus/ops&var-instance=cp5027 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighBacklogQueue
[14:39:18] hmm
[14:40:55] varnish@cp5027 is kinda stressed
[14:42:40] FIRING: VarnishHighThreadCount: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldown?viewPanel=99&var-site=eqsin&var-instance=cp5027 - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[14:43:00] RESOLVED: PurgedHighBacklogQueue: Large backlog queue for purged on cp5027:2112 - https://wikitech.wikimedia.org/wiki/Purged#Alerts - https://grafana.wikimedia.org/d/RvscY1CZk/purged?var-datasource=eqsin%20prometheus/ops&var-instance=cp5027 - https://alerts.wikimedia.org/?q=alertname%3DPurgedHighBacklogQueue
[14:47:40] FIRING: [3x] VarnishHighThreadCount: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[14:52:40] FIRING: [4x] VarnishHighThreadCount: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[14:59:07] Traffic, Hiddenparma, SRE: Block traffic from user-agents not honoring our policy - https://phabricator.wikimedia.org/T400119#11039625 (Joe) Given we recognize that enforcing our policy might cause disruption, we will proceed with care. First of all,** toolsforge is excluded from the block** for the...
[15:02:40] FIRING: [5x] VarnishHighThreadCount: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[15:05:17] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: Link errors: ssw1-d1-codfw <-> ssw1-f1-codfw - https://phabricator.wikimedia.org/T400253#11039647 (cmooney) Awesome, thank you!
[15:17:40] FIRING: [5x] VarnishHighThreadCount: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[15:22:40] FIRING: [4x] VarnishHighThreadCount: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[15:28:44] Traffic: Can't build haproxykafka package anymore - https://phabricator.wikimedia.org/T400620#11039763 (Fabfur) p:Triage→High
[15:37:40] FIRING: [4x] VarnishHighThreadCount: Varnish's thread count on cp5027:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[15:57:40] RESOLVED: VarnishHighThreadCount: Varnish's thread count on cp5029:0 is high - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/wiU3SdEWk/cache-host-drilldown?viewPanel=99&var-site=eqsin&var-instance=cp5029 - https://alerts.wikimedia.org/?q=alertname%3DVarnishHighThreadCount
[16:30:01] Traffic, DC-Ops, ops-codfw, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11039946 (elukey) Last one: ` UEFI0417: When TLS is enabled, insecure HTTP boot without TLS is not allowed. It is recommended to use HTTP boot over TLS for better sec...
[16:49:39] Traffic, HaproxyKafka, Data-Platform-SRE (2025.07.26 - 2025.08.15), Patch-For-Review: Replicate current low-message alerting from VarnishKafka - https://phabricator.wikimedia.org/T391810#11040088 (BTullis)
[16:58:15] Traffic, DC-Ops, ops-codfw, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11040242 (elukey) I was able to reach Debian Install and this is what I got in the Partitions disks step: ` ┌───────────────────────┤ [!!] Partition disks ├───────...
[17:29:27] Traffic, DC-Ops, ops-codfw, Patch-For-Review: Q4:rack/setup/install cp20[43-58] codfw - https://phabricator.wikimedia.org/T392851#11040394 (ssingh) >>! In T392851#11040242, @elukey wrote: > I was able to reach Debian Install and this is what I got in the Partitions disks step: > > ` > ┌──────...
[17:48:54] Is there a trick to get `varnishncsa` to show the client IP address? I'm trying to get a sense of what range blocks are actively dropping traffic in Beta Cluster and naively thought `varnishncsa -n frontend` would have the data I needed.
[17:50:01] fwiw, I guess the real fix for you would be if you could use requestctl on the beta cluster.
[17:50:18] with all the benefits of that web UI and existing patterns
[17:52:42] That would be helpful for sure.
[17:54:33] I think you have to get the req.http.X-Client-IP
[17:54:45] Traffic, Patch-For-Review: Can't build haproxykafka package anymore - https://phabricator.wikimedia.org/T400620#11040487 (Fabfur)
[17:55:52] in https://wikitech.wikimedia.org/wiki/Varnish I see this line, when it talks about throttling:
[17:55:56] if (vsthrottle.is_denied("proton_limiter:" + req.http.X-Client-IP, 10, 10s)) {
[17:56:08] I think that second part of it there.. that is what you want to get.
[17:56:22] somehow with varnishncsa formatting
[17:56:27] `$ sudo varnishncsa -n frontend -F "%t %s %{X-Client-IP}i %U"` does seem to give me what I was looking for. Thanks for the pointer mutante
[17:56:35] ah, great!
[17:57:42] that looks correct :)
[17:57:50] thanks mutante!
[17:59:18] :)
[20:02:33] Traffic, Patch-For-Review: Can't build haproxykafka package anymore - https://phabricator.wikimedia.org/T400620#11041018 (Fabfur)
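
Note on the `varnishncsa` format string discussed at 17:56: the following is a minimal sketch, not something anyone in the channel ran, of how output in the `%t %s %{X-Client-IP}i %U` layout might be tallied per client IP to see which addresses are being denied. Everything beyond the format string is an assumption: the function name `count_denied_by_ip` is hypothetical, the choice of 403 and 429 as "denied" statuses is a guess that should be adjusted to whatever the block rules actually return, and the parser relies only on the last three whitespace-separated fields being status, client IP, and URL path, so it does not depend on the exact timestamp layout.

```python
from collections import Counter
from typing import Iterable

# Assumption: these statuses mark a blocked request. Adjust to match the
# responses the range blocks in your environment really produce.
DEFAULT_DENIED = frozenset({"403", "429"})


def count_denied_by_ip(lines: Iterable[str],
                       denied_statuses: frozenset = DEFAULT_DENIED) -> Counter:
    """Count denied requests per client IP.

    Each line is expected to end with "<status> <client-ip> <path>", as
    produced by a format string such as "%t %s %{X-Client-IP}i %U".
    Splitting from the right keeps the timestamp in one piece even if it
    contains spaces.
    """
    counts: Counter = Counter()
    for line in lines:
        parts = line.strip().rsplit(None, 3)
        if len(parts) != 4:
            continue  # skip lines that do not have all four fields
        _timestamp, status, client_ip, _path = parts
        # "-" is what NCSA-style loggers print for a missing header value.
        if status in denied_statuses and client_ip != "-":
            counts[client_ip] += 1
    return counts
```

One way to use it would be to pipe the command's output into a small script that passes `sys.stdin` to `count_denied_by_ip` and prints `most_common()`; grouping the resulting addresses into ranges is left out of this sketch.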