[00:39:57] (HAProxyEdgeTrafficDrop) firing: 36% request drop in text@esams during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=esams&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[00:44:56] (HAProxyEdgeTrafficDrop) firing: (6) 47% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[00:49:56] (HAProxyEdgeTrafficDrop) resolved: (6) 59% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[07:05:47] Traffic, Infrastructure-Foundations, SRE, SRE-tools, IPv6: Some Traffic clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271144 (ayounsi) > fixed up lvs[4005-4007].ulsfo.wmnet For context: T311290 The issue is twofold: 1/ the LVS hosts use SLAAC IPs on their...
[07:38:15] XioNoX: re T271144, do we need those AAAA records at all? for that particular task as long as the main ifaces of the lvs get an AAAA record it should be enough, right?
[07:38:16] T271144: Some Traffic clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271144
[07:40:43] vgutierrez: depends on the definition of "need" :)
[07:41:27] vgutierrez: for the DNS point of view consistency of having primary IPv6 IP with DNS yes, for the rest I'll leave it up to you, I was about to comment on task
[07:41:50] vgutierrez: it doesn't break anything not to have them, on the other hand it's about being consistent across the infra
[07:42:45] in other words, I won't get mad if we don't add them, but I think it's cleaner to have them :)
[07:48:38] * volans replied
[07:48:39] Traffic, Infrastructure-Foundations, SRE, SRE-tools, IPv6: Some Traffic clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271144 (Volans) >>! In T271144#8057351, @BCornwall wrote: > Thank you for doing that, @Volans ; I apologize for forgetting to run the cookbook....
[08:59:39] Traffic, Infrastructure-Foundations, SRE, SRE-tools, IPv6: Some Traffic clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271144 (Volans) Just for completeness, and to use the same wording I'm using for other tasks. Some clusters managed by the Traffic team have in...
[15:19:56] (HAProxyEdgeTrafficDrop) firing: 51% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[15:24:56] (HAProxyEdgeTrafficDrop) resolved: (2) 49% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[17:36:38] (LVSHighCPU) firing: The host lvs1020:9100 has at least its CPU 20 saturated - https://bit.ly/wmf-lvscpu - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs1020 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighCPU
[17:41:38] (LVSHighCPU) resolved: The host lvs1020:9100 has at least its CPU 20 saturated - https://bit.ly/wmf-lvscpu - https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=lvs1020 - https://alerts.wikimedia.org/?q=alertname%3DLVSHighCPU
[18:31:53] Traffic, Infrastructure-Foundations, SRE, SRE-tools, IPv6: Some Traffic clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271144 (BCornwall) Open→Resolved Thank you for the help, @ssingh, @Volans and @ayounsi I've added the DNS records to only the primary i...
[18:38:26] Traffic, Infrastructure-Foundations, SRE, SRE-tools, IPv6: Some Traffic clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271144 (Volans) Thanks @BCornwall for the quick turnaround and fix. I'll close the tmux then given the revert is not needed anymore.
[18:48:39] Traffic, Infrastructure-Foundations, SRE, SRE-tools, and 2 others: Some Traffic clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271144 (Volans) I've removed the lvs prefix from the no IPv6 cluster list and now the Network report in Netbox confirms there are no lvs hos...
[19:19:56] Traffic, Observability-Alerting, SRE, Patch-For-Review, User-fgiunchedi: Migrate Traffic Prometheus alerts from Icinga to Alertmanager - https://phabricator.wikimedia.org/T300723 (BCornwall)
[19:20:05] Traffic, SRE: Create vm.max_map_count metrics for Prometheus - https://phabricator.wikimedia.org/T311445 (BCornwall) Open→Resolved Implemented and deployed to varnish servers. The `sysctl_vm_max_map_count` metric is now available.
[19:43:24] nice job brett!
[21:48:38] I am about to merge an envoy patch, would anyone in here like to ride along?
[21:48:48] I mean, a patch /using/ envoy, not a patch changing envoy
[21:49:41] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Finalise design extension of WMCS networks to new cloudsw in Eqiad rows E/F - https://phabricator.wikimedia.org/T304989 (nskaggs) @cmooney Thank you for updating and linking these instructions. Yes, that is helpful