[05:52:56] (HAProxyEdgeTrafficDrop) firing: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[06:02:56] (HAProxyEdgeTrafficDrop) resolved: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[06:39:05] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE: Agree strategy for Kubernetes BGP peering to top-of-rack switches - https://phabricator.wikimedia.org/T306649 (ayounsi) > That might have unwanted implications in case of power or network issues on one row. That's fine, we're moving from a...
[07:13:03] netops, Infrastructure-Foundations, SRE: Upgrade Fastnetmon to 1.2.1 - https://phabricator.wikimedia.org/T271228 (ayounsi)
[08:02:57] (HAProxyEdgeTrafficDrop) firing: 66% request drop in text@ulsfo during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=ulsfo&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[08:07:57] (HAProxyEdgeTrafficDrop) resolved: (2) 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[11:40:46] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE: Agree strategy for Kubernetes BGP peering to top-of-rack switches - https://phabricator.wikimedia.org/T306649 (elukey) Added the proposed node labels to ml-serve-eqiad via T308418#7930118. At this point I'll wait to see what strategy is be...
[14:41:50] netops, Infrastructure-Foundations, Prod-Kubernetes, SRE: Agree strategy for Kubernetes BGP peering to top-of-rack switches - https://phabricator.wikimedia.org/T306649 (akosiaris) > Regarding the "fake nodes": I think that could be done with adding the leafs as [[ https://projectcalico.docs.tiger...
[16:06:54] netops, Infrastructure-Foundations, SRE, fundraising-tech-ops: Upgrade pfw to Junos 20+ - https://phabricator.wikimedia.org/T295691 (Papaul) Junos upgrade complete in Eqiad.
[16:08:37] netops, Infrastructure-Foundations, SRE, fundraising-tech-ops: Upgrade pfw to Junos 20+ - https://phabricator.wikimedia.org/T295691 (Papaul) Open→Resolved
[17:10:15] Traffic, ops-codfw: codfw: cp2038 Correctable memory error on DIMM A3 - https://phabricator.wikimedia.org/T308459 (Papaul)
[17:10:51] Traffic, ops-codfw: codfw: cp2038 Correctable memory error on DIMM A3 - https://phabricator.wikimedia.org/T308459 (Papaul) p:Triage→Medium
[21:53:35] Traffic, DC-Ops, SRE, ops-ulsfo: ganeti4002 dimm error - https://phabricator.wikimedia.org/T303318 (RobH) Shipped and arrived via 559967799450, opened 00781129 for the shipment and will go down this week to swap it out.