[00:02:57] (HAProxyEdgeTrafficDrop) firing: 68% request drop in text@drmrs during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=drmrs&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[00:07:57] (HAProxyEdgeTrafficDrop) resolved: 68% request drop in text@drmrs during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=drmrs&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[05:23:57] (HAProxyEdgeTrafficDrop) firing: 66% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqiad&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[05:28:56] (HAProxyEdgeTrafficDrop) resolved: 66% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqiad&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[10:34:15] netops, DC-Ops, Infrastructure-Foundations, SRE, and 3 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (ayounsi) Please don't forget to run Homer after re-naming as the switch port description contains the hostname. The current outstan...
[11:53:56] Traffic, Performance-Team, SRE, serviceops: Decide on details of progressive Multi-DC roll out - https://phabricator.wikimedia.org/T279664 (tstarling) If we send a percentage of traffic to the local DC, is it necessary (for sessions etc.) to consistently send a given user to the same DC?
[12:40:48] netops, Infrastructure-Foundations: DHCPd: update config to log more info - https://phabricator.wikimedia.org/T309524 (jbond) p:Triage→Medium
[13:44:35] Traffic, Data-Engineering, Data-Engineering-Kanban, SRE, User-zeljkofilipin: intake-analytics is responsible for up to a 85% of varnish backend fetch errors - https://phabricator.wikimedia.org/T306181 (akosiaris) >>! In T306181#7963731, @phuedx wrote: >>>! In T306181#7914450, @akosiaris wrote...
[14:05:57] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Replace labstore100[67] with clouddumps100[12] - https://phabricator.wikimedia.org/T309346 (ArielGlenn) a:ArielGlenn→None Not sure who should get this next but it's not Hannah or I :-) I was never involved in the configuratio...
[14:45:14] netops, Infrastructure-Foundations, SRE: DHCPd: update config to log more info - https://phabricator.wikimedia.org/T309524 (Volans) IIRC that hostname is evaluated by the DHCP at restart time and then the resulting IP is used in the configuration. Because that's a valid hostname in our DNS it would h...
[14:48:31] Traffic, Data-Engineering, Data-Engineering-Kanban, SRE, and 2 others: intake-analytics is responsible for up to a 85% of varnish backend fetch errors - https://phabricator.wikimedia.org/T306181 (BTullis) Thanks @phuedx and @akosiaris for that information and for the patch. That's a great find ab...
[17:42:44] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Replace labstore100[67] with clouddumps100[12] - https://phabricator.wikimedia.org/T309346 (MoritzMuehlenhoff) p:Triage→Medium
[18:31:57] (HAProxyEdgeTrafficDrop) firing: 59% request drop in text@ulsfo during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=ulsfo&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[18:36:57] (HAProxyEdgeTrafficDrop) resolved: 62% request drop in text@ulsfo during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=ulsfo&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop