[05:51:56] (HAProxyEdgeTrafficDrop) firing: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[05:56:56] (HAProxyEdgeTrafficDrop) resolved: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[08:04:25] netops, Infrastructure-Foundations, SRE: Upgrade Fastnetmon to 1.2.1 - https://phabricator.wikimedia.org/T271228 (MoritzMuehlenhoff) This was the debconf diff for the puppetised fastnetmon.conf as presented by dpkg. We should check whether some new options should be covered in our puppetised config f...
[08:35:00] netops, Infrastructure-Foundations, SRE: Upgrade Fastnetmon to 1.2.1 - https://phabricator.wikimedia.org/T271228 (ayounsi) Great, there is nothing of immediate interest in the diff. IPv6 will probably be the next step here in a different task.
[08:47:50] netops, Infrastructure-Foundations, SRE: Upgrade Fastnetmon to 1.2.1 - https://phabricator.wikimedia.org/T271228 (ayounsi) left are eqiad/esams/eqsin. I'll take care of them later today or tomorrow.
[11:05:14] netops, Infrastructure-Foundations, SRE: DHCPd: update config to log more info - https://phabricator.wikimedia.org/T309524 (cmooney) I agree @jbond it would be useful to have more granular detail. When we don't have a "match" on the dhcp snippet then we end up with a log like this: ` DHCPDISCOVER fr...
[11:56:31] netops, Infrastructure-Foundations, SRE: DHCPd: update config to log more info - https://phabricator.wikimedia.org/T309524 (jbond) Thanks for looking at this @Volans @cmooney > Because that's a valid hostname in our DNS it would have just used that IP. So not sure how to "prevent" this. Doh! > It...
[12:20:33] netops, Infrastructure-Foundations, SRE: Cannot verify NTP status asw1-b12-drmrs - https://phabricator.wikimedia.org/T305840 (cmooney) Open→Resolved a:cmooney After a bit of back-and-forth with Juniper they eventually suggests just killing the ntpd process from a root shell. Which has do...
[15:34:45] Traffic, Data-Engineering, Data-Engineering-Kanban, SRE, and 2 others: intake-analytics is responsible for up to a 85% of varnish backend fetch errors - https://phabricator.wikimedia.org/T306181 (akosiaris) Didn't work btw, turns out that eventgate also needs a service-runner bump. PR at https://...
[15:42:25] Traffic, Data-Engineering, Data-Engineering-Kanban, SRE, and 2 others: intake-analytics is responsible for up to a 85% of varnish backend fetch errors - https://phabricator.wikimedia.org/T306181 (Ottomata) Ah sorry about that, should have realized. Docs here: https://wikitech.wikimedia.org/wiki/...
[15:45:27] Traffic, Data-Engineering, Data-Engineering-Kanban, SRE, and 2 others: intake-analytics is responsible for up to a 85% of varnish backend fetch errors - https://phabricator.wikimedia.org/T306181 (Ottomata) Hm, I think we stopped using the github commit sha to install, and instead rely on NPM lik...
[16:14:26] vgutierrez: what bribe is needed to get you to +2 that developer.wikimedia.org CDN config patch? https://gerrit.wikimedia.org/r/c/operations/puppet/+/800181/
[16:15:01] I accept bribes in craft beer and/or specialty coffee beans
[16:15:12] ;P
[16:16:55] bd808: jokes aside, the TLS side between ats-be and the service looks good, caching rules and remap rules are fine. I already +1ed it
[16:41:59] netops, Infrastructure-Foundations, SRE, ops-eqiad, cloud-services-team (Kanban): Replace labstore100[67] with clouddumps100[12] - https://phabricator.wikimedia.org/T309346 (Cmjohnson) This task does not require DC-OPs tag, once you have moved the data, please decommission labstores and crea...
[17:47:03] netops, Infrastructure-Foundations, SRE: codfw: Provision a server script can not run without a cable ID" - https://phabricator.wikimedia.org/T308768 (Papaul) Open→Resolved I tested this on backup2009 all is working with no issues. Thanks
[18:58:31] Traffic, SRE: Package and deploy ATS 9.1.4 - https://phabricator.wikimedia.org/T309651 (ssingh)
[19:00:03] Traffic, SRE: Package and deploy ATS 9.1.4 - https://phabricator.wikimedia.org/T309651 (ssingh) ` trafficserver (9.1.2-1wm1) buster-wikimedia; urgency=medium * Non-maintainer upload. * New upstream release 9.1.2 -- Sukhbir Singh Tue, 31 May 2022 13:34:20 -0400 `
[19:00:09] Traffic, SRE: Package and deploy ATS 9.1.2 - https://phabricator.wikimedia.org/T309651 (ssingh)
[22:01:57] (HAProxyEdgeTrafficDrop) firing: 51% request drop in text@eqsin during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqsin&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[22:06:57] (HAProxyEdgeTrafficDrop) resolved: (2) 67% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[22:49:57] (HAProxyEdgeTrafficDrop) firing: 69% request drop in text@drmrs during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=drmrs&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[22:59:57] (HAProxyEdgeTrafficDrop) resolved: 69% request drop in text@drmrs during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=drmrs&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop