[05:43:56] (HAProxyEdgeTrafficDrop) firing: 67% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[05:48:56] (HAProxyEdgeTrafficDrop) resolved: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[06:54:18] netops, Infrastructure-Foundations, Observability-Metrics, SRE, and 2 others: LibreNMS seemingly not collecting data for many ports after migration to netmon1003 - https://phabricator.wikimedia.org/T314972 (andrea.denisse)
[07:49:56] (HAProxyEdgeTrafficDrop) firing: 39% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[07:54:56] (HAProxyEdgeTrafficDrop) resolved: (2) 55% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[08:56:46] Traffic, SRE: ATS isn't honoring the cache policy set in cache::alternate_domains on some cases - https://phabricator.wikimedia.org/T316545 (Jersione) What do I do
[12:24:16] (VarnishTrafficDrop) firing: Varnish traffic in eqiad has dropped 67.1301790634479% - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org/?q=alertname%3DVarnishTrafficDrop
[12:24:56] (HAProxyEdgeTrafficDrop) firing: (3) 54% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[12:29:16] (VarnishTrafficDrop) resolved: (3) Varnish traffic in eqiad has dropped 58.394596263303626% - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org/?q=alertname%3DVarnishTrafficDrop
[12:29:56] (HAProxyEdgeTrafficDrop) resolved: (5) 66% request drop in text@drmrs during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[13:17:57] (HAProxyEdgeTrafficDrop) firing: 60% request drop in text@eqsin during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqsin&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[13:22:57] (HAProxyEdgeTrafficDrop) resolved: 60% request drop in text@eqsin during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqsin&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[14:06:31] Traffic, SRE, Patch-For-Review: Package and deploy ATS 9.1.3 - https://phabricator.wikimedia.org/T309651 (ssingh) Downgrading and reimaging drmrs ATS9 hosts cp6008 and cp6016 to ATS8 for a week so that we can have comparative data later when we upgrade all instances to ATS9 in drmrs.
[14:07:03] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Package and deploy ATS 9.1.3 - https://phabricator.wikimedia.org/T309651 (Krinkle)
[14:08:23] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Package and deploy ATS 9.1.3 - https://phabricator.wikimedia.org/T309651 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6008.drmrs.wmnet with OS buster
[14:09:45] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Package and deploy ATS 9.1.3 - https://phabricator.wikimedia.org/T309651 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6008.drmrs.wmnet with OS buster executed with errors: - cp6008 (**FAIL...
[14:16:15] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Package and deploy ATS 9.1.3 - https://phabricator.wikimedia.org/T309651 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6008.drmrs.wmnet with OS buster
[15:00:49] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Package and deploy ATS 9.1.3 - https://phabricator.wikimedia.org/T309651 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6008.drmrs.wmnet with OS buster completed: - cp6008 (**WARN**) - Dow...
[15:07:38] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Package and deploy ATS 9.1.3 - https://phabricator.wikimedia.org/T309651 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6016.drmrs.wmnet with OS buster
[15:55:46] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Package and deploy ATS 9.1.3 - https://phabricator.wikimedia.org/T309651 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6016.drmrs.wmnet with OS buster completed: - cp6016 (**WARN**) - Dow...
[19:24:14] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q1:(Need By: TBD) rack/setup/install cloudswift100[12] - https://phabricator.wikimedia.org/T289882 (Jclark-ctr) What is the status on this one? it has been sitting for a while
[20:38:16] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q1:(Need By: TBD) rack/setup/install cloudswift100[12] - https://phabricator.wikimedia.org/T289882 (Andrew) @Jclark-ctr These are blocked on a variety of tech decisions; no action needed in the DC for now. Thanks for checking in!
[20:58:41] netops, Cloud-Services, Infrastructure-Foundations, SRE, Patch-For-Review: Undocumented IP on WMCS network - https://phabricator.wikimedia.org/T315955 (Andrew) Open→Resolved
[21:06:57] (HAProxyEdgeTrafficDrop) firing: 30% request drop in text@eqsin during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqsin&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[21:07:16] (VarnishTrafficDrop) firing: (3) Varnish traffic in eqsin has dropped 26.309597332117146% - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org/?q=alertname%3DVarnishTrafficDrop
[21:11:57] (HAProxyEdgeTrafficDrop) firing: (3) 67% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[21:12:16] (VarnishTrafficDrop) resolved: (5) Varnish traffic in eqsin has dropped 20.921785807953956% - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org/?q=alertname%3DVarnishTrafficDrop
[21:16:57] (HAProxyEdgeTrafficDrop) resolved: (3) 68% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop