[06:52:57] (EdgeTrafficDrop) firing: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org
[06:57:57] (EdgeTrafficDrop) resolved: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org
[10:29:00] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (Vgutierrez) Open→In progress
[10:49:25] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1001 for host cp4026.ulsfo.wmnet with OS buster
[10:55:57] (VarnishPrometheusExporterDown) firing: Varnish Exporter on instance cp4026:9331 is unreachable - https://alerts.wikimedia.org
[11:01:30] Traffic, SRE, Patch-For-Review: Configure dns and puppet repositories for new drmrs datacenter - https://phabricator.wikimedia.org/T282787 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host ganeti6002.drmrs.wmnet with OS buster
[11:17:38] ^^ cp4026 is being reimaged
[11:32:44] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1001 for host cp4026.ulsfo.wmnet with OS buster e...
[11:41:45] Traffic, SRE, Patch-For-Review: Configure dns and puppet repositories for new drmrs datacenter - https://phabricator.wikimedia.org/T282787 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host ganeti6002.drmrs.wmnet with OS buster executed with errors: - gan...
[12:28:22] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1001 for host cp4026.ulsfo.wmnet with OS buster
[12:45:57] (VarnishPrometheusExporterDown) resolved: Varnish Exporter on instance cp4026:9331 is unreachable - https://alerts.wikimedia.org
[13:19:57] (VarnishPrometheusExporterDown) firing: Varnish Exporter on instance cp4026:9331 is unreachable - https://alerts.wikimedia.org
[13:29:57] (VarnishPrometheusExporterDown) resolved: Varnish Exporter on instance cp4026:9331 is unreachable - https://alerts.wikimedia.org
[13:32:26] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1001 for host cp4026.ulsfo.wmnet with OS buster c...
[13:35:36] netops, Cloud-VPS, Infrastructure-Foundations, cloud-services-team (Kanban): cr-codfw: set up static route for 185.15.57.8/30 - https://phabricator.wikimedia.org/T295288 (aborrero)
[13:36:09] netops, Cloud-VPS, Infrastructure-Foundations, cloud-services-team (Kanban): cr-codfw: set up static route for 185.15.57.8/30 - https://phabricator.wikimedia.org/T295288 (aborrero) p:Triage→Medium
[15:30:01] netops, Infrastructure-Foundations, SRE: Can't commit on asw-b-codfw - https://phabricator.wikimedia.org/T295118 (ayounsi) a:ayounsi→Papaul According to @akosiaris this is due to a failed hard drive, and it might not come back up from a reboot. @Papaul when you're back, let's replace FPC7 wi...
[16:28:54] netops, Infrastructure-Foundations, SRE, ops-codfw: Can't commit on asw-b-codfw - https://phabricator.wikimedia.org/T295118 (wiki_willy)
[17:57:59] netops, DC-Ops, Infrastructure-Foundations, SRE, ops-eqiad: Q2:(Need By: TBD) Rows E/F network racking task - https://phabricator.wikimedia.org/T292095 (wiki_willy) a:Jclark-ctr
[22:34:13] Traffic, SRE, serviceops, Performance-Team (Radar): Reconcile MediaWiki POST timeout and Varnish/ATS timeouts - https://phabricator.wikimedia.org/T294800 (colewhite) p:Triage→Medium
[22:34:36] Traffic, netops, Infrastructure-Foundations, SRE, procurement: drmrs: primary software task - https://phabricator.wikimedia.org/T282788 (colewhite) p:Triage→Medium
[22:38:07] netops, Infrastructure-Foundations, SRE, SRE Observability (FY2021/2022-Q2), Sustainability (Incident Followup): Alert that should have paged did not reach VictorOps because of partial networking outage - https://phabricator.wikimedia.org/T294166 (colewhite) p:Triage→Medium