[03:48:02] (HAProxyEdgeTrafficDrop) firing: 60% request drop in text@ulsfo during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=ulsfo&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[04:17:56] (HAProxyEdgeTrafficDrop) resolved: 68% request drop in text@ulsfo during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=ulsfo&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[04:36:56] (HAProxyEdgeTrafficDrop) firing: 64% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqiad&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[04:41:56] (HAProxyEdgeTrafficDrop) resolved: 63% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqiad&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[10:50:37] We had two short interruptions to the BGP sessions to lvs2009 over the past 20 minutes.
[10:50:53] https://phabricator.wikimedia.org/P24503
[10:51:10] Happened on both CRs simultaneously, seems ok again now
[10:51:18] I assume there was no work ongoing that would account for that?
[10:52:44] topranks: yes, see -sre
[10:56:56] topranks: sorry I meant -ops
[10:58:42] ah ok... was another problem there for me :)
[11:02:49] yeah I noticed
[12:51:56] (HAProxyEdgeTrafficDrop) firing: 58% request drop in text@ulsfo during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=ulsfo&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[12:56:56] (HAProxyEdgeTrafficDrop) resolved: 64% request drop in text@ulsfo during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=ulsfo&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[13:38:58] Traffic, SRE, Patch-For-Review: Update certspotter - https://phabricator.wikimedia.org/T204993 (ssingh)
[13:39:07] Acme-chief, Traffic, SRE: Integrate certspotter with certcentral to avoid certspotter notifying us on legitimate certs generated by our certcentral boxes - https://phabricator.wikimedia.org/T204994 (ssingh)
[13:49:38] bblack, vgutierrez: if you don't mind a sanity check of https://gerrit.wikimedia.org/r/c/operations/puppet/+/778332 it would be appreciated. PCC link in the comments
[14:32:55] volans looking good from here
[14:33:11] great, thanks!
[14:52:18] Traffic, DC-Ops, SRE, ops-eqsin: cp5002 memory errors on DIMM A4 - https://phabricator.wikimedia.org/T305423 (ssingh) >>! In T305423#7841765, @wiki_willy wrote: > Hi @ssingh - since this server is out of warranty and due to be refreshed in a few quarters, do you still want us to purchase a replac...
[14:55:20] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (Cmjohnson)
[15:02:56] (HAProxyEdgeTrafficDrop) firing: 67% request drop in text@ulsfo during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=ulsfo&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[15:17:56] (HAProxyEdgeTrafficDrop) resolved: 69% request drop in text@ulsfo during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=ulsfo&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[15:20:15] netops, DC-Ops, Infrastructure-Foundations, SRE, and 3 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host cloudstore1010.wikimedia.org w...
[15:22:06] netops, DC-Ops, Infrastructure-Foundations, SRE, and 3 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (Cmjohnson)
[15:23:10] netops, DC-Ops, Infrastructure-Foundations, SRE, and 3 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host cloudstore1010.wikimedia.org with...
[15:23:57] netops, DC-Ops, Infrastructure-Foundations, SRE, and 3 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host cloudstore1010.wikimedia.org w...
[15:28:05] netops, DC-Ops, Infrastructure-Foundations, SRE, and 3 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cmjohnson@cumin1001 for host cloudstore1011.wikimedia.org w...
[15:46:38] netops, DC-Ops, Infrastructure-Foundations, SRE, and 2 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cmjohnson@cumin1001 for host cloudstore1010.wikimedia.org with...
[17:12:02] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Unify loopback filters between CR routers and L3 switches - https://phabricator.wikimedia.org/T304553 (cmooney) On the EVPN devices filtering needs to be defined on each 'unit' of the loopback interface, i.e. the default one "lo0.0" in th...
[19:49:10] Traffic, DC-Ops, SRE, ops-eqsin: cp5002 memory errors on DIMM A4 - https://phabricator.wikimedia.org/T305423 (wiki_willy) Thanks @ssingh. Rob's working on sourcing the replacement DIMM, so we should have that sorted out soon, and will keep you in the loop via an adjacent procurement task. Thank...
[19:51:16] Traffic, DC-Ops, SRE, ops-eqsin: cp5002 memory errors on DIMM A4 - https://phabricator.wikimedia.org/T305423 (RobH)