[01:48:26] netops, DC-Ops, Infrastructure-Foundations, SRE, and 3 others: Q3:(Need By: TBD) rack/setup/install 2 new labstore hosts - https://phabricator.wikimedia.org/T302981 (Andrew) I am now running the epic rsync from labstore1006 to clouddumps100[12]. Going to take a while!
[05:43:05] netops, Cloud-Services, Infrastructure-Foundations: Undocumented IP on WMCS network - https://phabricator.wikimedia.org/T315955 (ayounsi) p:Triage→High
[05:46:41] netops, Cloud-Services, Infrastructure-Foundations, SRE: Undocumented IP on WMCS network - https://phabricator.wikimedia.org/T315955 (ayounsi)
[06:29:23] netops, Cloud-Services, Infrastructure-Foundations: Routing loop for unused WMCS IPs in 185.15.56.0/24 - https://phabricator.wikimedia.org/T315956 (ayounsi) p:Triage→Low
[09:24:57] Traffic, SRE, Patch-For-Review: ATS Read While Writer feature is wrongly configured - https://phabricator.wikimedia.org/T315911 (Vgutierrez) = Current Status = The current settings applied when `profile::trafficserver::backend::origin_coalescing` is set to `true` (default value) are: ` CONFIG proxy....
[09:29:02] netops, Cloud-Services, Infrastructure-Foundations, SRE: Undocumented IP on WMCS network - https://phabricator.wikimedia.org/T315955 (dcaro) I think that this might be the experiments that we have been doing with Magnum, ping @Andrew, @rook
[10:00:03] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Add descriptions to BGP peers - https://phabricator.wikimedia.org/T313805 (ayounsi) Open→Resolved a:ayounsi Fixed everywhere I could find any.
[10:22:12] netops, Cloud-Services, Infrastructure-Foundations, SRE: Undocumented IP on WMCS network - https://phabricator.wikimedia.org/T315955 (rook) Yeah, this is associated with the testing we're doing with magnum. It's part of 185.15.57.16/29 which was assigned to codfw1dev in T313977 How does one docum...
[12:00:13] netops, Infrastructure-Foundations, SRE: Management routers: use BGP instead of OSPF - https://phabricator.wikimedia.org/T294845 (ayounsi) a:ayounsi→None
[13:31:21] netops, Infrastructure-Foundations, SRE: Occasional high ICMP probe response from codfw to cr2-drmrs - https://phabricator.wikimedia.org/T315645 (cmooney) We had a brief discussion about this within Infra Foundations and the consensus is roughly the same, i.e. it doesn't appear the root cause of thes...
[13:34:32] netops, Infrastructure-Foundations, SRE: Occasional high ICMP probe response from codfw to cr2-drmrs - https://phabricator.wikimedia.org/T315645 (Vgutierrez) ack, thanks for checking the issue guys :)
[13:35:09] netops, Infrastructure-Foundations, SRE: Occasional high ICMP probe response from codfw to cr1-drmrs - https://phabricator.wikimedia.org/T315645 (cmooney)
[13:56:11] netops, Infrastructure-Foundations, SRE: Occasional high ICMP probe response from codfw to cr1-drmrs - https://phabricator.wikimedia.org/T315645 (ayounsi) Open→Stalled
[14:54:09] moritzm: I am going to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/820654 shortly if no other concerns from you?
[14:54:18] (I see the +1 but checking given the change)
[14:55:10] perfect!
[14:55:15] thanks!
[14:58:25] netops, Cloud Services Proposals, Infrastructure-Foundations, SRE: Separate WMCS control and management plane traffic - https://phabricator.wikimedia.org/T314847 (nskaggs) See also https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/notes/Service_predictions_for_cross_realm_situation, {T27...
[16:30:16] (VarnishTrafficDrop) firing: Varnish traffic in eqsin has dropped 68.04035261973183% - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org/?q=alertname%3DVarnishTrafficDrop
[16:35:16] (VarnishTrafficDrop) resolved: Varnish traffic in eqsin has dropped 66.45492523035026% - https://wikitech.wikimedia.org/wiki/Varnish - https://grafana.wikimedia.org/d/000000180/varnish-http-requests?viewPanel=6 - https://alerts.wikimedia.org/?q=alertname%3DVarnishTrafficDrop
[17:42:08] Traffic, Data-Engineering, SRE: Spike: Investigate creating robust alerts to notify that caching nodes are not sending traffic data - https://phabricator.wikimedia.org/T304651 (Milimetric) Open→Declined I'm declining this in favor of other work Ben is doing to improve the alert. I think this...
[18:05:38] netops, Infrastructure-Foundations, SRE, ops-eqiad: eqiad: Move links to new MPC7E linecard - https://phabricator.wikimedia.org/T304712 (Papaul) on cr2 interface setup complete ` papaul@re0.cr2-eqiad# run show interfaces terse | match xe-1/1/* xe-1/1/0:0 down down xe-1/1/0:1...
[18:07:00] netops, Infrastructure-Foundations, SRE, ops-eqiad: eqiad: Move links to new MPC7E linecard - https://phabricator.wikimedia.org/T304712 (Papaul) @ayounsi everything is ready on the routers to start moving the links. Sorry I am late on this, had to finish the PDU maintenance.
[23:14:49] Traffic, DC-Ops, SRE, ops-eqsin: cp5001 memory errors on DIMM A2 - https://phabricator.wikimedia.org/T314256 (wiki_willy) a:RobH
[23:16:31] Traffic, DC-Ops, SRE, ops-eqsin: cp5001 memory errors on DIMM A2 - https://phabricator.wikimedia.org/T314256 (wiki_willy) Assigning over to Rob, who's currently working on getting the eqsin hardware refresh ordered.