[05:01:32] netops, Analytics, DBA, Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (Marostegui) I have switched m3-master from dbproxy1020 to dbproxy1016: https://gerrit.wikimedia.org/r/705789
[05:02:04] netops, Analytics, DBA, Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (Marostegui)
[07:05:34] netops, Analytics, DBA, Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (MoritzMuehlenhoff)
[07:58:26] netops, Infrastructure-Foundations, SRE, Datacenter-Switchover: Record traffic flows in and out of eqiad during switchover - https://phabricator.wikimedia.org/T286038 (ayounsi) `lang=diff re0.cr2-eqiad# show | compare [edit interfaces xe-3/2/2 unit 0 family inet filter] + output sample-ac...
[08:04:34] netops, Infrastructure-Foundations, SRE, Datacenter-Switchover: Record traffic flows in and out of eqiad during switchover - https://phabricator.wikimedia.org/T286038 (ayounsi) Talked to @fgiunchedi on IRC, let us know when to rollback. Ideally before the end of the week so we don't keep "hacks"...
[08:11:53] netops, Infrastructure-Foundations, SRE, Datacenter-Switchover: Record traffic flows in and out of eqiad during switchover - https://phabricator.wikimedia.org/T286038 (fgiunchedi) Thank you @ayounsi @cmooney ! Could we keep the sampling for a week straight ? I understand if you are not comforta...
[08:25:43] netops, Infrastructure-Foundations, SRE, Datacenter-Switchover: Record traffic flows in and out of eqiad during switchover - https://phabricator.wikimedia.org/T286038 (fgiunchedi)
[08:27:19] netops, Infrastructure-Foundations, SRE, Datacenter-Switchover, User-fgiunchedi: Record traffic flows in and out of eqiad during switchover - https://phabricator.wikimedia.org/T286038 (fgiunchedi)
[08:33:59] netops, Analytics, DBA, Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (MoritzMuehlenhoff)
[08:35:57] netops, Infrastructure-Foundations, SRE: Adjust egress buffer allocations on ToR switches - https://phabricator.wikimedia.org/T284592 (cmooney)
[08:36:14] netops, Analytics, DBA, Infrastructure-Foundations, and 3 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (cmooney) Open→Resolved
[08:51:36] netops, Analytics, DBA, Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (cmooney)
[08:52:45] netops, Analytics, DBA, Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (cmooney)
[09:01:07] netops, DBA, Infrastructure-Foundations, SRE: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (cmooney)
[09:01:32] netops, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (cmooney)
[09:02:17] netops, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (cmooney)
[10:22:29] netops, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (hnowlan)
[10:27:45] netops, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (MoritzMuehlenhoff)
[10:28:31] netops, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (MoritzMuehlenhoff)
[13:06:03] netops, DBA, Infrastructure-Foundations, SRE, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (fgiunchedi)
[13:06:56] netops, Analytics, DBA, Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (fgiunchedi)
[16:43:59] netops, Continuous-Integration-Infrastructure, DC-Ops, Infrastructure-Foundations, serviceops: Flapping codfw management alarm ( contint2001.mgmt/SSH is CRITICAL ) - https://phabricator.wikimedia.org/T283582 (Dzahn) ACKed some more today, gerrit2001.mgmt, wdqs2002.mgmt
[17:52:39] netops, Infrastructure-Foundations, SRE, fundraising-tech-ops: Automate diff and commit of frack ACL - https://phabricator.wikimedia.org/T260655 (Jgreen) a:Jgreen→None
[18:29:50] netops, Infrastructure-Foundations: cr2-codfw:fpc0 crash - https://phabricator.wikimedia.org/T287110 (ayounsi) p:Triage→High
[18:30:59] netops, Infrastructure-Foundations: cr2-codfw:fpc0 crash - https://phabricator.wikimedia.org/T287110 (ayounsi)
[18:37:41] netops, Infrastructure-Foundations: cr2-codfw:fpc0 crash - https://phabricator.wikimedia.org/T287110 (ayounsi) > Case ID 2021-0721-0486 has been created for you.
[18:40:50] topranks: Caution: By adding any email address to be cc'd by Juniper Case Manager on notifications relating to a case you certify that the addressee is (i) NOT located in an embargoed country(Iran, Syria, Sudan, Cuba, North Korea), (ii) NOT on the Denied Persons List, Entity List, Unverified List Specially Designated Nationals List or ANY like sanctioned parties list published by the US government or by the European Union
[18:41:10] topranks: am I allowed to CC you on the juniper case?
[18:41:11] :)
[18:41:42] All good I think :)
[19:27:50] netops, Infrastructure-Foundations, SRE: cr2-codfw:fpc0 crash - https://phabricator.wikimedia.org/T287110 (cmooney) First related log I can find referencing FPC (ae interface down logs were before). Jul 21, 2021 @ 17:53:34.000 CMTFPC: Fabric request time out pfe 0 plane 1 pg 0, trying recovery....
[21:53:13] SRE-tools, DBA, Spicerack, Datacenter-Switchover: switchdc should verify active/active DBs are read-write in both datacenters - https://phabricator.wikimedia.org/T287129 (Legoktm)
[23:06:04] netops, Infrastructure-Foundations, SRE: cr2-codfw:fpc0 crash - https://phabricator.wikimedia.org/T287110 (Papaul) Email from JTAC ` Please perform a physical re-seat of the card. Remove it and insert it back into the chassis. If this doesn’t work, we’ll proceed with a replacement. `