[07:31:58] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10fgiunchedi) [07:33:46] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10fgiunchedi) [07:35:23] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (10fgiunchedi) [07:37:46] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, 10cloud-services-team (Kanban): Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10fgiunchedi) [07:39:36] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10fgiunchedi) [07:41:26] moritzm, jbond: is ok to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/701442 ? I don't know if anything changed last week [07:44:30] yeah, please go ahead. I'll run some tests with the cookbook today and will also merge the patch to deploy the systemd-logind logout.d wrapper fleet-wide, that'll implicitly also test your patch [07:46:00] great, thanks [09:01:53] moritzm: can I reboot one of the sretest hosts? [09:05:03] I just need to test the reboot cookbook so any will do [09:07:20] sure, those are good to reboot any time [09:07:48] ack thx [09:19:16] {all done} [09:25:53] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10cmooney) [09:34:29] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10cmooney) @Dwisehaupt I have to apologize for a newbie error I made here. The FR-Tech server's interfaces don't show what they're connected to in Netb... [09:46:10] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10Volans) [09:46:22] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10cmooney) [09:50:49] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10Volans) [10:01:21] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (10Volans) [10:16:36] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10cmooney) [10:17:04] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10cmooney) [10:17:26] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (10cmooney) [10:19:09] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, 10cloud-services-team (Kanban): Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10cmooney) [10:44:53] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10cmooney) [11:03:21] 10netops, 10Infrastructure-Foundations, 10SRE, 10Patch-For-Review: Adjust egress buffer allocations on ToR switches - https://phabricator.wikimedia.org/T284592 (10cmooney) @jijiki thanks yes good suggestions both. I will send a mail to ops@ later today as a reminder for people to review. In terms of main... [11:17:08] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10MoritzMuehlenhoff) [11:19:10] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10MoritzMuehlenhoff) [11:20:42] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (10MoritzMuehlenhoff) [11:21:48] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, 10cloud-services-team (Kanban): Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10MoritzMuehlenhoff) [11:27:12] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10MoritzMuehlenhoff) [11:45:26] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, 10cloud-services-team (Kanban): Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10Legoktm) [11:48:44] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, 10cloud-services-team (Kanban): Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10cmooney) [11:52:14] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10Kormat) [12:38:51] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10ema) [12:40:57] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10ema) [12:41:22] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10ema) [12:45:04] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (10ema) [12:46:18] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, 10cloud-services-team (Kanban): Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10ema) [12:49:03] moritzm: committing Seppuku on all hosts? :-P (context SAL) [12:51:34] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (10ema) [12:52:11] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (10ema) [12:59:27] Currently this only triggers on sretest* hosts, but I once test this for real I'll actually need a second Seppuku partner, as otherwise system-logind will also terminate the cookbook process :-) [12:59:59] it would actually be a nice test [13:00:27] to see if it kills the running process too that has clustershell running already with a remote command :D [13:01:22] yeah, I'll first run it with the root user over the mgmt, but will also test that scenario [13:01:51] +1 [13:03:37] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10ssingh) [13:06:16] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10ssingh) [13:12:12] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10Vgutierrez) [13:23:49] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10jbond) [13:24:45] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, 10cloud-services-team (Kanban): Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10Vgutierrez) [13:26:16] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (10jbond) [13:28:28] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10jbond) [13:29:44] thanks :)~. [13:29:51] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10jbond) [13:43:55] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10jbond) [14:19:50] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10Kormat) [16:04:54] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10Dwisehaupt) [16:06:45] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10Dwisehaupt) @cmooney Not a problem. I have updated the task to remove the pre/post tasks we were looking at. Always good for us to think about failure... [16:26:20] 10puppet-compiler, 10Infrastructure-Foundations, 10User-jbond: PCC dosen't show diff for concat::fragments - https://phabricator.wikimedia.org/T286255 (10jbond) [18:27:41] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10Bstorm) [18:35:28] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10Bstorm) cloudstore servers are both in the same rack. They are a cluster, and it will simply be offline. We will make a task to verify that it c... [18:42:08] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10Bstorm) [18:46:55] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10Bstorm) [18:48:07] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10Bstorm) Switching cloudmetrics to just eat the brief outage. I don't think it will be a big deal. We can just check it after. [18:49:36] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10Bstorm) The WMCS-owned dbproxy1018 and 1019 make up the entire cluster, so that's just an outage no matter what for wikireplicas. It will just need do... [18:57:41] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (10Bstorm) @Gehel I'm not sure who to tag in for cloudelastic here. If there are 2 of them in this row, that could be something that requires attention for redundancy.... [19:08:06] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE, and 2 others: Switch buffer re-partition - Eqiad Row A - https://phabricator.wikimedia.org/T286032 (10Bstorm) [19:08:32] 10netops, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row B - https://phabricator.wikimedia.org/T286061 (10Gehel) Ryan should be around tomorrow to double check, but cloudelastic should be resilient to a row failure. Worst case the service will be down for the duration of... [19:20:36] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, and 2 others: Switch buffer re-partition - Eqiad Row D - https://phabricator.wikimedia.org/T286069 (10Bstorm) [19:30:14] 10netops, 10Analytics, 10DBA, 10Infrastructure-Foundations, 10SRE: Switch buffer re-partition - Eqiad Row C - https://phabricator.wikimedia.org/T286065 (10Bstorm)