[09:02:48] volans: https://gerrit.wikimedia.org/r/c/operations/puppet/+/723064/ should fix it, I'll merge and then force puppet runs on install*, then you can re-try the reimage of sretest1002
[09:03:25] sorry, in a meeting
[09:09:16] no hurry, just an FYI :-)
[10:08:15] confirmed with a reinstall of mx2002 that bullseye installs are working fine again
[10:24:33] yay, I'll reinstall sretest1002 then
[11:55:51] netops, Infrastructure-Foundations, SRE: Adjust egress buffer allocations on ToR switches - https://phabricator.wikimedia.org/T284592 (cmooney) Ok well we're about a week after DC switchover back to eqiad so we can make some conclusions on the results of the changes in eqiad. Overall there definitel...
[12:01:16] netops, Data-Persistence-Backup, Infrastructure-Foundations, SRE, bacula: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (cmooney) @jcrespo thanks for the above comments. In terms of...
[12:02:49] netops, Infrastructure-Foundations: Packet Drops on Eqiad ASW -> CR uplinks - https://phabricator.wikimedia.org/T291627 (cmooney)
[12:03:20] netops, Infrastructure-Foundations, SRE: Adjust egress buffer allocations on ToR switches - https://phabricator.wikimedia.org/T284592 (cmooney)
[12:03:28] netops, Infrastructure-Foundations: Packet Drops on Eqiad ASW -> CR uplinks - https://phabricator.wikimedia.org/T291627 (cmooney)
[12:03:36] netops, Infrastructure-Foundations: Packet Drops on Eqiad ASW -> CR uplinks - https://phabricator.wikimedia.org/T291627 (cmooney) p:Triage→High
[12:14:01] netops, Infrastructure-Foundations: Packet Drops on Eqiad ASW -> CR uplinks - https://phabricator.wikimedia.org/T291627 (cmooney) In terms of further mitigation one thing we could possibly do in the short-term is to change how we configure our VRRP states. Currently we configure VRRP primary/backup stat...
[12:18:27] netops, Infrastructure-Foundations: Packet Drops on Eqiad ASW -> CR uplinks - https://phabricator.wikimedia.org/T291627 (cmooney) Another change that could help here would be to move the L3 gateway for hosts to the virtual-chassis. i.e.: - Set up new, routed sub-interfaces between the ASWs and CRs. - A...
[13:37:53] netops, Data-Persistence-Backup, Infrastructure-Foundations, SRE, bacula: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (jcrespo) @cmooney Please feel free to resolve this ticket and...
[14:37:09] Mail, Infrastructure-Foundations, SRE, Patch-For-Review: Upgrade MXes to Bullseye - https://phabricator.wikimedia.org/T286911 (MoritzMuehlenhoff) Both mx1001 and mx2001 are now running Bullseye. There's a little cleanup/followup work, but the core of the work is completed.
[16:34:47] netops, Infrastructure-Foundations, SRE: Create an alert for output discards on network devices - https://phabricator.wikimedia.org/T284593 (cmooney)
[16:34:55] netops, Infrastructure-Foundations, SRE: Packet Drops on Eqiad ASW -> CR uplinks - https://phabricator.wikimedia.org/T291627 (cmooney)
[16:39:07] netops, Data-Persistence-Backup, Infrastructure-Foundations, SRE, bacula: Understand (and mitigate) the backup speed differences between backup1002->backup2002 and backup2002->backup1002 - https://phabricator.wikimedia.org/T274234 (cmooney) Open→Resolved @jcrespo thanks. As you say i...
[17:06:41] SRE-tools, Infrastructure-Foundations: Cookbooks: convert wmf-auto-reimage scripts to Cookbooks - https://phabricator.wikimedia.org/T205885 (ops-monitoring-bot) Cookbook cookbooks.sre.experimental.reimage was started by volans@cumin2002 for host sretest1001.eqiad.wmnet
[17:31:39] SRE-tools, Infrastructure-Foundations: Cookbooks: convert wmf-auto-reimage scripts to Cookbooks - https://phabricator.wikimedia.org/T205885 (ops-monitoring-bot) Cookbook cookbooks.sre.experimental.reimage completed: - sretest1001 (**PASS**) - Downtimed on Icinga - Disabled Puppet - Removed from Pup...