[02:13:21] (SystemdUnitFailed) firing: production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:13:21] (SystemdUnitFailed) firing: production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[07:43:53] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[08:00:48] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 3 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[10:00:42] could I get a quick sanity check for https://gerrit.wikimedia.org/r/c/operations/puppet/+/975209/? (new stub role for fr-tech folks to set up community fundraising CRM)
[10:03:57] moritzm: +1
[10:05:31] cheers
[10:13:21] (SystemdUnitFailed) firing: production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:24:58] Packaging, Infrastructure-Foundations, cloud-services-team (FY2023/2024-Q1-Q2): wmfbackups packages for Debian Bookworm - https://phabricator.wikimedia.org/T347740 (jcrespo) Please know an important update of wmfbackups package for compatibility with Puppet 7 will be pushed soon (wmfbackups 0.8.3 - a...
[11:42:18] netops, Infrastructure-Foundations, SRE: Support Anycast GW on EVPN switches without unique IP - https://phabricator.wikimedia.org/T350579 (cmooney) Open→Resolved Patches to support this have been merged and it's working for the codfw row A/B public vlans, closing task.
[11:48:24] netops, Infrastructure-Foundations, SRE, ops-codfw: Bring codfw row A-B EVPN switches live and make them gateway for existing Vlans - https://phabricator.wikimedia.org/T347191 (cmooney)
[11:49:41] netops, Infrastructure-Foundations, SRE, ops-codfw: Bring codfw row A-B EVPN switches live and make them gateway for existing Vlans - https://phabricator.wikimedia.org/T347191 (cmooney)
[11:53:21] netops, Infrastructure-Foundations, SRE, ops-codfw: Bring codfw row A-B EVPN switches live and make them gateway for existing Vlans - https://phabricator.wikimedia.org/T347191 (cmooney)
[12:10:10] netops, Infrastructure-Foundations, SRE, ops-codfw: Bring codfw row A-B EVPN switches live and make them gateway for existing Vlans - https://phabricator.wikimedia.org/T347191 (cmooney) The public1-a-codfw and public1-b-codfw gateways have been migrated to the new setup. **Problems** Unfortu...
[12:45:59] (PuppetFailure) firing: Puppet has failed on cumin1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[12:50:59] (PuppetFailure) firing: (2) Puppet has failed on cumin1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[13:13:30] (SystemdUnitFailed) firing: (4) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:35:59] (PuppetFailure) firing: (2) Puppet has failed on cumin1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[14:00:59] (PuppetFailure) resolved: Puppet has failed on cumin2002:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[14:39:49] CAS-SSO, Infrastructure-Foundations, Patch-For-Review, Release-Engineering-Team (Quid Pro Crow 🦃): Correct IDP login page Privacy Policy - https://phabricator.wikimedia.org/T350129 (Aklapper) ping - could this get a review please?
[14:48:09] netops, Infrastructure-Foundations, SRE: Migrate IP gateway for public1-a-codfw to spine switches - https://phabricator.wikimedia.org/T351532 (cmooney) p:Triage→Medium
[15:18:09] netops, Infrastructure-Foundations, SRE: Migrate IP gateway for public1-a-codfw to spine switches - https://phabricator.wikimedia.org/T351532 (cmooney)
[15:20:56] netops, Infrastructure-Foundations, SRE: FPC1 Failure on cr1-esams - https://phabricator.wikimedia.org/T351304 (ayounsi) Open→Resolved Replaced.
[16:29:46] Packaging, Infrastructure-Foundations, Phabricator, collaboration-services, Patch-For-Review: build python-phabricator package for bullseye (and bookworm?) - https://phabricator.wikimedia.org/T351333 (Dzahn) @Volans Thank you. For some reason I never expected this phabricator specific class w...
[17:13:21] (SystemdUnitFailed) firing: (4) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:16:46] netops, Infrastructure-Foundations, SRE, ops-eqiad: Switch failure: asw2-a8-eqiad Aug 13th 2021 - https://phabricator.wikimedia.org/T288834 (cmooney) I believe that the bug that caused this has been fixed in 21.4R3-S5 for EX4300 devices.
[21:13:21] (SystemdUnitFailed) firing: (4) production-images-weekly-rebuild.service Failed on build2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed