[05:58:58] (SystemdUnitFailed) firing: httpbb_hourly_appserver.service Failed on cumin2002:9100- https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:58:58] (SystemdUnitFailed) resolved: httpbb_hourly_appserver.service Failed on cumin2002:9100- https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:05:43] Puppet, SRE: run-puppet-agent --quiet fails - https://phabricator.wikimedia.org/T345548 (Volans) p:Triage→High
[10:13:28] Puppet, SRE, Patch-For-Review: run-puppet-agent --quiet fails - https://phabricator.wikimedia.org/T345548 (Volans)
[10:14:08] if anyone has a moment to unblock a bunch of workflows relying on run-puppet-agent --quiet: https://gerrit.wikimedia.org/r/c/operations/puppet/+/954609
[10:15:06] I'll have a look in ~5m
[10:15:30] thx!
[10:18:41] moritzm: john got there first, no need, thx again
[10:20:05] too late, just +1d :-)
[10:20:22] :D
[10:54:02] I'm trying to build a Debian package, with documentation, but I can't seem to figure out why dpkg-buildpackage complains about not being able to locate README.Debian, which is in the debian dir, but not in debian/tmp where dh_installdocs seems to think it should be (I think that's a fallback)
[10:57:04] Nevermind, the build system is slightly weird. dh_make will put a template README.Debian in debian, but dh_installdocs expects it to be one directory up
[11:14:13] SRE-tools, netops, Infrastructure-Foundations, SRE, Patch-For-Review: Setup zero touch provisioning (ZTP) for network devices - https://phabricator.wikimedia.org/T336485 (cmooney) I've put a very brief summary of using the cookbook on Wikitech here: https://wikitech.wikimedia.org/wiki/ZTP_Ne...
[11:15:57] SRE-tools, Infrastructure-Foundations, SRE, Patch-For-Review, Puppet (Puppet 7.0): Cumin: update config to use new puppet7 infrastructure - https://phabricator.wikimedia.org/T341497 (jbond) Open→Resolved a:jbond this has been completed
[11:16:47] SRE-tools, netbox, Infrastructure-Foundations, Puppet-Infrastructure, and 3 others: update systems to use new puppetdb instance - https://phabricator.wikimedia.org/T342214 (jbond)
[11:17:28] SRE-tools, netbox, Infrastructure-Foundations, Puppet-Infrastructure, and 3 others: update systems to use new puppetdb instance - https://phabricator.wikimedia.org/T342214 (jbond) In progress→Resolved a:jbond This is now in place
[11:36:55] Puppet, SRE: run-puppet-agent --quiet fails - https://phabricator.wikimedia.org/T345548 (Volans) Open→Resolved Change has been merged and by now deployed everywhere. Resolving.
[11:41:28] I'm about to reboot the netbox hosts, or is it currently a bad time for anyone?
[11:53:13] I take that as a "good to do" :-) starting in a few
[12:11:58] moritzm: I take it that's ongoing
[12:12:09] ping me when done if you can
[12:17:38] topranks: you can go ahead, the last reboot (1002) is mostly complete, the cookbook is only stalling since the various failing Netbox coherence reports prevent a full service recovery
[12:19:38] moritzm: thanks yep - working fine for me now :)
[12:21:37] cool! I still need to reboot the Netbox databases, but that's for tomorrow
[15:04:17] netops, Infrastructure-Foundations, SRE: TLS certificates for network devices - https://phabricator.wikimedia.org/T334594 (ayounsi) Open→Resolved a:ayounsi This is now working in prod.
[15:04:25] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Add Dell switches support to Homer/Cookbooks - https://phabricator.wikimedia.org/T320638 (ayounsi)
[16:38:19] netops, Infrastructure-Foundations, SRE, Patch-For-Review: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)