[00:58:30] (SystemdUnitFailed) firing: upload_puppet_facts.service Failed on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:58:30] (SystemdUnitFailed) firing: upload_puppet_facts.service Failed on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:14:04] SRE-tools, Infrastructure-Foundations, Puppet-Core, SRE, and 4 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619 (MoritzMuehlenhoff)
[08:56:54] (SystemdUnitFailed) resolved: upload_puppet_facts.service Failed on puppetserver1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[10:31:16] FYI there are local changes in netbox-dev2002 for the customscripts in netbox-extras. Is something WIP or leftovers?
cc XioNoX, topranks
[10:31:50] volans: probably testing for https://gerrit.wikimedia.org/r/983268
[10:32:06] volans: yeah it was me testing for that patch
[10:32:14] ack no prob then :)
[10:32:23] we can remove it if we want, I’d left it there and asked dc ops to take a look at it
[10:32:37] no no if it's WIP that's fine
[10:32:53] ok thanks
[10:33:59] (PuppetZeroResources) firing: Puppet has failed generate resources on puppetmaster2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[10:53:59] (PuppetZeroResources) firing: (2) Puppet has failed generate resources on puppetmaster1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[11:04:00] (PuppetZeroResources) firing: (2) Puppet has failed generate resources on puppetmaster1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[11:15:36] The PuppetZeroResources check should clear, but those are 80-minute checks, so it might take a while. The data looks correct though
[11:17:52] can't they be forced to clear out?
[11:18:59] (PuppetZeroResources) resolved: (2) Puppet has failed generate resources on puppetmaster1001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[11:19:20] this resolved :D
[11:20:09] netbox, Infrastructure-Foundations, IPv6, User-jbond: Some clusters do not have DNS for IPv6 addresses (TRACKING TASK) - https://phabricator.wikimedia.org/T253173 (MoritzMuehlenhoff) >>! In T253173#9375719, @Volans wrote: > @MoritzMuehlenhoff I see that `ganeti[2009-2024]` and `ganeti[1009-1022]`...
[11:21:26] You truly are a magical SRE, aren't you? Ever present, all knowing and with the ability to summon a "Resolved" :-)
[11:24:39] ahahaha
[11:25:07] and I did nothing at all, just summoned the resolution :D
[11:47:45] netbox, Infrastructure-Foundations, IPv6, User-jbond: Some clusters do not have DNS for IPv6 addresses (TRACKING TASK) - https://phabricator.wikimedia.org/T253173 (ayounsi) From `sudo cumin ganeti[1009-1022].eqiad.wmnet 'ip -6 addr | grep "scope global" | grep -v dynamic'` (and the same in codfw)....
[16:00:16] SRE-tools, Dumps-Generation, Infrastructure-Foundations, serviceops, IPv6: Some Service Operations clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271142 (akosiaris) >>! In T271142#9382040, @Volans wrote: > Another datapoint for the mw*/parse* clusters, they will...
[16:48:17] SRE-tools, Dumps-Generation, Infrastructure-Foundations, serviceops, IPv6: Some Service Operations clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271142 (akosiaris) Regarding the mc* hosts, I've been mulling over this one for some time now trying to figure out t...
[16:50:27] SRE-tools, Infrastructure-Foundations: Decommission cookbook: lock per switch - https://phabricator.wikimedia.org/T353513 (Volans) If the delicate part is the call to `configure_switch_interfaces()` we can just change its signature to require a `lock` [[ https://doc.wikimedia.org/spicerack/master/api/in...