[00:29:43] SRE-tools, Spicerack: Support cookbooks resume after user interruption - https://phabricator.wikimedia.org/T345402 (Fabfur)
[03:33:57] (SystemdUnitFailed) firing: sync-puppet-volatile.service Failed on puppetmaster2001:9100- https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[03:48:57] (SystemdUnitFailed) resolved: sync-puppet-volatile.service Failed on puppetmaster2001:9100- https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:58:58] Puppet, netbox, Infrastructure-Foundations, SRE, and 3 others: Netbox: use the netbox to also sync networks and network devices - https://phabricator.wikimedia.org/T329272 (cmooney) >>! In T329272#9129584, @ayounsi wrote: > I'm curious to know what @cmooney thinks about removing parent/child for...
[11:09:21] SRE-tools, netops, Infrastructure-Foundations, SRE, Patch-For-Review: Setup zero touch provisioning (ZTP) for network devices - https://phabricator.wikimedia.org/T336485 (cmooney) Just an update. The cookbook is now working to both add the initial configuration and upgrade/downgrade the devi...
[13:22:24] Puppet, netbox, Infrastructure-Foundations, SRE, and 3 others: Netbox: use the netbox to also sync networks and network devices - https://phabricator.wikimedia.org/T329272 (fgiunchedi) >>! In T329272#9129584, @ayounsi wrote: > I'm also curious to know @fgiunchedi if/how alertmanager handles it....
[13:28:14] SRE-tools, Infrastructure-Foundations, Spicerack: Spicerack: don't IRC log start/stop of cookbook - https://phabricator.wikimedia.org/T324655 (fnegri) We have quite a few cookbooks in WMCS that are read-only and used to show a cluster status or similar things. Logging to SAL every time someone runs o...
[16:28:57] (SystemdUnitFailed) firing: apache2.service Failed on config-master2001:9100- https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:34:14] Does anyone here keep up with config-master hosts other than jbond? 2001 was a bit broken and I just broke it some more https://phabricator.wikimedia.org/T345452
[16:35:29] * jbond looking
[16:43:57] (SystemdUnitFailed) resolved: apache2.service Failed on config-master2001:9100- https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:46:18] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney) @papaul thanks for the work documenting the cable IDs. I've put the ones from above in Netbox now. There is one discrepancy, the same label is listed for two...
[21:54:57] netops, Infrastructure-Foundations, SRE, ops-codfw: Upgrade new codfw switches to Juniper recommended - https://phabricator.wikimedia.org/T341670 (cmooney) Open→Resolved a:cmooney All are now upgraded to JUNOS 22.2R3.15. I used the opportunity to test the ZTP cookbook which is workin...
[21:55:05] netops, Infrastructure-Foundations, SRE: TLS certificates for network devices - https://phabricator.wikimedia.org/T334594 (cmooney)
[21:55:13] netops, Infrastructure-Foundations, SRE: Plan codfw row A/B top-of-rack switch refresh - https://phabricator.wikimedia.org/T327938 (cmooney)