[01:50:09] (PuppetConstantChange) firing: (2) Puppet performing a change on every puppet run on testvm2005:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange [02:32:17] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [03:22:10] 10netops, 06Infrastructure-Foundations, 10ops-codfw, 06SRE, 13Patch-For-Review: 14Decom asw-b-codfw switch stack - 14https://phabricator.wikimedia.org/T360776#9678272 (10Papaul) [03:24:00] 10netops, 06Infrastructure-Foundations, 10ops-codfw, 06SRE, 13Patch-For-Review: 14Decom asw-b-codfw switch stack - 14https://phabricator.wikimedia.org/T360776#9678273 (10Papaul) 05Open→03Resolved a:03Papaul [05:50:09] (PuppetConstantChange) firing: (2) Puppet performing a change on every puppet run on testvm2005:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange [06:32:17] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [07:02:43] 10SRE-tools, 06collaboration-services, 06Infrastructure-Foundations, 10Puppet-Core, and 4 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9678629 (10MoritzMuehlenhoff) [07:40:45] 10netbox, 10netops, 06Infrastructure-Foundations: Automatically run Capirca Netbox script regularly - https://phabricator.wikimedia.org/T361549 (10taavi) 03NEW [07:47:42] 10netbox, 10netops, 06Infrastructure-Foundations: Automatically run Capirca Netbox script regularly - https://phabricator.wikimedia.org/T361549#9678780 (10ayounsi) Thanks for the task. I was thinking of either a timer or using Netbox's hooks to only run it when relevant changes are done. This however always... [07:51:22] 10SRE-tools, 10Cloud-VPS, 06Infrastructure-Foundations, 10Spicerack: spicerack.puppet.PuppetHostsError: Unable to find CSR fingerprints for all hosts, detected errors are: Another puppet instance is already running and the waitforlock setting is set to 0; e... - https://phabricator.wikimedia.org/T361218#9678784 [08:10:53] 10SRE-tools, 06Infrastructure-Foundations, 10Spicerack, 13Patch-For-Review: gNMI module in Spicerack - https://phabricator.wikimedia.org/T344325#9678820 (10ayounsi) For the record I looked deeper at gNMI to configure Juniper devices. Some of the findings: current code fails with a reply from the switch ab... [08:15:06] 10SRE-tools, 06collaboration-services, 06Infrastructure-Foundations, 10Puppet-Core, and 4 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9678829 (10MoritzMuehlenhoff) [09:32:47] should we worry that some mr1 are timing out homer check even at 120s timeout? [09:39:46] not really a worry [09:39:55] there are a few outstanding diffs though [09:41:36] we alreaady increased that in the past [09:41:59] so was worried what might cause the timeout [09:54:53] (PuppetConstantChange) firing: (2) Puppet performing a change on every puppet run on testvm2005:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange [10:32:17] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [12:44:24] 10SRE-tools, 06collaboration-services, 06Infrastructure-Foundations, 10Puppet-Core, and 4 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9679513 (10MoritzMuehlenhoff) [13:01:05] 10homer, 10SRE-tools, 06Infrastructure-Foundations: Homer: add parallelization support - https://phabricator.wikimedia.org/T250415#9679992 (10ayounsi) Ping? :) [13:07:18] 10homer, 10SRE-tools, 06Infrastructure-Foundations: Homer: add parallelization support - https://phabricator.wikimedia.org/T250415#9680010 (10ayounsi) p:05Low→03High [13:55:09] (PuppetConstantChange) firing: (2) Puppet performing a change on every puppet run on testvm2005:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange [14:32:17] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [14:37:42] 10SRE-tools, 06Infrastructure-Foundations, 10Spicerack, 10cloud-services-team (FY2023/2024-Q3-Q4), 10Data-Platform-SRE (2024.03.25 - 2024.04.14): create and deploy new Elastic Curator deb package - https://phabricator.wikimedia.org/T361105#9680514 (10bking) a:05RKemper→03bking [15:44:57] 10netops, 06cloud-services-team, 06Infrastructure-Foundations, 06SRE, 13Patch-For-Review: Move WMCS servers to 1 single NIC - https://phabricator.wikimedia.org/T319184#9681003 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by aborrero@cumin1002 for host cloudvirt1036.eqiad.wmnet... [15:52:23] 10netops, 06cloud-services-team, 06Infrastructure-Foundations, 06SRE, 13Patch-For-Review: Move WMCS servers to 1 single NIC - https://phabricator.wikimedia.org/T319184#9681068 (10aborrero) [16:22:27] 10netops, 06cloud-services-team, 06Infrastructure-Foundations, 06SRE, 13Patch-For-Review: Move WMCS servers to 1 single NIC - https://phabricator.wikimedia.org/T319184#9681261 (10ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by aborrero@cumin1002 for host cloudvirt1036.eqiad.wmnet with... [17:55:09] (PuppetConstantChange) firing: (2) Puppet performing a change on every puppet run on testvm2005:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange [18:32:17] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [20:11:48] 10SRE-tools, 06Infrastructure-Foundations, 10Spicerack, 10cloud-services-team (FY2023/2024-Q3-Q4), 10Data-Platform-SRE (2024.03.25 - 2024.04.14): 14create and deploy new Elastic Curator deb package - 14https://phabricator.wikimedia.org/T361105#9682101 (10bking) 05Open→03Resolved 14While working... [20:12:35] 10SRE-tools, 06Infrastructure-Foundations, 10Spicerack, 10cloud-services-team (FY2023/2024-Q3-Q4), 13Patch-For-Review: spicerack: tox fails to install PyYAML using python 3.11 on bookworm - https://phabricator.wikimedia.org/T345337#9682106 (10bking) Per subtask, we no longer need to cut a custom package... [20:17:56] 10SRE-tools, 06Infrastructure-Foundations, 10Spicerack, 10cloud-services-team (FY2023/2024-Q3-Q4), 13Patch-For-Review: spicerack: tox fails to install PyYAML using python 3.11 on bookworm - https://phabricator.wikimedia.org/T345337#9682112 (10Volans) @bking I'm not sure what do you mean. As mentioned ear... [20:56:13] 10SRE-tools, 06Infrastructure-Foundations, 10Spicerack, 13Patch-For-Review, 10Puppet (Puppet 7.0): Spicerack puppetserver.destroy() raises an exception when certificate does not exist - https://phabricator.wikimedia.org/T360293#9682181 (10Volans) I've sent a proposal implementation in the patch above [20:56:43] 10SRE-tools, 06Infrastructure-Foundations, 10Spicerack, 13Patch-For-Review, 10Puppet (Puppet 7.0): Spicerack puppetserver.destroy() raises an exception when certificate does not exist - https://phabricator.wikimedia.org/T360293#9682184 (10Volans) a:03Volans [21:17:14] 10SRE-tools, 06Infrastructure-Foundations, 10Spicerack, 10cloud-services-team (FY2023/2024-Q3-Q4): Remove elasticsearch-curator dependency from Elastic cookbooks - https://phabricator.wikimedia.org/T361647 (10bking) 03NEW [21:19:20] 10SRE-tools, 06Infrastructure-Foundations, 10Spicerack, 10cloud-services-team (FY2023/2024-Q3-Q4): Remove elasticsearch-curator dependency from Spicerack/Elastic cookbooks - https://phabricator.wikimedia.org/T361647#9682261 (10bking) a:05RKemper→03None [21:19:57] 10SRE-tools, 06Infrastructure-Foundations, 10Spicerack, 10cloud-services-team (FY2023/2024-Q3-Q4), 13Patch-For-Review: spicerack: tox fails to install PyYAML using python 3.11 on bookworm - https://phabricator.wikimedia.org/T345337#9682264 (10bking) Oops, thank you for pointing that out. I'm discussing... [21:55:09] (PuppetConstantChange) firing: (2) Puppet performing a change on every puppet run on testvm2005:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange [22:32:18] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed