[00:07:18] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[02:35:09] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on testvm2006:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[04:07:18] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[06:35:09] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on testvm2006:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[08:07:18] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:25:48] Puppet, ORES, git-lfs: Require git-lfs in ORES hosts - https://phabricator.wikimedia.org/T232494#9687224 (hashar)
[08:28:53] Packaging, Infrastructure-Foundations, Scap, SRE, and 2 others: Install git-lfs client (at least on scap targets & masters) - https://phabricator.wikimedia.org/T180628#9687246 (hashar)
[09:14:51] SRE-tools, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9687499 (MoritzMuehlenhoff)
[09:54:25] SRE-tools, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9687715 (MoritzMuehlenhoff)
[10:13:36] SRE-tools, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9687751 (MoritzMuehlenhoff)
[10:38:33] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on testvm2006:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[11:52:48] (PuppetZeroResources) firing: Puppet has failed generate resources on idp2003:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[12:02:48] (PuppetZeroResources) firing: (2) Puppet has failed generate resources on idp2002:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[12:09:10] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:12:48] (PuppetZeroResources) firing: (4) Puppet has failed generate resources on idp1002:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[12:22:48] (PuppetZeroResources) resolved: (3) Puppet has failed generate resources on idp1002:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetZeroResources
[13:00:24] netops, Infrastructure-Foundations: eqiad-drmrs transport down (April 2024) - https://phabricator.wikimedia.org/T361825 (ayounsi) NEW
[13:00:32] netops, Infrastructure-Foundations: eqiad-drmrs transport down (April 2024) - https://phabricator.wikimedia.org/T361825#9688138 (ops-monitoring-bot) ===== Automated diagnostic for Netbox circuit ID 108 --- **Interface cr1-drmrs:xe-0/1/2** - admin-status: up - ⚠️ oper-status: down - interface-flapped:...
[13:01:08] netops, Infrastructure-Foundations: eqiad-drmrs transport down (April 2024) - https://phabricator.wikimedia.org/T361825#9688165 (ayounsi) Emailed Telxius NOC.
[14:39:54] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on testvm2006:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[14:49:27] netops, DC-Ops, Infrastructure-Foundations: Take advantage of 10Gb NICs in the new network stack - https://phabricator.wikimedia.org/T360297#9688907 (cmooney) @ayounsi thanks for the patch! LGTM. Unfortunately I think the approach might not suit in a lot of cases, due to the Trident 3 port-block re...
[15:45:22] netops, Infrastructure-Foundations: eqiad-drmrs transport down (April 2024) - https://phabricator.wikimedia.org/T361825#9689177 (ops-monitoring-bot) ===== Automated diagnostic for Netbox circuit ID 108 --- **Interface cr1-drmrs:xe-0/1/2** - admin-status: up - ⚠️ oper-status: down - interface-flapped:...
[16:12:03] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:52:54] SRE-tools, Infrastructure-Foundations, Spicerack, cloud-services-team (FY2023/2024-Q3-Q4), Data-Platform-SRE (2024.03.25 - 2024.04.14): create and deploy new Elastic Curator deb package - https://phabricator.wikimedia.org/T361105#9689591 (bking) Resolved→Declined
[18:19:46] Mail, Infrastructure-Foundations, MediaWiki-Email, SRE: Old "Email this user" email is repeatedly resent - https://phabricator.wikimedia.org/T361860#9690028 (RLazarus) p:Triage→High Clinic duty SRE here -- I/F, can you start investigating this at the MTA end? Triaging this to High in case...
[18:39:54] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on testvm2006:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[19:24:08] netops, Infrastructure-Foundations, ops-codfw: codfw: use old asw switches from row A and B as msw switches in row C and D - https://phabricator.wikimedia.org/T361871 (Papaul) NEW
[20:12:03] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[22:39:54] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on testvm2006:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange