[00:55:49] (PuppetFailure) firing: Puppet has failed on mx-out2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[04:55:49] (PuppetFailure) firing: Puppet has failed on mx-out2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[06:51:55] CAS-SSO, Infrastructure-Foundations, Patch-For-Review: Migrate CAS to Bookworm - https://phabricator.wikimedia.org/T357748#9696249 (SLyngshede-WMF) Open→In progress a:SLyngshede-WMF
[08:55:49] (PuppetFailure) firing: Puppet has failed on mx-out2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[09:10:06] SRE-tools, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9696567 (MoritzMuehlenhoff)
[09:19:43] netops, Infrastructure-Foundations, ops-codfw, SRE: codfw: use old asw switches from row A and B as msw switches in row C and D - https://phabricator.wikimedia.org/T361871#9696600 (ayounsi) Thanks. What I don't understand is that if they go through ZTP or manual basic setup, they will by definiti...
[09:55:23] SRE-tools, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9696705 (MoritzMuehlenhoff)
[11:13:26] FYI, upgrading postgres in netboxdb hosts in a few
[11:15:17] ack
[11:21:07] all done
[11:21:25] (SystemdUnitFailed) firing: netbox_report_cables_run.service on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:21:30] perfec
[11:31:25] (SystemdUnitFailed) resolved: netbox_report_cables_run.service on netbox1002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:55:49] (PuppetFailure) firing: Puppet has failed on mx-out2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[13:17:06] netops, Infrastructure-Foundations, ops-codfw, SRE: codfw: use old asw switches from row A and B as msw switches in row C and D - https://phabricator.wikimedia.org/T361871#9697413 (Papaul) @ayounsi yes you are right since it will have an IP address it will be managed so I was thinking over it. Di...
[14:21:25] (SystemdUnitFailed) firing: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[14:28:11] netbox, netops, Infrastructure-Foundations: Automatically run Capirca Netbox script regularly - https://phabricator.wikimedia.org/T361549#9697755 (ayounsi) p:Triage→Medium
[14:28:17] netbox, Infrastructure-Foundations: Netbox: capirca.getHosts script runs into timeout - https://phabricator.wikimedia.org/T358339#9697757 (ayounsi)
[14:28:19] netbox, netops, Infrastructure-Foundations: Automatically run Capirca Netbox script regularly - https://phabricator.wikimedia.org/T361549#9697756 (ayounsi)
[14:32:12] SRE-tools, Cloud-VPS, Infrastructure-Foundations, Spicerack: spicerack.puppet.PuppetHostsError: Unable to find CSR fingerprints for all hosts, detected errors are: Another puppet instance is already running and the waitforlock setting is set to 0; e... - https://phabricator.wikimedia.org/T361218#9697786
[14:36:47] SRE-tools, Infrastructure-Foundations, Spicerack, cloud-services-team (FY2023/2024-Q3-Q4), Patch-For-Review: Remove elasticsearch-curator dependency from Spicerack/Elastic cookbooks - https://phabricator.wikimedia.org/T361647#9697825 (Volans) p:Triage→Medium a:Volans
[14:36:55] netbox, ChangeProp, collaboration-services, GitLab, and 10 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#9697832 (joanna_borun) p:Triage→Medium
[14:39:37] SRE-tools, Discovery-Search, SRE: Create cookbook to reindex into elasticsearch / cirrus - https://phabricator.wikimedia.org/T219507#9697839 (joanna_borun)
[14:48:37] netbox, SRE-tools, Infrastructure-Foundations: netbox dumps: fix permissions and timestamp - https://phabricator.wikimedia.org/T260077#9697877 (Volans) Open→Resolved a:Volans Since the last update we've removed the Netbox CSV dumps all-together. Resolving
[14:51:08] CAS-SSO, Infrastructure-Foundations, SRE: IDP failover improvments - https://phabricator.wikimedia.org/T268217#9697885 (MoritzMuehlenhoff) p:Medium→Low
[14:51:25] netops, fundraising-tech-ops, Infrastructure-Foundations, SRE: Manage frack switches with Netbox - https://phabricator.wikimedia.org/T268802#9697900 (joanna_borun) p:Medium→Low
[14:53:41] SRE-tools, DC-Ops, Infrastructure-Foundations: Create a PDU spicerack module - https://phabricator.wikimedia.org/T263018#9697915 (joanna_borun) p:Medium→Low
[14:55:22] SRE-tools, Infrastructure-Foundations: Netbox accounting report: make it more reliable - https://phabricator.wikimedia.org/T260325#9697927 (joanna_borun) p:Medium→Low
[14:55:28] SRE-tools, Infrastructure-Foundations, SRE: Generate ssh_known_hosts for network devices - https://phabricator.wikimedia.org/T252747#9697928 (ayounsi)
[14:55:33] netbox, Infrastructure-Foundations, Patch-For-Review: Upgrade Netbox to 4.x - https://phabricator.wikimedia.org/T336275#9697929 (ayounsi)
[14:56:21] SRE-tools, Ganeti, Infrastructure-Foundations, SRE: Cookbook to failover the Ganeti master - https://phabricator.wikimedia.org/T283320#9697930 (MoritzMuehlenhoff) p:Medium→Low
[14:58:15] SRE-tools, Infrastructure-Foundations: Debmonitor: backend-changeable settings are stored in the browser's session storage - https://phabricator.wikimedia.org/T240457#9697956 (joanna_borun) p:Medium→Low
[15:04:45] SRE-tools, Infrastructure-Foundations, serviceops: Switchdc RO/RW: add check to test it editing a real wiki - https://phabricator.wikimedia.org/T163365#9697994 (joanna_borun)
[15:08:14] SRE-tools, Infrastructure-Foundations, SRE: wmf-auto-reimage: 'execution expired' on first puppet run - https://phabricator.wikimedia.org/T201317#9698001 (Volans) Open→Declined Too long has passed since then and doesn't seem to happen anymore.
[15:10:19] Mail, DNS, Infrastructure-Foundations, SRE: Set up role accounts and feedback loops (FBL) with all providers - https://phabricator.wikimedia.org/T106664#9698010 (joanna_borun) Open→Invalid
[15:10:29] Mail, Infrastructure-Foundations, SRE: Get mail relay out of Yahoo! blacklist: apply to Yahoo for whitelisting bulk mail - https://phabricator.wikimedia.org/T58414#9698012 (joanna_borun)
[15:10:49] (PuppetFailure) resolved: Puppet has failed on mx-out2001:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[15:23:28] Mail, Infrastructure-Foundations, SRE: Do not apply spam headers on email assessed NOT to be spam - https://phabricator.wikimedia.org/T111595#9698089 (jhathaway) Open→Declined @bcampbell setting this to declined, please reopen, if this is still a concern
[16:04:36] netbox, ChangeProp, collaboration-services, GitLab, and 10 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#9698336 (CodeReviewBot) bd808 merged https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/28 Re...
[18:21:25] (SystemdUnitFailed) firing: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[20:21:25] (SystemdUnitFailed) resolved: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed