[00:07:13] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:07:13] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[04:41:59] (SystemdUnitFailed) firing: (2) update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:42:13] (SystemdUnitFailed) firing: (2) update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:06:24] netops, Infrastructure-Foundations, SRE, Epic: [tracking] Don't keep on the public vlans hosts that don't require it - https://phabricator.wikimedia.org/T317177#9644950 (Gehel)
[09:12:12] moritzm: quick question, does using `systemd::mask {}` remove the need for `service '' { ensure => stopped }`?
[09:12:32] or I need to stop + mask
[09:12:33] ?
[09:15:02] mask ensures that a service isn't automatically started. If the service is already running (through a package installation or other actions), it will remain running until the next reboot, so in that scenario you'll also need to stop it
[09:15:21] but if it's for provisioning a new thing, then mask should be enough
[09:19:48] CAS-SSO, Infrastructure-Foundations, Patch-For-Review: Migrate CAS to Bookworm - https://phabricator.wikimedia.org/T357748#9644974 (MoritzMuehlenhoff)
[09:20:21] moritzm: ok, makes sense, but provisioning a new thing means that it might be started by the package init script at install time, right?
[09:21:11] it depends. we also have some roles (e.g. the caches) which add the systemd::mask intentionally before the package gets installed. in this case you don't need to stop it
[09:21:50] but if it's e.g. installed as part of the early debian installer (IOW before Puppet), then service=>stopped is needed
[09:22:09] CAS-SSO, Infrastructure-Foundations, Patch-For-Review: Migrate CAS to Bookworm - https://phabricator.wikimedia.org/T357748#9644977 (SLyngshede-WMF)
[09:26:29] ah ok, I didn't know that was possible, sounds like the best path, thx :) CR updated
[09:39:52] fyi, I'm going to start a re-image of sretest2003
[09:39:55] CAS-SSO, Infrastructure-Foundations, Patch-For-Review: Migrate CAS to Bookworm - https://phabricator.wikimedia.org/T357748#9645031 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1002 for host idp-test1003.wikimedia.org with OS bullseye
[10:07:24] CAS-SSO, Infrastructure-Foundations, Patch-For-Review: Migrate CAS to Bookworm - https://phabricator.wikimedia.org/T357748#9645135 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1002 for host idp-test1003.wikimedia.org with OS bullseye completed: - idp-test1003 (...
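Note on the mask vs. stop exchange (09:12 to 09:26): the two cases described there could be sketched in Puppet roughly as follows. This is a minimal sketch, not the change from the CR mentioned in the log. The unit and package names (`example`, `legacy`) are hypothetical, and it assumes `systemd::mask` takes the unit name as its resource title.

```puppet
# Hypothetical unit and package names, for illustration only.

# Case 1: new provisioning. Mask the unit before the package is installed,
# so the package's maintainer scripts cannot start it. No explicit stop
# is needed in this case.
systemd::mask { 'example.service':
    before => Package['example'],
}

package { 'example':
    ensure => installed,
}

# Case 2: the unit may already be running, e.g. because it was installed
# by the Debian installer before Puppet ran. Masking alone would leave it
# running until the next reboot, so it is also stopped explicitly.
systemd::mask { 'legacy.service': }

service { 'legacy':
    ensure => stopped,
}
```

The `before` metaparameter is what enforces the ordering in the first case; without it Puppet would be free to install the package before the mask is in place.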
[10:41:59] (SystemdUnitFailed) firing: (2) update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:20:27] CAS-SSO, Infrastructure-Foundations: Migrate CAS to Bookworm - https://phabricator.wikimedia.org/T357748#9645472 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1002 for host idp-test2002.wikimedia.org with OS bookworm
[12:59:20] CAS-SSO, Infrastructure-Foundations: Migrate CAS to Bookworm - https://phabricator.wikimedia.org/T357748#9645541 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1002 for host idp-test2002.wikimedia.org with OS bookworm completed: - idp-test2002 (**PASS**) - Downtime...
[13:29:48] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on testvm2005:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[13:58:46] CAS-SSO, Infrastructure-Foundations, Patch-For-Review: Migrate CAS to Bookworm - https://phabricator.wikimedia.org/T357748#9645687 (MoritzMuehlenhoff)
[13:58:52] CAS-SSO, Infrastructure-Foundations: Build Tomcat 9 for Bookworm - https://phabricator.wikimedia.org/T359333#9645685 (MoritzMuehlenhoff) Open→Resolved Tomcat was backported and is running on the bookworm-based IDP test nodes.
[14:08:31] Puppet, Wikidata, Wikidata Dev Team, wmde-wikidata-tech, and 2 others: Remove the WDCM clone (stats1007) - https://phabricator.wikimedia.org/T351072#9645712 (AndrewTavis_WMDE)
[14:10:40] Puppet, Wikidata, Wikidata Dev Team, wmde-wikidata-tech, and 2 others: Remove the WDCM clone (stats1007) - https://phabricator.wikimedia.org/T351072#9645716 (karapayneWMDE) Notes for Wikidata Dev Team: task needs like 30 mins of sync between a DOT team member (likely @Lucas_Werkmeister_WMDE ) and...
[14:42:13] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[15:33:05] SRE-tools, Infrastructure-Foundations, Spicerack, cloud-services-team (FY2023/2024-Q3-Q4), Patch-For-Review: spicerack: tox fails to install PyYAML using python 3.11 on bookworm - https://phabricator.wikimedia.org/T345337#9646108 (bking) Unfortunately, we are plus the likelihood that there wi...
[17:29:49] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on testvm2005:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[18:31:43] netops, Infrastructure-Foundations, ops-codfw, SRE: Decom asw-a-codfw switch stack - https://phabricator.wikimedia.org/T358244#9646986 (Papaul)
[18:32:51] netops, Infrastructure-Foundations, ops-codfw, SRE: Decom asw-a-codfw switch stack - https://phabricator.wikimedia.org/T358244#9646991 (Papaul) Removed all old cables and unracked 4 switches out of 8
[18:42:14] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[18:50:28] Mail, Infrastructure-Foundations, SRE: Access to DMARCIAN - https://phabricator.wikimedia.org/T356920#9647061 (Jgreen)
[18:51:10] Mail, fundraising-tech-ops: DMarc Email Address for Wikimedia.org - https://phabricator.wikimedia.org/T316899#9647067 (Jgreen)
[18:51:21] Mail, Infrastructure-Foundations, SRE: Access to DMARCIAN - https://phabricator.wikimedia.org/T356920#9647066 (Jgreen)
[21:30:04] (PuppetConstantChange) firing: Puppet performing a change on every puppet run on testvm2005:9100 - https://puppetboard.wikimedia.org/nodes?status=changed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetConstantChange
[22:42:14] (SystemdUnitFailed) firing: generate_os_reports.service on puppetdb2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed