[06:40:34] CAS-SSO, Infrastructure-Foundations, SRE: Update CAS to 7.0 - https://phabricator.wikimedia.org/T367487 (MoritzMuehlenhoff) NEW
[08:15:47] FIRING: SystemdUnitFailed: wmf_auto_restart_exim4.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:29:15] CAS-SSO, Infrastructure-Foundations, SRE: Update CAS to 7.0 - https://phabricator.wikimedia.org/T367487#9891993 (SLyngshede-WMF) I've run a test build, Java 21 is a hard requirement, it cannot be older or newer. Otherwise the overlay upgrade contains only minor changes. I have not tested the function...
[12:14:33] CAS-SSO, Infrastructure-Foundations, SRE: Update CAS to 7.0 - https://phabricator.wikimedia.org/T367487#9892110 (MoritzMuehlenhoff) I'll look into a Java 21 backport for Bookworm.
[12:15:47] FIRING: SystemdUnitFailed: wmf_auto_restart_exim4.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[12:58:46] FIRING: [2x] SystemdUnitFailed: gitlab-package-puller.service on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:00:57] SRE-tools, DC-Ops, Infrastructure-Foundations, Spicerack, Patch-For-Review: Spicerack: expand Supermicro support in the Redfish module - https://phabricator.wikimedia.org/T365372#9892336 (elukey) Note for me - this is an example of snippet generated by the provision cookbook to instruct the D...
[13:06:31] SRE-tools, Infrastructure-Foundations, Observability-Alerting, Spicerack: sre.hosts.downtime, and any other maintenance processes, should use auto-extending silences - https://phabricator.wikimedia.org/T367466#9892352 (MatthewVernon) I had not heard of that tool, but it does sound useful!
[13:22:03] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: Get test host connected to codfw row c/d lsw's - https://phabricator.wikimedia.org/T367512 (cmooney) NEW p:Triage→Medium
[13:22:12] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: Codfw row C/D switch installation & configuration - https://phabricator.wikimedia.org/T364095#9892418 (cmooney)
[13:22:13] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: Get test host connected to codfw row c/d lsw's - https://phabricator.wikimedia.org/T367512#9892417 (cmooney)
[13:23:45] RESOLVED: SystemdUnitFailed: wmf_auto_restart_exim4.service on mx-in1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:49:25] folks I found this in sretest1001's /etc/network/interfaces:
[13:49:26] auto ipip0
[13:49:26] iface ipip0 inet tunnel mode ipip
[13:49:30] [..]
[13:49:51] the networking service is failing, was there any test in-progress?
[13:50:02] If not I'll clean up the file and restart the service
[13:51:52] {{done}}
[14:03:46] Mail, collaboration-services, Infrastructure-Foundations, SRE: Update SPF records as needed - https://phabricator.wikimedia.org/T366113#9892614 (jhathaway) Open→Resolved spf records updated
[14:04:55] elukey: maybe related to Valentin's previous testing?
[14:09:12] probably yes! All cleaned out!
[14:21:54] Mail, Infrastructure-Foundations, SRE: Postfix inbound rollout sequence, mx-in - https://phabricator.wikimedia.org/T367517 (jhathaway) NEW
[14:25:33] Mail, Infrastructure-Foundations, SRE: Postfix inbound rollout sequence, mx-in - https://phabricator.wikimedia.org/T367517#9892687 (jhathaway)
[15:11:30] Mail, Infrastructure-Foundations, SRE: Postfix inbound rollout sequence, mx-in - https://phabricator.wikimedia.org/T367517#9893386 (jhathaway)
[15:29:25] Mail, Infrastructure-Foundations, SRE: Postfix inbound rollout sequence, mx-in - https://phabricator.wikimedia.org/T367517#9893459 (jhathaway)
[15:55:42] Mail, fundraising-tech-ops, Infrastructure-Foundations, SRE: Update fundraising mail / firewall settings to use new production mx-in hosts - https://phabricator.wikimedia.org/T367573 (Dwisehaupt) NEW
[16:58:05] netops, Infrastructure-Foundations, SRE: Move asw-c-codfw and asw-d-codfw CR uplinks to Spine switches - https://phabricator.wikimedia.org/T366941#9893856 (cmooney)
[19:00:04] Mail, Infrastructure-Foundations, SRE: Postfix inbound rollout sequence, mx-in - https://phabricator.wikimedia.org/T367517#9894300 (jhathaway)
[21:58:46] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed