[00:02:11] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11338235 (Papaul)
[00:04:38] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11338243 (Papaul) @cmooney thanks for the feedback we can clarify this tomorrow during the meeting and have all ready and run it by @ayounsi when he is back.
[00:07:51] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11338254 (Papaul)
[00:10:21] netops, DC-Ops, Infrastructure-Foundations, ops-ulsfo, SRE: ULSFO: New switch configuration - https://phabricator.wikimedia.org/T408892#11338258 (Papaul)
[00:53:25] FIRING: [4x] GanetiCACertificateAboutToExpire: Ganeti CA certificate ganeti.example.com is about to expire - https://wikitech.wikimedia.org/wiki/Ganeti#Renew_cluster_certificates - TODO - https://alerts.wikimedia.org/?q=alertname%3DGanetiCACertificateAboutToExpire
[01:11:43] FIRING: [2x] NodeTextfileStale: Stale textfile for config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale
[01:11:44] FIRING: [2x] NodeTextfileStale: Stale textfile for puppetmaster1001:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale
[01:44:40] Mail, cloud-services-team, Cloud-VPS, Infrastructure-Foundations: lists.wikimedia.org subscription email rejected by DKIM - https://phabricator.wikimedia.org/T409137 (DamianZaremba) NEW
[01:58:42] Mail, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: lists.wikimedia.org subscription email rejected by DKIM - https://phabricator.wikimedia.org/T409137#11338499 (DamianZaremba)
[01:59:25] Mail, cloud-services-team, Cloud-VPS, Infrastructure-Foundations, SRE: lists.wikimedia.org subscription email rejected by DKIM - https://phabricator.wikimedia.org/T409137#11338500 (DamianZaremba) Tagging SRE as not sure which team is responsible.
[04:53:25] FIRING: [4x] GanetiCACertificateAboutToExpire: Ganeti CA certificate ganeti.example.com is about to expire - https://wikitech.wikimedia.org/wiki/Ganeti#Renew_cluster_certificates - TODO - https://alerts.wikimedia.org/?q=alertname%3DGanetiCACertificateAboutToExpire
[05:11:43] FIRING: [2x] NodeTextfileStale: Stale textfile for puppetmaster1001:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale
[05:11:44] FIRING: [2x] NodeTextfileStale: Stale textfile for config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale
[08:02:25] FIRING: SystemdUnitFailed: update-ubuntu-mirror.service on mirror1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:32:25] FIRING: [2x] SystemdUnitFailed: user@11984.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:33:24] ^ the build alert is caused by the terrible JDK test suite, I'm building Bullseye/Bookworm forward ports of the latest OpenJDK8 security release
[08:42:25] FIRING: [2x] SystemdUnitFailed: user@11984.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[08:53:25] FIRING: [4x] GanetiCACertificateAboutToExpire: Ganeti CA certificate ganeti.example.com is about to expire - https://wikitech.wikimedia.org/wiki/Ganeti#Renew_cluster_certificates - TODO - https://alerts.wikimedia.org/?q=alertname%3DGanetiCACertificateAboutToExpire
[08:57:25] FIRING: [2x] SystemdUnitFailed: user@11984.service on build2002:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[09:11:44] FIRING: [2x] NodeTextfileStale: Stale textfile for config-master1001:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale
[09:11:48] FIRING: [2x] NodeTextfileStale: Stale textfile for puppetmaster1001:9100 - https://wikitech.wikimedia.org/wiki/Prometheus#Stale_file_for_node-exporter_textfile - https://grafana.wikimedia.org/d/knkl4dCWz/node-exporter-textfile - https://alerts.wikimedia.org/?q=alertname%3DNodeTextfileStale