[01:51:49] FIRING: PuppetDisabled: Puppet disabled on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=misc&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[05:51:49] FIRING: PuppetDisabled: Puppet disabled on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=misc&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[09:51:49] FIRING: PuppetDisabled: Puppet disabled on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=misc&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[11:11:00] netops, Infrastructure-Foundations, SRE: ganeti1025 VMs unresponsive Nov 1 2024 - https://phabricator.wikimedia.org/T378809#10288046 (MoritzMuehlenhoff) >>! In T378809#10284244, @cmooney wrote: >>>! In T378809#10284231, @CDanis wrote: >> I'm pretty confident this is the same as T348730, and I thi...
[11:22:25] FIRING: SystemdUnitFailed: prometheus-ganeti-exporter.service on ganeti1040:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:27:25] FIRING: [2x] SystemdUnitFailed: prometheus-ganeti-exporter.service on ganeti1039:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:37:25] FIRING: [2x] SystemdUnitFailed: prometheus-ganeti-exporter.service on ganeti1039:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[11:42:25] RESOLVED: [2x] SystemdUnitFailed: prometheus-ganeti-exporter.service on ganeti1039:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[13:36:49] RESOLVED: PuppetDisabled: Puppet disabled on apt-staging2001:9100 - https://wikitech.wikimedia.org/wiki/Puppet/Runbooks#Puppet_Disabled - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet?var-cluster=misc&viewPanel=14 - https://alerts.wikimedia.org/?q=alertname%3DPuppetDisabled
[15:40:23] netops, Infrastructure-Foundations: Testing liberica with ncredir@eqiad - https://phabricator.wikimedia.org/T378453#10289303 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1002 for host lvs1013.eqiad.wmnet with OS bookworm
[16:08:21] netops, Infrastructure-Foundations: Testing liberica with ncredir@eqiad - https://phabricator.wikimedia.org/T378453#10289482 (CDanis) p:Triage→Medium
[16:59:59] netops, Infrastructure-Foundations: Testing liberica with ncredir@eqiad - https://phabricator.wikimedia.org/T378453#10289760 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1002 for host lvs1013.eqiad.wmnet with OS bookworm executed with errors: - lvs1013 (**FAIL**)...
[17:16:40] slyngs: are you working on idp right now?
[17:16:56] No, what's up?
[17:17:07] idp.wm.o is serving an Envoy error upstream connect error or disconnect/reset before headers. reset reason: connection failure
[17:17:19] I'll take a look
[17:18:59] ah maybe moritzm was working on it?
[17:19:16] Ah, Envoys not working
[17:20:30] Hmm, Puppet is disabled on the main node
[17:20:38] slyngs: see -sre as well
[17:20:46] yeah, by moritz
[17:44:00] netops, Infrastructure-Foundations: Testing liberica with ncredir@eqiad - https://phabricator.wikimedia.org/T378453#10290068 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by vgutierrez@cumin1002 for host lvs1013.eqiad.wmnet with OS bookworm
[18:25:43] netops, Infrastructure-Foundations: Testing liberica with ncredir@eqiad - https://phabricator.wikimedia.org/T378453#10290288 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by vgutierrez@cumin1002 for host lvs1013.eqiad.wmnet with OS bookworm executed with errors: - lvs1013 (**FAIL**)...
[23:05:17] Mail, Infrastructure-Foundations, Trust-and-Safety: Emails from wikimediats.zendesk.com fails DMARC policy - https://phabricator.wikimedia.org/T378285#10291105 (revi)