[01:32:50] FIRING: PuppetFailure: Puppet has failed on cumin1003:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure [03:17:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [03:17:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [03:38:25] FIRING: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag [03:48:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag [05:32:50] FIRING: PuppetFailure: Puppet has failed on cumin1003:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure [06:24:24] FIRING: SystemdUnitFailed: wmf_auto_restart_krb5-admin-server.service on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [06:59:24] RESOLVED: SystemdUnitFailed: wmf_auto_restart_krb5-admin-server.service on krb1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [07:17:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [07:17:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [09:32:50] FIRING: PuppetFailure: Puppet has failed on cumin1003:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure [11:17:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [11:17:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [13:32:50] FIRING: PuppetFailure: Puppet has failed on cumin1003:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure [15:17:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [15:17:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [16:01:25] RESOLVED: MirrorHighLag: Mirrors - /srv/mirrors/ubuntu synchronization lag - https://wikitech.wikimedia.org/wiki/Mirrors - https://grafana.wikimedia.org/d/dbd8a904-eab2-48d1-a3b9-fa1851ef3ed2/mirrors?orgId=1 - https://alerts.wikimedia.org/?q=alertname%3DMirrorHighLag [16:19:51] 07Puppet, 10Beta-Cluster-Infrastructure, 10CirrusSearch, 06Data-Platform-SRE, 06Discovery-Search: Puppet failing on deployment-cirrussearch{12,13,14}.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T393924#10827290 (10bd808) [16:20:29] 07Puppet, 10Beta-Cluster-Infrastructure, 10CirrusSearch, 06Data-Platform-SRE, 06Discovery-Search: Puppet failing on deployment-cirrussearch{12,13,14}.deployment-prep.eqiad1.wikimedia.cloud - https://phabricator.wikimedia.org/T393924#10827295 (10bd808) >>! In T394430#10827113, @dancy wrote: > The failing... [17:32:50] FIRING: PuppetFailure: Puppet has failed on cumin1003:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure [17:50:15] FIRING: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [17:55:15] RESOLVED: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [18:58:15] FIRING: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [19:03:15] RESOLVED: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [19:17:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [19:17:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts [19:29:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [19:34:16] RESOLVED: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [19:46:15] FIRING: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [19:51:15] RESOLVED: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [19:59:16] FIRING: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:04:16] RESOLVED: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:05:15] FIRING: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:10:16] RESOLVED: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:18:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:23:15] RESOLVED: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:25:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:30:15] RESOLVED: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:32:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:37:15] RESOLVED: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:39:15] FIRING: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:44:15] RESOLVED: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:49:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:54:15] RESOLVED: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [20:55:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [21:00:15] RESOLVED: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [21:03:16] FIRING: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [21:08:16] RESOLVED: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [21:09:16] FIRING: ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip6) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [21:14:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [21:19:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [21:32:50] FIRING: PuppetFailure: Puppet has failed on cumin1003:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure [21:44:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [21:49:16] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [21:59:15] RESOLVED: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [22:02:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [22:07:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [22:12:15] FIRING: [2x] ProbeDown: Service idp1004:443 has failed probes (http_idp_wikimedia_org_ip4) - https://wikitech.wikimedia.org/wiki/CAS-SSO#Alerting - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://alerts.wikimedia.org/?q=alertname%3DProbeDown [23:17:35] FIRING: NetboxAccounting: Netbox - Accounting job failed - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxAccounting [23:17:35] FIRING: NetboxPhysicalHosts: Netbox - Report parity errors between PuppetDB and Netbox for physical devices. - https://wikitech.wikimedia.org/wiki/Netbox#Report_Alert - https://netbox.wikimedia.org/core/jobs/ - https://alerts.wikimedia.org/?q=alertname%3DNetboxPhysicalHosts