[00:49:25] FIRING: SystemdUnitFailed: curator_actions_cluster_wide.service on logging-sd1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [04:49:40] FIRING: SystemdUnitFailed: curator_actions_cluster_wide.service on logging-sd1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [08:49:40] FIRING: SystemdUnitFailed: curator_actions_cluster_wide.service on logging-sd1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [08:58:41] FIRING: PrometheusRuleEvaluationFailures: Prometheus rule evaluation failures (instance prometheus1005:9900) - https://wikitech.wikimedia.org/wiki/Prometheus - https://grafana.wikimedia.org/d/GWvEXWDZk/prometheus-server?var-datasource=eqiad%20prometheus%2Fops - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRuleEvaluationFailures [09:03:41] FIRING: [2x] PrometheusRuleEvaluationFailures: Prometheus rule evaluation failures (instance prometheus1005:9900) - https://wikitech.wikimedia.org/wiki/Prometheus - https://grafana.wikimedia.org/d/GWvEXWDZk/prometheus-server?var-datasource=eqiad%20prometheus%2Fops - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRuleEvaluationFailures [09:08:41] FIRING: [3x] PrometheusRuleEvaluationFailures: Prometheus rule evaluation failures (instance prometheus1005:9900) - https://wikitech.wikimedia.org/wiki/Prometheus - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRuleEvaluationFailures [09:13:41] FIRING: [4x] PrometheusRuleEvaluationFailures: Prometheus rule evaluation failures (instance prometheus1005:9900) - https://wikitech.wikimedia.org/wiki/Prometheus - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRuleEvaluationFailures [09:14:22] ^^ looking [09:18:41] RESOLVED: [4x] PrometheusRuleEvaluationFailures: Prometheus rule evaluation failures (instance prometheus1005:9900) - https://wikitech.wikimedia.org/wiki/Prometheus - https://alerts.wikimedia.org/?q=alertname%3DPrometheusRuleEvaluationFailures [09:28:29] ^^ There were two gNMIc series with the same name exported by netflow200[34] ... self resolved [12:49:40] FIRING: SystemdUnitFailed: curator_actions_cluster_wide.service on logging-sd1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [16:49:40] FIRING: SystemdUnitFailed: curator_actions_cluster_wide.service on logging-sd1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [17:04:25] RESOLVED: SystemdUnitFailed: curator_actions_cluster_wide.service on logging-sd1001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [17:52:25] FIRING: SystemdUnitFailed: wmf_auto_restart_grafana-loki.service on grafana2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed [21:52:40] FIRING: SystemdUnitFailed: wmf_auto_restart_grafana-loki.service on grafana2001:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed