[03:38:12] FIRING: PrometheusZombieSeriesDetected: Zombie series detected on k8s (eqiad) - https://wikitech.wikimedia.org/wiki/Prometheus#Runbooks - https://grafana.wikimedia.org/d/taff979/prometheus-tsdb-cardinality-monitoring?orgId=1&from=now-14d&to=now&timezone=utc&var-prometheus=k8s&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DPrometheusZombieSeriesDetected
[07:38:12] FIRING: PrometheusZombieSeriesDetected: Zombie series detected on k8s (eqiad) - https://wikitech.wikimedia.org/wiki/Prometheus#Runbooks - https://grafana.wikimedia.org/d/taff979/prometheus-tsdb-cardinality-monitoring?orgId=1&from=now-14d&to=now&timezone=utc&var-prometheus=k8s&var-site=eqiad - https://alerts.wikimedia.org/?q=alertname%3DPrometheusZombieSeriesDetected
[16:44:51] hi. the "SystemdUnitFailed" is default and applied to every server by default, right? Was there a way to selectively turn it on and off in Hiera / for specific hosts?
[16:49:41] ah, seems like "profile::monitoring::notifications_enabled: false" could be it but it's for all other alerts too
[17:22:05] mutante: it should be possible to turn off monitoring for specific systemd units (and via normal hiera mechanisms, only on specific hosts or w/e)
[17:24:53] cdanis: for specific systemd units would be interesting, nice! what I really want is some code like "if not active_server then skip monitoring this unit" but don't want to do it in hieradata/hosts/ because then I am just adding another thing we have to remember when changing the active server
[17:26:08] mutante: https://codesearch.wmcloud.org/search/?q=+monito.*_enab.*&files=systemd&excludeFiles=&repos=
[17:26:19] you could condition it on the active host hiera
[17:26:22] that's very common
[17:29:38] cdanis: oh, right, in systemd::service directly.. eh.. yea.. i have a bit of a special case here because I am getting that from another module that is used in 2 different setups.. but yea.. that should be possible to add :)
[17:29:54] yeah just plumb it through
[17:30:03] ACK, will try that. thanks for the inspiration
[18:24:06] it's complicated(tm) because of what I would call anti-patterns in code.. but ..fixable :)
[20:48:29] hmm. when setting $monitoring_enabled I am getting "Must provide $monitoring_notes_url if $monitoring_enabled" though. but that would mean it's off and I still get notifications by default.
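
Editorial sketch of the approach suggested at 17:26:19 (conditioning unit monitoring on the active-host Hiera value and plumbing it through). This is an illustration under stated assumptions, not the code that was actually written: the class name, the Hiera key, the template path and the notes URL are hypothetical placeholders, and it assumes `systemd::service` accepts `monitoring_enabled` and `monitoring_notes_url` as the messages above suggest.

```puppet
# Hypothetical profile; only systemd::service, monitoring_enabled and
# monitoring_notes_url are taken from the conversation above.
class profile::example_service (
    # Assumed Hiera key holding the FQDN of the currently active server.
    Stdlib::Fqdn $active_host = lookup('profile::example_service::active_host'),
) {
    # True only on the active server, so passive hosts skip unit monitoring
    # without needing a per-host override in hieradata/hosts/.
    $is_active = $facts['networking']['fqdn'] == $active_host

    systemd::service { 'example':
        ensure               => present,
        # Placeholder template path for the unit file.
        content              => template('profile/example_service/example.service.erb'),
        monitoring_enabled   => $is_active,
        # Placeholder URL; the error quoted at 20:48:29 indicates a notes URL
        # is required whenever monitoring is enabled.
        monitoring_notes_url => 'https://example.org/runbook',
    }
}
```

Because the flag is derived from the same Hiera value that already designates the active server, switching the active server would change the monitoring along with it. When the unit is declared inside another module, as described at 17:29:38, the boolean would need to be passed down as a parameter of that module.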