[08:54:47] <dcaro>	 morning!
[09:47:16] <blancadesal>	 o/
[11:53:45] <dhinus>	 hmm I see this alert in #-cloud but I don't see it in alertmanager "Node tools-k8s-worker-nfs-58 has at least 12 procs in D state"
[11:54:15] <dhinus>	 and grafana also shows "1" as the latest value https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview?orgId=1&viewPanel=2&from=now-15m&to=now
[11:54:43] <dhinus>	 now it's "resolved" in #-cloud as well
[11:59:04] <dhinus>	 ok found the explanation: the alert triggers when avg_over_time[1h] > 12, for longer than 1 hour
[11:59:12] <dhinus>	 and there was a spike at 10:22 UTC
[11:59:31] <dhinus>	 now it's back to normal, so the alert only fired very briefly
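Note on the explanation above: a minimal Python sketch of why a rule of the form "avg_over_time[1h] > 12, for longer than 1 hour" can fire long after a spike and then resolve within minutes. This is only a simulation under stated assumptions, not the actual alert rule: one sample per minute, a baseline of 1 proc in D state, and a made-up 30-minute spike of 100. The names `trailing_avg` and `firing_minutes` are hypothetical helpers, and the pending/firing logic is a simplified stand-in for how a `for` duration behaves.

```python
from statistics import mean


def trailing_avg(samples, i, window):
    """Average of the last `window` samples ending at index i (inclusive)."""
    start = max(0, i - window + 1)
    return mean(samples[start:i + 1])


def firing_minutes(samples, threshold=12, window=60, hold=60):
    """Return the minute indices at which the simulated alert is firing.

    The condition is "trailing average > threshold"; the alert only fires
    once that condition has held continuously for `hold` minutes.
    """
    firing = []
    pending_since = None  # minute at which the condition last became true
    for i in range(len(samples)):
        if trailing_avg(samples, i, window) > threshold:
            if pending_since is None:
                pending_since = i
            if i - pending_since >= hold:
                firing.append(i)
        else:
            pending_since = None  # condition broke, reset the pending timer
    return firing


# Assumed data: 120 quiet minutes, a 30-minute spike, then quiet again.
samples = [1] * 120 + [100] * 30 + [1] * 150
fired = firing_minutes(samples)
print(fired[0], fired[-1], len(fired))  # 186 202 17
```

In this made-up series the spike starts at minute 120, but the alert only starts firing at minute 186 and clears after 17 minutes, once the spike slides out of the 1h averaging window. By then the instantaneous value has long been back at 1, which would match a dashboard showing "1" while the alert is briefly firing.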
[13:54:12] <dcaro>	 might be a problem between the metricsinfra alertmanager and the prod one, I saw alerts not passing through the last time we had NFS issues (not saying they are related, they might be, but I'm not sure how)
[14:09:09] <dhinus>	 dcaro: it could be actually, let's keep an eye on the following alerts in -cloud and double check if they appear on alerts.wm.o
[15:59:54] <dcaro>	 👍
[17:25:41] * dcaro off, cya on monday!