[13:13:36] <jbond>	 godog: where is the acl for the alertmanager api i.e. http://alertmanager-eqiad.wikimedia.org/
[13:13:46] <jbond>	 I'd like to add the puppetdbs to the list
[13:16:51] <godog>	 jbond: yes, check out hieradata/common/profile/alertmanager/api.yaml
[13:16:57] <jbond>	 godog: cheers
[13:17:03] <godog>	 I'm curious as to the why jbond ?
[13:19:01] <jbond>	 godog: re T345909. I'm looking to move that script to the puppetdbs as most of the data is from there. however, currently the script uses spicerack to look for downtimed hosts.
[13:19:16] <jbond>	 I was thinking of changing that logic so that, instead of looking for downtimed hosts in icinga,
[13:19:23] <jbond>	 it looks for silenced hosts in alertmanager
[13:23:56] <godog>	 jbond: got it, actually now that I'm thinking about it, what is the check looking for? could we replace it with an alertmanager alert instead based on prometheus expression with self-reported puppet agent data from the hosts themselves ?
[13:24:18] <godog>	 simpler, and already covers the downtimed host case automatically
[13:26:21] <jbond>	 godog: hmm possibly. what the script currently does is look for hosts where every puppet run in the last 24 hours caused a change
[13:26:32] <jbond>	 and then filters out any hosts that are downtimed
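
The check logic jbond describes above can be sketched roughly as follows. This is a minimal illustration, not the actual script: the function name, the shape of the run data, and the host names are all hypothetical, and the downtime lookup (done via spicerack in the real script) is replaced here by a plain set.

```python
from datetime import datetime, timedelta


def hosts_always_changing(runs, downtimed, now, window=timedelta(hours=24)):
    """Return hosts where every puppet run inside the window reported a change.

    runs: dict mapping host -> list of (timestamp, changed) tuples (hypothetical shape)
    downtimed: set of hosts to filter out (stand-in for the downtime lookup)
    """
    cutoff = now - window
    flagged = []
    for host, reports in runs.items():
        # Keep only the runs that fall inside the look-back window.
        recent = [changed for ts, changed in reports if ts >= cutoff]
        # Require at least one run, and a change on every one of them.
        if recent and all(recent) and host not in downtimed:
            flagged.append(host)
    return sorted(flagged)


# Made-up sample data for illustration only.
now = datetime(2023, 9, 12, 13, 0)
runs = {
    "host-a": [(now - timedelta(hours=1), True), (now - timedelta(hours=5), True)],
    "host-b": [(now - timedelta(hours=1), True), (now - timedelta(hours=5), False)],
    "host-c": [(now - timedelta(hours=2), True)],
    "host-d": [(now - timedelta(hours=30), True)],
}
downtimed = {"host-c"}

flagged = hosts_always_changing(runs, downtimed, now)
print(flagged)
```

Only host-a is flagged here: host-b had one run without changes, host-c is downtimed, and host-d has no runs inside the 24 hour window.
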
[13:26:57] <godog>	 let's see
[13:27:06] <jbond>	 so yes i guess just a check where the puppet status was changed for the last 24 hours
[13:27:59] <godog>	 yeah maybe puppet_agent_resources_changed
[13:29:34] <jbond>	 yes that looks good 
[13:32:03] <godog>	 sadly I can't compare the results with what the old check would emit, but yeah
[13:33:44] <godog>	 anyways something to think about jbond, could simplify things a bit if the alert expression is reliable
[13:34:47] <godog>	 in other words I don't know what "failing" hosts currently are
[13:35:10] <jbond>	 godog: yes thanks, I'll send something shortly
[13:35:35] <jbond>	 godog: yes, I'll try and get that out of the script in a bit
[13:35:42] <godog>	 cheers
[13:36:08] <jbond>	 godog: how do I update this so I can see it calculated over the last 24h?
[13:36:11] <jbond>	 https://prometheus-eqiad.wikimedia.org/ops/classic/graph?g0.range_input=1h&g0.expr=puppet_agent_resources_changed%20%3E%201&g0.tab=1
[13:37:35] <godog>	 jbond: depends on what calculation you want to do, for example avg_over_time(puppet_agent_resources_changed[1d])
[13:37:38] <godog>	 or sum_over_time
[13:38:14] <jbond>	 hmm, I wanted to ensure we have no values that were 0 in the last 24 hours
[13:38:27] <jbond>	 so neither sum nor avg would really work
[13:38:54] <godog>	 min_over_time(puppet_agent_resources_changed[1d]) > 0
[13:38:57] <godog>	 sth like this?
[13:39:10] <jbond>	 yes that looks good thanks, I'll have a play with that
[13:39:29] <godog>	 sure np
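
To illustrate why jbond rules out sum and avg and why the minimum fits, here is a small sketch that mimics the three aggregations on plain Python lists. The sample values and host names are invented, and this only models the arithmetic; it does not query Prometheus or reproduce how PromQL filters series.

```python
from statistics import mean

# Hypothetical puppet_agent_resources_changed samples over 24h, one list per host.
samples_24h = {
    "always-changing.example": [3, 1, 2, 1],
    "one-clean-run.example": [5, 0, 4, 6],
}


def min_over_time_positive(values):
    """Rough analogue of `min_over_time(metric[1d]) > 0` for a single series."""
    # A series with no samples yields no result, modeled here as False.
    return bool(values) and min(values) > 0


for host, values in samples_24h.items():
    # sum and avg stay positive even when one run changed nothing.
    print(host, sum(values) > 0, mean(values) > 0, min_over_time_positive(values))

# Only hosts whose every sample was above zero are selected.
alerting = [h for h, v in samples_24h.items() if min_over_time_positive(v)]
print(alerting)
```

Both hosts pass the sum and avg comparisons, but only the host that changed resources on every sample passes the minimum comparison, which matches the "no zero values in the last 24 hours" requirement.
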
[14:27:41] <godog>	 jbond: very cool re: dropping check_puppet_run_changes
[14:27:58] <godog>	 far easier this way
[14:28:08] <jbond>	 yes this is much better, thanks for the pointer
[14:29:06] <godog>	 np