[00:10:28] belatedly: nothing to report from Americas oncall today
[12:35:15] taavi: it is set on hieradata/role/eqiad/wmcs/openstack/eqiad1/cloudweb.yaml
[12:35:49] that does not apply to cloudweb2002-dev, which is in codfw and uses the codfw1dev role and not the eqiad1 one
[12:36:03] I see
[12:36:28] i'll send a patch
[12:37:09] that is ok, I'll sort it
[12:42:44] https://gerrit.wikimedia.org/r/c/operations/puppet/+/1034483
[13:20:23] taavi: anything outstanding?
[13:26:55] effie: nothing I can see, thanks!
[13:27:10] cheers
[15:02:40] cwhite, arnoldokoth: Nothing to report from EU oncall
[15:11:29] Thanks!
[15:22:15] eoghan: thank you.
[16:33:02] I'm seeing changes for the manufacturer attribute for ps1-b3-magru when running sre.puppet.sync-netbox-hiera - safe to merge?
[16:33:38] for ps[1234]-b4-magru actually
[16:33:42] hnowlan: yep
[16:33:46] cool, thanks
[16:33:50] robh: ^
[17:50:18] oh, I had no idea that affected hiera
[17:50:23] apologies hnowlan
[17:50:33] was fine to merge yes i modified the netbox entries
[17:50:54] no worries
[17:52:37] had to run out for doggo walkin while it wasnt 80 degrees F
[17:53:14] cuz now its 80F/26.6C and the sun is killer.
[17:57:03] I'd kill for 80F. Summer in Houston has finally really begun this week!
[17:58:48] the worst day in our short-term forecast is 6 days out on memorial day: 98F high, and 79F for the overnight low :P
[18:09:27] jayme: claime: apologies for the potentially dumb question. About T359640, T365265 - I suppose we have ruled out installing statsd-exporter on a normal host, i.e. one like where we host legacy statsite relays today?
[18:09:27] T359640: mediawiki_resourceloader_build_seconds_bucket big metric on Prometheus ops - https://phabricator.wikimedia.org/T359640
[18:09:28] T365265: Create a per-release deployment of statsd-exporter for mw-on-k8s - https://phabricator.wikimedia.org/T365265
[18:10:16] I suppose one reason, besides general k8s complexity/benefits, is that maybe the exporter is too slow/inefficient to handle all of a given data center, unlike the C implementation for statsite which is presumably a lot more efficient.
[18:10:31] but I don't actually know that for a fact, so I thought I'd mention it just in case.
[18:12:00] If it does scale to taking in all (new) statsd messages from MW pods in a given DC, like the old statsite was able to handle, then that might offer a simple solution that should save several orders of magnitude in label explosion.
[20:28:21] Krinkle: Limiting ourselves to one UDP receiver instance is simpler, but still a SPOF. We cannot perform maintenance without losing data in the current statsite arrangement.
[20:29:01] In addition, the exporter instance(s) have to be partitioned by MW version (T359497). Changes to the metric signature lead to dropped metrics.
[20:29:02] T359497: StatsD Exporter: gracefully handle metric signature changes - https://phabricator.wikimedia.org/T359497
[20:31:59] cwhite: I believe maintenance like OS upgrades does happen today without data loss. I'm guessing we switch the canonical/service DNS name to a standby/replacement. Anyway, I don't mean to literally suggest a single node, as much as to explore something that isn't multiplied/distributed by a large N. Indeed, you'd want multiple nodes, which is fair.
[20:32:34] The signature break is a good point though, that's something where Prometheus is inherently different and perhaps justified the big change.
[20:32:48] justifies* such a big change.
[20:32:54] Thanks for pointing that out.
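For context on the signature break cwhite raises above (T359497), a minimal, hypothetical sketch of why a long-lived exporter ends up dropping samples when a metric's label set changes. The class, metric name, and labels below are invented for illustration; this is not statsd-exporter's actual code.

```python
# Hypothetical sketch of the "metric signature" problem (T359497).
# NOT statsd-exporter's real implementation; names are placeholders.
class Registry:
    def __init__(self):
        self.label_names = {}   # metric name -> label-name set it was first registered with
        self.samples = []

    def observe(self, name, labels, value):
        signature = tuple(sorted(labels))
        registered = self.label_names.setdefault(name, signature)
        if registered != signature:
            # A new release added/removed a label: the sample no longer
            # matches the registered metric, so it is dropped.
            return False
        self.samples.append((name, dict(labels), value))
        return True

reg = Registry()
print(reg.observe("mediawiki_resourceloader_build_seconds",
                  {"module": "startup"}, 0.4))                    # True
print(reg.observe("mediawiki_resourceloader_build_seconds",
                  {"module": "startup", "wiki": "enwiki"}, 0.5))  # False: dropped
```

Running one exporter instance per MW release (T365265) keeps a single instance from ever seeing two signatures for the same metric at once, which is the partitioning concern mentioned above.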
[20:33:59] cwhite: so this is specific to clients and not an issue with the prometheus server/storage layer, right? That layer handles new labels gracefully?
[20:34:20] i.e. queries over time that don't specify the new label get continuity?
[20:37:46] Correct. statsd-exporter creates a prometheus metric instance in memory and adds new samples to those metrics based on the signature.
[20:39:11] Prometheus server accepts the metrics exposition format and turns them into timeseries data.
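A rough sketch of the two layers described in that last exchange: the exporter serves plain-text exposition lines, and the Prometheus server turns each unique label combination into a timeseries. The metric name is taken from T359640, but the label names and values here are placeholders, not what MW actually emits.

```python
# Illustrative only: exposition-format lines as the exporter would serve them,
# before and after a hypothetical new "wiki" label appears on the series.
exposition_before = 'mediawiki_resourceloader_build_seconds_count{module="startup"} 120'
exposition_after = 'mediawiki_resourceloader_build_seconds_count{module="startup",wiki="enwiki"} 7'

# Example PromQL (a string here; it runs on the Prometheus server, not in Python).
# Summing over all series ignores labels the query does not mention, so a graph
# of this expression stays continuous across the label change.
promql = 'sum(rate(mediawiki_resourceloader_build_seconds_count[5m]))'

print(exposition_before, exposition_after, promql, sep="\n")
```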