[08:14:24] nicee!
[08:14:32] thanks for the update :)
[08:42:37] I am reviewing the docker images reported by the new docker report (still WIP, but it inspects the running containers on staging-eqiad atm) and I noticed that the prometheus-statsd exporter is still on buster :(
[08:42:41] https://gerrit.wikimedia.org/r/c/operations/puppet/+/1165467
[08:42:45] if anybody has time --^
[08:43:43] hnowlan: o/ I have an API-gateway-related question - afaics we are still using the ratelimit image running buster, and Janis a while ago made a big change from 1.5.x to 9.x afaics (default when using rate limit in scaffolding)
[08:43:56] is it ok to upgrade api-gateway too?
[08:48:37] <_joe_> I would imagine hugh and claime would take care of it, given they're focusing on the api gateway this quarter
[08:48:45] <_joe_> elukey: reviewed your patch
[08:48:51] IIRC ratelimit changed quite a bit, so I would not rush it
[08:49:00] yeah that's a lot of versions to jump in one go
[08:49:36] 9.x is not an actual version, though
[08:49:53] even so, 4 years of change
[08:49:58] IIRC it's an artificial number I used to not clash with upstream
[08:50:13] we already have https://phabricator.wikimedia.org/T388804 open but if it's buster it's probably time to prioritise that work
[08:53:48] I'm not even sure about what/how much changed since upstream did not cut releases since 2020 hnowlan... :/
[08:58:50] hnowlan: yes please let's prioritize if possible, we should try to clean up all the buster usage during this Q if possible
[09:01:41] yeah. we have a reasonably okay testbed, I'll give it a spin and see how it behaves
[09:03:40] <3
[09:04:02] claime: o/ when you have time can we sync to deploy the statsd change as well? No rush
[09:15:30] hmm, based on a first basic test it looks like it's behaving more or less as expected. promising
[09:40:04] elukey: sure, I'll have a look in a bit
[09:41:50] <3
[09:43:33] last one and then I'll stop, I promise :D
[09:44:00] I filed https://gerrit.wikimedia.org/r/c/operations/puppet/+/1165477, my understanding is that the `prod-build` user is not allowed to pull /restricted/ images
[09:44:10] it would be nice to have them scanned by debmonitor
[09:44:13] lemme know :)
[09:57:39] <_joe_> elukey: I'm not enthusiastic about the idea of allowing us to pull mw images, which contain secrets, in CI or on the build hosts
[09:57:59] <_joe_> I would rather special-case checking the base image for the mw stuff
[09:58:26] <_joe_> elukey: which reminds me - one thing you lack in debmonitor is the build chain relationships
[10:04:37] _joe_ we could allow only the build nodes, not CI, what is the specific concern? They are generally SRE-related nodes
[10:05:03] we could also have a dedicated VM for docker-report, and its special user, but it may be a bit overkill
[10:05:17] I agree about the chain relationship, but I would really monitor all images if possible
[10:29:18] (alternative - we create a separate config.json with a docker-report user, leaving the default in the root home dir for the rest)
[10:36:54] <_joe_> elukey: my main worry is that images stick on those servers, with credentials saved in them
[10:37:28] <_joe_> it means we're having them in one more place other than where we can't do without
[10:37:50] <_joe_> one place where we build software in various ways, not all properly isolated.
[10:38:12] <_joe_> in any case, yes, CI is out of the question; but the build hosts might be ok.
[12:11:05] I've updated https://wikitech.wikimedia.org/wiki/Kubernetes/Clusters/New to include the option of running stacked control-planes (etcd colocated with the apiserver components)
[12:26:17] _joe_ right makes sense.. what if we do build hosts only and docker report deletes anything with "restricted" on it after it runs the report?
[12:35:50] <_joe_> that reduces the risk for sure
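
(note: the cleanup idea from 12:26 could look roughly like the sketch below. This is not docker-report code. The function names, the "/restricted/" substring match and the example registry names are all assumptions made for illustration; the only real command used is `docker image rm`, and the sketch defaults to a dry run that only returns the commands it would run.)

```python
import subprocess
from typing import Iterable, List

# Assumption: restricted images can be recognised by this path segment,
# as suggested by the "/restricted/" wording in the chat.
RESTRICTED_MARKER = "/restricted/"


def select_restricted(image_refs: Iterable[str]) -> List[str]:
    """Return the image references that contain the restricted marker."""
    return [ref for ref in image_refs if RESTRICTED_MARKER in ref]


def removal_commands(image_refs: Iterable[str]) -> List[List[str]]:
    """Build one `docker image rm` command per restricted image."""
    return [["docker", "image", "rm", ref] for ref in select_restricted(image_refs)]


def cleanup(image_refs: Iterable[str], dry_run: bool = True) -> List[List[str]]:
    """Remove restricted images once the report has run.

    With dry_run=True (the default) nothing is executed; the commands
    are only returned so they can be logged or reviewed.
    """
    commands = removal_commands(image_refs)
    for cmd in commands:
        if not dry_run:
            # check=True raises if docker fails to remove the image
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical image names, for illustration only.
    pulled = [
        "registry.example.org/restricted/some-app:2025-01-01",
        "registry.example.org/some-exporter:1.0",
    ]
    for cmd in cleanup(pulled):
        print(" ".join(cmd))
```

(the dry run is the default on purpose, so a too-broad match shows up in the output before anything is deleted.)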