[07:09:22] <godog>	 greetings
[08:05:47] <dcaro>	 morning!
[08:15:49] <dhinus>	 morning
[08:44:33] <dcaro>	 dhinus: can I get a review of https://gitlab.wikimedia.org/repos/cloud/toolforge/components-api/-/merge_requests/158 ? it's blocking moving the default buildpack to the newer one (if we do, components will rebuild on every deployment)
[08:45:04] <dhinus>	 dcaro: looking!
[08:45:37] <dcaro>	 thanks!
[08:54:43] <dhinus>	 I +1d it if you need to merge it now, ideally I would split the unrelated changes/cleanups to a separate MR
[09:42:02] <dcaro>	 they are related though, I have to update the models to bring the builds-api new flag, so I can use it in the config
[09:42:21] <dcaro>	 the only one unrelated is the small fix in values.yaml to fix the local deployment
[09:47:21] <dhinus>	 ack, thanks for the replies, I thought JobsUpdateResponse was not required
[09:49:48] <dcaro>	 it would fail as now the JobsJobResponse does not have the job_changed property anymore
[09:50:15] <dhinus>	 yes I tried locally and got the failing tests...
[09:50:59] <dcaro>	 I could have done the model update first, then adding the flag to the config though
[09:55:06] <dhinus>	 I added a follow-up comment on the change to the default value... I'm still not understanding fully the flow :)
[09:55:59] <dhinus>	 yes maybe doing model update in a separate commit can make reviews easier, not a big issue anyway
[10:07:33] <dcaro>	 ooohhh, the local values fix actually breaks prod deployment, looking
[11:20:01] <taavi>	 i don't like how often toolforge prometheus has been restarting
[11:20:53] <taavi>	 dcaro: when you cleared the data earlier this week, did you do it on both servers or on one of them only?
[11:21:55] <dcaro>	 One only, the one that was crashing
[11:26:23] <taavi>	 which one was that? -8?
[12:13:41] <dcaro>	 Hmmm, not sure now, let me look at the logs (sorry for the delay, was having lunch)
[12:15:12] <dcaro>	 Hmm, sal does not have it, maybe I did both, not sure, if both were having issues I did both
[12:15:31] <taavi>	 I am looking at T421242, anyone has opinions on the 24G/32G of ram question?
[12:15:32] <stashbot>	 T421242: New flavor for the integration project with more vCPU and ephemeral disk space - https://phabricator.wikimedia.org/T421242
[12:18:33] <godog>	 not really strong opinions no
[12:20:18] <dcaro>	 LGTM, no concerns, our cloudvirts are quite big
[12:26:31] <taavi>	 alright, sent a MR for 32 on the assumption that would actually improve things for CI
[12:43:25] <dcaro>	 hmm, istio in lima-kilo is not starting up on a new instance, complaining it does not have enough memory
[12:43:26] <dcaro>	 │   Warning  FailedScheduling  2m29s (x5 over 3m5s)  default-scheduler  0/1 nodes are available: 1 Insufficient memory. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.                                                                                                                                          │
[12:43:29] <dcaro>	 looking
[12:43:53] <dcaro>	 it requests 5G
[12:47:36] <dhinus>	 I also noticed istio was not starting on my lima-kilo, but did not investigate the cause
[12:47:46] <dhinus>	 can we lower the request for lima-kilo only?
[12:48:25] <dcaro>	 hmm, there's a configmap with the limits embedded, I'll try to see if I can template the limits out of it or we'll have to copy the whole thing for local with the resource difference
[12:49:39] <taavi>	 yeah, we should do that but the templating might be a bit tricky
[13:43:40] <volans>	 for what are tools-cumin-1 and toolsbeta-cumin-1 used for nowadays?
[13:45:27] <taavi>	 volans: queries that need puppetdb access
[13:45:49] <volans>	 local puppetdb I guess
[13:46:04] <volans>	 from the tools's master
[13:46:30] <taavi>	 yes
[13:47:16] <volans>	 I'm deciding how to release cumin 6.0.0 and was thinking to push it on apt.w.o and installing it on cloudcumin* so that we can test it a bit (including cookbooks) and if all goes well push to prod's cumin hosts
[13:47:25] <volans>	 so trying to understand how critical are the instances in cloudvps
[13:48:04] <volans>	 the others being: beta is not critical, integration I'll ping antoine, mariadbtest federico
[13:49:12] <taavi>	 not very
[13:49:55] <volans>	 if there is any concern with my plan
[13:51:12] <dhinus>	 according to my own .bash_history on tools-cumin-1, the last time I used it was 2 years ago :D
[13:51:30] <dcaro>	 yep, i have not used it in quite long time too (I use cloudcumin mostly)
[13:52:18] <volans>	 great, thanks for the feedback
[13:53:34] <taavi>	 are there any recent logs/jobs-api changes which might explain T421929?
[13:53:35] <stashbot>	 T421929: `toolforge jobs logs` misplaces my logs - https://phabricator.wikimedia.org/T421929
[14:01:30] <dcaro>	 there's one adding a since/until options, might explain that, not sure if it was deployed before or after
[14:01:38] <dhinus>	 re: istio and lima-kilo, I tried lowering the mem request, but the pod is now failing with "Failed to create temporary file"
[14:02:27] <dcaro>	 my istio is up and running without errors (that I see at least)
[14:45:43] <volans>	 ok I've release cumin 6.0.0 to pypi, apt.w.o and the cloudcumin hosts. On the latter ones the previous deb is in my home in case you need to rever in an emergency.
[18:35:03] * dcaro off