[08:21:17] kevinbazira: o/ the wikidata pods are up afaics, it needed a bit of time
[08:22:03] elukey: yep, I checked and they are up and running
[08:22:04] wikidatawiki-goodfaith-predictor-default-7c8hr-deployment-87kdp 3/3 Running 0 14h
[08:22:14] I wonder how much time they needed ...
[08:22:43] so I think it is a side effect of changing the kserve-inference chart
[08:22:56] since all pods in the namespace need to be re-created
[08:23:18] the wikidata ones were terminated correctly, but IIUC k8s tries to spin up new pods in a controlled way
[08:23:36] and possibly knative also has some backpressure mechanism to avoid congestion
[08:24:00] or maybe there is a little bug in our knative version (that we can't really change for the moment)
[08:31:16] ok... this means whenever there's an update on the chart, we should wait before checking the pods.
[08:31:37] isn't this where the staging servers come in?
[08:35:33] in theory it is independent from staging, since the pods all need to be recycled even in production when we change the chart
[08:35:45] the main issue is that we have big k8s namespaces, with a lot of pods
[08:36:16] (even if we broke down revscoring-editquality into three smaller chunks, reverted-goodfaith-damaging)
[08:40:09] I see ... so in our case, what would be an optimal "size" for a k8s namespace?
[08:40:59] 10/15 I'd say
[08:41:21] 10 - 15 pods? or isvcs?
[08:41:31] in our case it is the same thing
[08:41:42] (I mean in the predictor-only revscoring pods)
[08:41:48] let's say 10/15 pods
[08:43:05] we'll probably try to do it in the future with non-revscoring isvcs
[08:43:15] it makes sense now ... that is a small number compared to the many editquality pods we had to create
[08:43:41] this means we also benefited a lot from removing the transformers
[08:44:02] yep yep
[10:26:46] * elukey lunch
[14:04:58] Morning all!
[14:11:58] o/
[14:16:05] * elukey bbiab
[15:36:30] aiko: o/ thanks a lot for the review of the ORES python 3.7 stuff, I tried to answer your questions in the task
[15:36:34] lemme know if you need more info
[16:00:42] so the good news is that I was able to use uwsgi (http server) on deployment-ores01 (stretch) and celery on deployment-ores02 (buster), pointing them to the same redis instance. This should mimic the status of an ORES cluster once we reimage the first nodes
[16:00:46] nothing horrible to register
[16:13:37] so I'd say that on Monday we can upgrade the first node (I'd say ores2001)
[16:19:21] Monday!
[16:26:54] elukey: I didn't know the git cherry pick thing!! That seems a useful tool. Thank you for that. learned something :)
[20:24:06] elukey: any tips for making minikube not burn my laptop down?
[20:24:16] :)
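
The pod checks discussed around 08:22 boil down to watching the namespace recover after the chart change forces a rollout. A minimal sketch of the kind of commands involved, assuming a kubectl context pointed at the serving cluster; the namespace name below is illustrative, not taken from the log:

    # watch pods come back up while knative/k8s recreates them
    kubectl get pods -n revscoring-editquality-goodfaith -w

    # on a KServe cluster the InferenceService CRD also reports readiness
    kubectl get inferenceservices -n revscoring-editquality-goodfaith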
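
The git cherry-pick mentioned at 16:26 applies a single existing commit onto the current branch. A small sketch; the branch name and commit hash are placeholders:

    git checkout my-branch        # hypothetical target branch
    git cherry-pick abc1234       # placeholder hash of the commit to copy over
    # if the pick conflicts: fix the files, git add them, then
    git cherry-pick --continue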
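
On the minikube question at 20:24, the usual answer is to cap the CPU and memory it is allowed to take; the values below are just example numbers, not a recommendation from the log:

    # persist smaller limits for future clusters
    minikube config set cpus 2
    minikube config set memory 4096

    # or pass them for a single start, and stop the cluster when idle
    minikube start --cpus=2 --memory=4096 --driver=docker
    minikube stop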