[09:59:58] catching up on older chat, I hadn't seen that performance operator. It's already deprecated btw, per https://github.com/openshift-kni/performance-addon-operators, so those docs are probably out of date. Supposedly the functionality has been moved under the node tuning operator: https://github.com/openshift/cluster-node-tuning-operator. It's interesting work, we might have use cases for some of these things in the future. But as far as I know, not now.
[10:02:22] mamu: we didn't take that into account. It would change the outcome btw, I think. If anything it would weigh against cri-o. Kubernetes' release schedule policy is grueling, some would argue even punishing. Having 2 components that follow it would make things more difficult in the current situation.
[10:02:33] it would NOT*
[15:25:43] akosiaris: I wasn't aware it was deprecated, things are moving so fast in the kubernetes space! The cluster-node-tuning-operator does look interesting. Given that openshift is focused on kubernetes clusters hosted on premises, I see some value in keeping abreast of their work in that space. Perhaps we can leverage pieces of it in the future. Thanks for taking the time to look into it!
[15:36:58] akosiaris: ok, good to know, thanks for explaining.
[16:25:03] If I wanted to hit wikikube APIs (read-only) from outside the cluster, where would I start? Just wondering if we have anything that does that already
[16:31:40] inflatador: we have - but what do you need that for? :)
[16:34:37] jayme: flink has a tendency to forget about its checkpoints, which take up a ton of space in object storage.
If we can figure out Flink's current checkpoint, we can use that info to delete the checkpoints we don't need anymore
[16:34:46] more context here: https://phabricator.wikimedia.org/T348685#9506640
[16:35:34] oh boy :)
[16:36:27] `kubectl get flinkdeployments.flink.apache.org -l release=commons -o json | jq '.items[0].status.jobStatus.jobId' -r` more or less gets us what we need from the k8s side
[16:36:58] tbh this feels like it should maybe be a cronjob in the flink chart
[16:37:48] ah, you wrote that in the doc already... I see
[16:39:44] Yeah, I'd be fine with implementing it via cronjob, that could make things easier w/ secrets and API access
[16:39:57] before calling your kubectl command from above, you select a cluster and env with kube-env. All that does is populate $KUBECONFIG with the path to a kubeconfig file with proper access
[16:40:33] you can use that file in your script as well
[16:42:06] to be clear, you're suggesting that I run this as a k8s cronjob in the flink app namespace (like rdf-streaming-updater)?
[16:42:58] in the end, yes. AIUI it's something that needs to be done periodically for every flink app, no?
[16:44:40] IIUC that's also what you're proposing as the end result in the google doc
[16:45:03] Yes, agreed. Just making sure we're on the same page ;)
[16:46:13] I'd probably skip the cookbook step. Not sure how much use that is compared to the PoC script
[16:48:38] I mean... if you have the script, you can simply package it into a container and extend the flink-app chart with a cronjob that spawns it
[16:50:33] Nice. For the PoC I'd need to source the environment variables from the deploy servers... guessing that would be the same for running inside the container as well?
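(Editor's note: the kubectl+jq extraction above could also be done with the kubernetes Python client mentioned later in this conversation. A rough, untested sketch follows; the `v1beta1` API version and all function names are my assumptions, not something confirmed in the chat.)

```python
# Sketch: extract the active Flink job id from the FlinkDeployment
# custom resources, mirroring `jq '.items[0].status.jobStatus.jobId'`
# from the kubectl command above. Hypothetical helper names.
from __future__ import annotations


def current_job_id(flinkdeployments: dict) -> str | None:
    """Return the jobId of the first FlinkDeployment, or None if absent."""
    items = flinkdeployments.get("items", [])
    if not items:
        return None
    return items[0].get("status", {}).get("jobStatus", {}).get("jobId")


def fetch_job_id(namespace: str, release: str) -> str | None:
    """Query the cluster; needs `pip install kubernetes` and a kubeconfig."""
    from kubernetes import client, config

    # On a deploy server, kube-env has already populated $KUBECONFIG;
    # inside a pod you would call config.load_incluster_config() instead.
    config.load_kube_config()
    api = client.CustomObjectsApi()
    result = api.list_namespaced_custom_object(
        group="flink.apache.org",
        version="v1beta1",  # assumed flink-kubernetes-operator API version
        namespace=namespace,
        plural="flinkdeployments",
        label_selector=f"release={release}",
    )
    return current_job_id(result)
```

Usage from a deploy server would be something like `fetch_job_id("rdf-streaming-updater", "commons")` after running `kube-env rdf-streaming-updater staging`.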
[16:53:51] I think I'd want some read-only creds for the script... but we can work that out later
[16:54:59] the user you get from "kube-env rdf-streaming-updater staging" is pretty stripped down
[16:55:52] inside the container, the kubeconfig will be at a predictable path in the end. The kubernetes python library is probably able to look that up automagically
[16:57:13] Ah nice. Maybe we don't need to worry about perms then
[17:13:00] I'm not sure the default service account will be able to list the flinkdeployments (probably not), but that's easy to fix later by deploying a dedicated service account with a proper rolebinding together with the cronjob
[17:15:14] # kube-env admin staging; kubectl auth can-i get flinkdeployments.flink.apache.org --as=system:serviceaccount:rdf-streaming-updater:default
[17:15:15] no
[17:16:09] probably lacks a " -n rdf-streaming-updater" to be accurate, but the answer is still no ;)
[17:48:51] ACK, looks promising though ;)
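(Editor's note: since the default service account can't list flinkdeployments, the dedicated service account plus rolebinding mentioned above would look roughly like the following. Sketched as Python dicts so they can be dumped to JSON or YAML; every name here is hypothetical, not what the flink-app chart actually defines.)

```python
# Sketch of the RBAC objects a dedicated service account would need to
# read flinkdeployments in the namespace. Names are made up for
# illustration; the real objects would live in the flink-app chart.
ROLE = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {
        "name": "flinkdeployment-reader",  # hypothetical
        "namespace": "rdf-streaming-updater",
    },
    "rules": [
        {
            "apiGroups": ["flink.apache.org"],
            "resources": ["flinkdeployments"],
            "verbs": ["get", "list"],  # read-only, as discussed above
        }
    ],
}

ROLE_BINDING = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {
        "name": "flinkdeployment-reader",  # hypothetical
        "namespace": "rdf-streaming-updater",
    },
    "subjects": [
        {
            "kind": "ServiceAccount",
            "name": "checkpoint-cleaner",  # hypothetical SA for the cronjob
            "namespace": "rdf-streaming-updater",
        }
    ],
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "Role",
        "name": "flinkdeployment-reader",
    },
}
```

With these applied, the `kubectl auth can-i get flinkdeployments.flink.apache.org --as=system:serviceaccount:rdf-streaming-updater:checkpoint-cleaner -n rdf-streaming-updater` check should flip to "yes".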