[07:52:44] hello!
[07:52:53] first version of the knative serving helm chart - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/699380
[15:43:11] I am getting some errors when deploying the inference service, I think that the helm chart for knative doesn't work as expected
[15:43:17] lovely
[15:53:47] elukey: what kind of errors are you getting with the inference service?
[16:13:02] accraze: o/ it is the queue-proxy container that is missing some authorizations for pod autoscaling, I am 99% sure that I missed something in knative
[16:13:16] or that the helm chart doesn't work as intended
[16:14:38] ahhh i see
[16:15:54] if this doesn't work I think that we should try the operator
[16:16:02] https://knative.dev/docs/install/knative-with-operators/
[16:16:18] NOTE: The Knative Operator is still in Alpha phase. It has not been tested in a production environment, and should be used for development or test purposes only.
[16:16:21] lol
[16:33:58] lol!
[16:42:15] accraze: it is a very nice feeling to run everything in alpha/beta state
[16:42:17] *stage
[16:43:50] yeah for reals ahah
[16:45:07] oh also nice catch on that review for the python-build-buster image
[16:45:25] i think what i need is just the python buster image instead
[16:45:47] ah perfect I was unsure about that one
[16:45:58] (actually the python3.7-slim would be best but that's not in the registry yet)
[16:48:40] also the outlink topic model seems to run slightly faster using the wmf bullseye image now :)
[16:50:10] wow :)
[16:50:15] python3.8 should help
[16:50:22] I am still getting
[16:50:24] knative.dev/pkg/controller/controller.go:618: Failed to list *v1alpha1.ServerlessService: serverlessservices.networking.internal.knative.dev is forbidden: User "system:serviceaccount:default:default" cannot list resource "serverlessservices" in API group "networking.internal.knative.dev" at the cluster scope
[16:50:35] this is repeated in the queue-proxy container
[16:52:17] that doesn't happen if I kubectl apply -f the service-core.yaml
[16:52:40] maybe missing RBAC rules
[16:53:33] ohhh yeah were you just using helm?
[16:54:15] exactly yes
[16:54:29] am I missing something super trivial?
[16:56:58] ahhh maybe with kubectl I am executing stuff as cluster admin
[16:57:02] and with helm I am not
[17:00:13] i was gonna say it sounds like an auth issue
[17:01:02] i still fully haven't figured out rbac for the full stack (istio/knative/kfserving)
[17:03:26] yes definitely
[17:03:53] so the issue is in the enwiki-goodfaith-predictor-default-7fsvk-deployment-6748ccp5cc8 pod, the queue-proxy container, that is deployed in the default namespace (has always worked so far)
[17:05:24] have you tried a specific namespace? kevin and I both had some issues using the default namespace iirc
[17:07:29] ah!
[17:07:32] it seems to work!
[17:07:49] no sorry too soon
[17:07:50] sigh
[17:08:49] nope same thing
[17:09:01] that makes sense, some auth is needed
[17:09:20] this time I see User "system:serviceaccount:inference:default" cannot list resource etc..
[17:39:15] I declare defeat, will check on monday :
[17:39:16] :)
[17:39:20] have a good weekend folks!
[17:39:44] see ya elukey, have a good weekend!
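
Note on the "forbidden" error quoted at 16:50:24: a minimal sketch, in Python, of pulling the denied subject, verb, resource and API group out of that message and turning them into a `kubectl auth can-i` check. The helper names (`parse_forbidden`, `can_i_command`) and the `LOG_LINE` constant are hypothetical and only for illustration. The sketch does not say why the permission is missing or whether granting it is the right fix; it only restates what the API server reported so the same check can be repeated after changing the chart.

```python
import re

# The error as quoted in the log at 16:50:24.
LOG_LINE = (
    'knative.dev/pkg/controller/controller.go:618: Failed to list '
    '*v1alpha1.ServerlessService: '
    'serverlessservices.networking.internal.knative.dev is forbidden: '
    'User "system:serviceaccount:default:default" cannot list resource '
    '"serverlessservices" in API group "networking.internal.knative.dev" '
    'at the cluster scope'
)

# Matches the 'User "..." cannot <verb> resource "..." in API group "..."' part.
FORBIDDEN_RE = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)"'
    r' in API group "(?P<group>[^"]*)"'
)


def parse_forbidden(message):
    """Return user, verb, resource and group from a forbidden message, or None."""
    match = FORBIDDEN_RE.search(message)
    return match.groupdict() if match else None


def can_i_command(denial):
    """Build a kubectl command that re-checks the denied permission."""
    resource = denial["resource"]
    if denial["group"]:
        # kubectl accepts RESOURCE.GROUP for resources outside the core group.
        resource = f'{resource}.{denial["group"]}'
    return (
        f'kubectl auth can-i {denial["verb"]} {resource} '
        f'--as={denial["user"]} --all-namespaces'
    )


if __name__ == "__main__":
    print(can_i_command(parse_forbidden(LOG_LINE)))
```

Running the printed command once with the chart installed via helm and once after `kubectl apply -f` would show whether the two install paths really end up with different permissions for that service account.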