[09:29:23] hello folks
[09:29:40] I am trying to follow up on the kfserving chart errors, and I discovered multiple interesting things
[09:29:51] for one, I followed up in https://github.com/kubeflow/kfserving/issues/1591
[09:31:09] basically helm lint throws an error if we try to use CustomResourceDefinition v1beta1, so I used v1 (that IIRC worked on 0.5.1) but when deploying it doesn't work due to a change in the API
[09:31:45] the other things are more kfserving-related, since I didn't notice a new web-app thing in 0.6
[09:31:50] that requires another image etc..
[09:31:57] https://www.kubeflow.org/docs/components/kfserving/webapp/
[09:35:14] I'd be inclined to avoid it for the moment
[10:04:54] ok I have created a new version of the chart in https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/710226/
[10:05:10] but in order to get +2 from CI and deploy it, we'd need a new helm3 version
[10:05:21] containing https://github.com/helm/helm/pull/8608
[10:06:11] * elukey cries in a corner
[10:12:59] * elukey lunch
[14:49:13] lol, morning sorry elukey
[14:50:28] o/
[14:50:35] I just uploaded the new helm3
[15:54:08] ok now I am ready for https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/710226
[15:54:20] the idea is to remove the web-app, ok for everybody?
[15:54:30] we can add it later on if needed
[15:54:34] o/
[15:54:44] i didnt even know there was a webapp included in kfserving lol
[15:54:58] might be useful eventually, but we don't need it yet
[15:56:20] oh i think it used to be a part of the full kubeflow install but now it's been moved to kfserving standalone ..?
[15:57:00] o/
[15:57:13] no idea, IIUC it is not included in the full kubeflow install
[15:57:26] but it shows some challenges, like security, since you can deploy inference services
[15:57:32] and modify running things etc..
[15:58:07] ah i see, yeah im ok with skipping that for now
[15:59:56] also not sure how to expose it from kubernetes
[16:00:05] since it ends up in the cluster local gateway
[16:03:40] STDERR: Error: release kubeflow-kfserving failed, and has been uninstalled due to atomic being set: namespaces "kfserving-system" already exists
[16:04:04] interesting, helm3 complains if we add the namespace in both helmfile and kubeflow-kfserving
[16:10:19] https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/710296 should fix
[16:11:04] accraze: I am sure it will not happen, BUT, if you have time prep one InferenceService test to run via kubectl to test KFServing :)
[16:11:19] we may use it next week, or tomorrow
[16:11:30] depending on how much horror helm will bring to us
[16:27:46] elukey: yeah that sounds great. I'm going to update the enwiki-goodfaith service to use our newly published image from the wmf docker registry and that should be good to go
[16:28:23] but yeah im sure there will be some helm/yaml horror :)
[16:29:15] yes like https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/710301
[16:29:23] neverending :D
[16:55:19] uff another error with secret handling
[16:55:20] sigh
[16:57:02] I can't decide which epic greek tragedy all this is, Odyssey or Iliad
[17:28:16] ok I am now able to get up to half of the deploy
[17:28:20] (via helmfile)
[17:29:25] but the kfserving-manager pod doesn't come up, since
[17:29:26] MountVolume.SetUp failed for volume "cert" : secret "kfserving-webhook-server-cert" not found
[17:29:43] and the secret is handled by helmfile (that is waiting for the pod to come up probably)
[17:30:59] I am wondering if we need a separate chart for the secret, sigh
[17:41:18] oh interesting, i was wondering how secrets will work with our helm charts....
we may need to do similar for thanos swift
[17:49:31] accraze: so in puppet private there is a way to add secrets to special puppet configs, that then get rendered on deploy1002 as helmfile private yaml configs
[17:49:47] that become part of .Values
[17:50:15] so for thanos it will be sufficient to add credentials in puppet, and reference them in the helmfile
[17:53:23] (we can check tomorrow together if you have time, it should be easy)
[17:53:38] going to log off for the day! Hope to get kfserving working tomorrow :)
[17:53:57] cool sounds good, see ya elukey!
[21:42:08] Lift-Wing, artificial-intelligence, articlequality-modeling, Machine-Learning-Team (Active Tasks), Patch-For-Review: Configure articlequality deployment pipeline - https://phabricator.wikimedia.org/T287786 (ACraze) I pushed up a CR to integrations/config repo, when that is +2'd, our articlequ...
[21:47:37] Lift-Wing, artificial-intelligence, draftquality-modeling, Machine-Learning-Team (Active Tasks): Create blubberfile for draftquality model server - https://phabricator.wikimedia.org/T287783 (ACraze) a:ACraze
[23:51:19] Lift-Wing, Machine-Learning-Team (Active Tasks), Patch-For-Review: Production images for ORES/revscoring models - https://phabricator.wikimedia.org/T279004 (ACraze) Pipeline is configured and image is now published in the WMF Docker registry: https://docker-registry.wikimedia.org/wikimedia/machinelea...
[23:52:19] Lift-Wing, Machine-Learning-Team (Active Tasks), Patch-For-Review: Production images for ORES/revscoring models - https://phabricator.wikimedia.org/T279004 (ACraze)
[23:52:34] Lift-Wing, artificial-intelligence, articlequality-modeling, Machine-Learning-Team (Active Tasks): Configure articlequality deployment pipeline - https://phabricator.wikimedia.org/T287786 (ACraze) Open→Resolved Pipeline is configured and image is now published in the WMF Docker registry:...
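
For the InferenceService smoke test requested at 16:11:04, here is a minimal sketch of what such a manifest could look like, generated as JSON so it can be piped to kubectl. It assumes the KFServing v1beta1 API (`serving.kubeflow.org/v1beta1`) with a custom predictor container. The namespace and image reference are hypothetical placeholders, not the real values from the log, and the helper name `build_inference_service` is made up for illustration.

```python
import json


def build_inference_service(name, namespace, image):
    """Return an InferenceService manifest as a plain dict.

    Assumes the KFServing v1beta1 schema with a custom predictor
    container; adjust if the deployed CRD differs.
    """
    return {
        "apiVersion": "serving.kubeflow.org/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictor": {
                "containers": [
                    # The container name here follows the common
                    # KFServing convention; treat it as an assumption.
                    {"name": "kfserving-container", "image": image}
                ]
            }
        },
    }


if __name__ == "__main__":
    manifest = build_inference_service(
        name="enwiki-goodfaith",
        namespace="kfserving-test",  # hypothetical namespace
        image="registry.example.invalid/placeholder:latest",  # hypothetical image
    )
    # kubectl accepts JSON as well as YAML on stdin.
    print(json.dumps(manifest, indent=2))
```

Saved under a hypothetical name such as `make_isvc.py`, the output could be applied with `python3 make_isvc.py | kubectl apply -f -` once the kfserving-manager pod is running.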