[08:39:48] kfserving-system kfserving-controller-manager-0 1/1 Running 0 33s
[08:39:53] \o/ \o/
[08:40:27] in theory, we are now ready to test one inference service :O
[08:42:44] Noice!
[08:43:49] I am pretty sure it will not work the first time :D
[08:44:05] we can try one using kubectl apply -f
[08:47:17] Lift-Wing, Machine-Learning-Team (Active Tasks), Patch-For-Review: Install KFServing standalone - https://phabricator.wikimedia.org/T272919 (elukey) Finally! ` elukey@ml-serve-ctrl1001:~$ sudo kubectl get pods -A NAMESPACE NAME READY STATUS RESTART...
[09:14:05] ah one thing missing is the secret for swift
[09:17:36] following https://github.com/kubeflow/kfserving/tree/master/docs/samples/storage/s3
[09:20:52] that seems in need of another dockerfile
[09:21:00] https://github.com/kubeflow/kfserving/blob/master/python/storage-initializer.Dockerfile
[09:36:56] Machine-Learning-Team: ML Serve controller vms show a slowly increasing resource usage leak over time - https://phabricator.wikimedia.org/T287238 (elukey) p:High→Medium
[09:44:01] seems easy enough, will work on it today
[10:22:06] of course the build takes ages for some reason, maybe something is stuck, will investigate after lunch
[10:28:49] * elukey lunch
[14:13:52] interesting, I am trying to build kfserving (the pip package) as indicated in the docker image and I get
[14:13:55] 2021-08-06 15:56:43,505 [docker-pkg-build] INFO - Collecting ray[serve]==1.3.0 (from kfserving==0.6.0) (image.py:210)
[14:13:58] 2021-08-06 15:56:43,777 [docker-pkg-build] INFO - Could not find a version that satisfies the requirement ray[serve]==1.3.0 (from kfserving==0.6.0) (from versions: 0.6.0, 0.6.1, 0.6.2, 0.6.3, 0.6.4, 0.6.5, 0.6.6, 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.7.4, 0.7.5, 0.7.6, 0.7.7, 0.8.0, 0.8.1, 0.8.2, 0.8.3, 0.8.4, 0.8.5, 0.8.6, 0.8.7, 1.0.0rc0, 1.0.0rc1, 1.0.0rc2, 1.0.0, 1.0.1, 1.0.1.post1)
[14:14:46] mmmm but there is a ray package 1.3.0
[14:15:20] What about the [serve]?
[14:17:07] It's in the source
[14:17:40] yep I know, I kinda assumed it was following the same versioning, it is just a specification to add
[14:17:54] reproduced as well in a brand new venv, no bueno
[14:19:21] It's definitely defined
[14:35:54] I see, it seems that 1.3.0 is not there, I missed
[14:35:55] ERROR: Could not find a version that satisfies the requirement ray[serve]==1.3.0 (from kfserving) (from versions: 1.4.1, 1.5.0, 1.5.1)
[14:40:37] Hmm wonder why
[14:42:01] it is probably something like https://github.com/ray-project/tune-sklearn/issues/169
[14:42:04] elukey: are you running Python 3.9?
[14:42:36] on my laptop yes, but not on the Docker image that I am testing, that uses Buster
[14:43:33] On py3.9 there's nothing until 1.4.1
[14:43:39] But 37 definitely exist
[14:44:11] https://files.pythonhosted.org/packages/0b/d0/33b6f8789cec27ac07e33de987eef9430b211c7d8284437371e45d37bbd5/ray-1.3.0-cp37-cp37m-manylinux2014_x86_64.whl
[14:44:52] yes but see https://github.com/ray-project/ray/issues/5444
[14:45:19] but in theory there is 3.7.3 on buster
[14:48:15] That says it's conda specific
[14:48:48] they also mention it to be pip specific, weird
[14:49:30] aaahha I re-tried to build the image and now it works
[14:49:38] * elukey cries in a corner
[14:49:41] Maybe docker just brain farted
[14:49:56] 2021-08-06 16:47:54,183 [docker-pkg-build] INFO - Collecting ray[serve]==1.3.0 (image.py:210)
[14:49:59] 2021-08-06 16:47:54,404 [docker-pkg-build] INFO - Downloading ray-1.3.0-cp37-cp37m-manylinux2014_x86_64.whl (49.7 MB) (image.py:210)
[14:50:02] \o/
[14:50:17] Yey
[15:01:41] it seems to be intermittent, I am trying to re-build the image again
[15:06:29] but it is probably due to my local docker-pkg env
[15:12:24] ok it seems that pip needs to be upgraded
[15:59:38] https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/710584
[15:59:42] this is for the storage initializer
[15:59:48] o/
[16:00:31] elukey: nice, i didn't even think of the storage initializer, good catch!
[16:06:39] accraze: o/ I thought about it after you mentioned the s3 credentials :D
[16:11:30] accraze: if you have time we can chat about some ideas for the model namespaces, I had a chat with Janis the other day and he gave me some good feedback
[16:11:36] (anytime, even next week)
[16:13:30] yeah for sure, i was wondering about that
[16:13:38] wanna do it like next tues?
[16:14:38] yes anytime (even now if you want)
[16:14:50] oh yeah im free now actually
[17:01:34] * elukey afk! o/
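
Editor's note on the `ray[serve]==1.3.0` failure discussed between 14:13 and 15:12: the log only concludes that "pip needs to be upgraded". One plausible reading, and it is an assumption rather than something confirmed in the channel, is that the pip in the image was too old to recognize the `manylinux2014` platform tag of the `ray-1.3.0-cp37-cp37m-manylinux2014_x86_64.whl` wheel, so it only listed older releases. The sketch below illustrates that idea with the standard library only. The function names are hypothetical, and the minimum pip versions in the table are recalled from the manylinux documentation, so they should be double-checked before relying on them.

```python
# Hypothetical helpers illustrating why an old pip may skip a wheel.
# Assumption: these are the first pip releases that accept each legacy
# manylinux tag (recalled from the manylinux docs, not from the chat log).
MIN_PIP_FOR_PLATFORM = {
    "manylinux1": (8, 1),
    "manylinux2010": (19, 0),
    "manylinux2014": (19, 3),
}


def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename: {filename}")
    name, version, python_tag, abi_tag, platform_tag = parts
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def parse_version(text):
    """Turn '18.1' or '21.2.3' into a tuple of ints, padded to two fields."""
    numbers = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'post1' or 'dev0'
        numbers.append(int(piece))
    while len(numbers) < 2:
        numbers.append(0)
    return tuple(numbers)


def pip_recognizes(wheel_filename, pip_version):
    """Return False when pip_version predates the wheel's manylinux tag.

    Platform tags outside the table above are not modelled and are
    reported as recognized.
    """
    platform_tag = parse_wheel_filename(wheel_filename)["platform"]
    family = platform_tag.split("_", 1)[0]
    minimum = MIN_PIP_FOR_PLATFORM.get(family)
    if minimum is None:
        return True
    return parse_version(pip_version) >= minimum


if __name__ == "__main__":
    wheel = "ray-1.3.0-cp37-cp37m-manylinux2014_x86_64.whl"
    for pip_version in ("18.1", "21.2.3"):
        print(pip_version, pip_recognizes(wheel, pip_version))
```

The pip versions in the usage loop are illustrative picks, not values taken from the log. This only models the platform tag, so it would not explain the separate Python 3.9 case from 14:35, where no cp39 wheel for 1.3.0 seems to have been published at all.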