[06:18:02] <elukey>	 hello folks
[06:30:36] <elukey>	 accraze, kevinbazira o/ - I just noticed that machinelearning-liftwing-inference-services-editquality's image is around 2GBs, is there anything that we can do to trim it down? (I guess not but worth to ask)
[07:05:10] <kevinbazira>	 elukey o/, thank you for asking. We looked into trimming the image size at a point: https://phabricator.wikimedia.org/T272874#6891085
[07:05:28] <kevinbazira>	 One of the heaviest layers was installing all *spell language dictionaries. The consideration was that we can either use a different image for each model, and install only that model's language dictionaries. This would mean we would end up with lighter images, but if we have 100 models that is 100 images.
[07:05:36] <kevinbazira>	 Or install all language dictionaries in one image and all models use the same image. So we end up with 100 models that use 1 image.
[07:05:55] <kevinbazira>	 The second option is what we are currently running with and are happy to hear any suggestions that could help us trim further as we iterate.
[07:07:44] <elukey>	 not really easy, I suspected the spell packages were big, there seems to be not a lot of things that we can do. I think that we should present how we do images to the ServiceOps team as final review at some point, so we'll get a confirmation if what we are doing is ok (best practice wise) or if anything can be improved
[07:07:59] <elukey>	 I still need to ask about how to track python dependencies upgrades
[07:19:43] <elukey>	 kevinbazira: we currently pip-install without freezing deps right?
[07:20:09] <kevinbazira>	 yeah for sure, picking the brains of the ServiceOps team is a great idea.
[07:20:25] <elukey>	 I am asking since it may be good to formalize (in the future) package versions for pip, just to have a little bit more predictability
[07:20:31] <elukey>	 (about what ends up into an image)
[08:02:42] <kevinbazira>	 Yep, I agree. Having more predictability would be great.
[08:02:58] <kevinbazira>	 as we were setting up these images, we weren't sure what revscoring version and its dependencies were compatible with kfserving and its dependencies.
[08:03:06] <klausman>	 Are all the languages actually used (eventually)? Or might there be some dictionaries we can trim.
[08:03:21] <klausman>	 So still a "fat" layer with all languages, but only the ones we actually use
[08:03:30] <kevinbazira>	 so whenever we froze all dependencies, we would run into conflicts with dependencies that both revscoring and kfserving relied on e.g numpy
[08:04:00] <kevinbazira>	 at the moment, we are mainly freezing the kfserving dependency and rely on pip to help resolve dependency conflicts. e.g https://github.com/wikimedia/machinelearning-liftwing-inference-services/blob/main/revscoring/editquality/model-server/requirements.txt
[08:04:14] <kevinbazira>	 now that we are talking about this, I am thinking we could possibly freeze all dependencies after pip has helped resolve them. Will try this out and revert. 
[08:05:45] <elukey>	 +1 thanks!
[08:37:52] <kevinbazira>	 klausman: yep, the languages we end up not using will be removed.
[09:56:30] <elukey>	 very nice chat with Janis (serviceops), the kfserving charts seem ok, I need to change the TLS cert since I was wrong on one thing, will work on it asap
[10:33:27] * elukey lunch
[15:12:59] <accraze>	 o/
[15:13:11] <accraze>	 haha i knew elukey was going to ask about the image sizes :)
[15:14:30] <accraze>	 another part of that issue is that revscoring includes a TON of scientific libraries (sklearn/scipy/nltk/gensim/numpy/etc..) even though not all of them are used for inference
[15:15:37] <accraze>	 i mentioned earlier that we might be able to use a distro like miniconda as a base image which might make managing the python scientific stack a bit easier
[15:15:59] <accraze>	 but even then, revscoring is still kinda heavy
[15:16:58] <accraze>	 FWIW the pytorch server image that is used in KFServing is around the same size (~2GB) ...
[15:26:13] <chrisalbon>	 The images are large, but I am hesitant to work on optimizing the image sizes too early, unless serviceops is like "absolutely not you clowns". Likely better to get a model running and then worry about reducing image sizes when we could load it into lift wing and confirm the change we made to reduce image size works fine with KFServing
[15:26:21] <chrisalbon>	 Happy to be argued off the point though
[15:26:58] <chrisalbon>	 Also good morning
[15:27:03] <accraze>	 :)
[16:32:16] <wikibugs>	 10Machine-Learning-Team, 10Analytics: Update ROCm version on GPU instances. - https://phabricator.wikimedia.org/T287267 (10elukey)
[16:44:13] <chrisalbon>	 Amir1 would you have any idea when the SSO would include Wikitech? I remember you pasted a link to the ticket before. I am debating where to put documentation and Wikitech seems like the ideal choice, except it isn't part of the "one login" system which would be a blocker for some folks
[16:44:20] <wikibugs>	 10Machine-Learning-Team, 10Analytics: Update ROCm version on GPU instances. - https://phabricator.wikimedia.org/T287267 (10ACraze) +1 for upgrading ROCm to support ONNX runtime. It's certainly worth evaluating imo, as it seems that ONNX would help enable us to use an AMD GPU with any arbitrary ML-framework
[16:45:23] <majavah>	 chrisalbon: which SSO? you mean SUL (used by most public production wikis)?
[16:45:46] <wikibugs>	 10Lift-Wing, 10artificial-intelligence, 10articlequality-modeling, 10Machine-Learning-Team (Active Tasks): Create blubberfile for articlequality model server - https://phabricator.wikimedia.org/T287781 (10ACraze) 05Open→03Resolved
[16:45:52] <wikibugs>	 10Lift-Wing, 10artificial-intelligence, 10articlequality-modeling, 10Machine-Learning-Team (Active Tasks): Configure articlequality deployment pipeline - https://phabricator.wikimedia.org/T287786 (10ACraze)
[16:52:17] <Amir1>	 chrisalbon: yeah, I think it'll be quite far if you ask me
[16:52:46] <Amir1>	 chrisalbon: OTOH, moving things is not that hard, you can start with mw and move/redirect them to wikitech later
[16:56:42] <chrisalbon>	 I am deciding between MW and WT, and I keep changing my mind. One of the items would be model cards for every model we host, and I'd love for the ability of folks to not need a separate login to use the that model card's talk page to discuss an individual model. On the other hand, there is SO much document rot in MW that I am worried the UX for someone wanting to know more about WMF's ML will get lost in a sea of 
[16:56:42] <chrisalbon>	 outdated or silently deprecated documents
[16:57:12] <accraze>	 doc rot is a hard problem to solve!
[16:57:35] <chrisalbon>	 I think pywikibot and myself are going to become great friends
[16:57:40] <accraze>	 lol
[16:58:14] <chrisalbon>	 and it makes me really happy kevinbazira is on the team, since his knowledge is vastly greater than my own around frontend
[17:06:04] <elukey>	 kfserving charts merged!
[17:06:08] <elukey>	 of course they don't work
[17:06:25] <elukey>	 but it is some yaml horror, once resolved it may be good
[17:07:00] <elukey>	 accraze: thanks for the context on the docker image size, I am fine if it is needed, just wanted to follow up :)
[17:21:29] <elukey>	 ok the errors will have to wait tomorrow, gtg now, have a good rest of the day folks!
[17:25:01] <chrisalbon>	 It is the classic debate between one mega-image vs. lots of smaller images
[17:25:17] <chrisalbon>	 I've never done the mega-image strategy, but I know folks have done it
[19:26:44] <wikibugs>	 10Lift-Wing, 10artificial-intelligence, 10articlequality-modeling, 10Machine-Learning-Team (Active Tasks): Configure articlequality deployment pipeline - https://phabricator.wikimedia.org/T287786 (10ACraze) Added a pipeline config for articlequality in the inference-services repo, next we'll need to update...