[07:33:00] good morning folks o/
[08:31:30] Machine-Learning-Team: Investigate kserve 0.13.0 upgrade - https://phabricator.wikimedia.org/T367048#10287468 (isarantopoulos) encountered the following issue when running revscoring: ` Traceback (most recent call last): File "/srv/rev/revscoring_model/model.py", line 5, in import kserve Fi...
[08:34:26] (PS3) Ilias Sarantopoulos: revscoring: upgrade kserve to 0.13.1 [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1085625
[08:35:02] (PS4) Ilias Sarantopoulos: revscoring: upgrade kserve to 0.13.1 [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1085625 (https://phabricator.wikimedia.org/T367048)
[08:56:33] (CR) Kevin Bazira: [C: +1] revscoring: upgrade kserve to 0.13.1 [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1085625 (https://phabricator.wikimedia.org/T367048) (owner: Ilias Sarantopoulos)
[09:01:13] Machine-Learning-Team, Goal: Goal 2: People outside the ML team can ssh into an ml-lab machine, run a Jupyter Notebook, and run PyTorch powered by a GPU. - https://phabricator.wikimedia.org/T371396#10287519 (isarantopoulos)
[09:01:14] Machine-Learning-Team: [ml-lab] Use a (jupyter) notebook and load a LLM from huggingface - https://phabricator.wikimedia.org/T377574#10287521 (isarantopoulos)
[09:01:15] Machine-Learning-Team: ml-lab should have documentation - https://phabricator.wikimedia.org/T376974#10287522 (isarantopoulos)
[09:11:32] I'll be deploying the revscoring changes to ml-staging for more testing (run the httpbb tests etc.)
[09:11:38] (CR) Ilias Sarantopoulos: [C: +2] revscoring: upgrade kserve to 0.13.1 [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1085625 (https://phabricator.wikimedia.org/T367048) (owner: Ilias Sarantopoulos)
[09:12:20] (Merged) jenkins-bot: revscoring: upgrade kserve to 0.13.1 [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1085625 (https://phabricator.wikimedia.org/T367048) (owner: Ilias Sarantopoulos)
[09:29:48] here it is -> https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1087130
[09:35:10] isaranto: o/ as FYI https://phabricator.wikimedia.org/T279621
[09:35:40] "apus" is what was previously called MOSS, basically a proper object storage outside thanos
[09:36:10] I'd follow up with Data Persistence to migrate the ML models over
[09:36:27] o/ elukey thanks for bringing this up!
[09:36:35] in theory the space requested is not big atm, but they will probably ask for some ballpark numbers for the future etc.
[09:37:04] and since it offers an S3 API, all the model servers should behave just fine after the move
[09:44:21] elukey: I need to catch up on my reading and updates on ceph. iiuc apus is the layer on top of ceph that provides the S3-compatible API. is that right?
[09:44:54] exactly, yes
[09:45:10] thanos has swift behind the scenes, apus has ceph
[09:45:21] but you'll deal with S3 APIs, so it shouldn't change anything
[09:46:17] ack, thank you!
[09:55:33] morning!
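A minimal sketch of the model-fetch pattern behind the apus discussion above (09:35-09:46): because both thanos-swift and apus expose an S3 API, a model server only needs the endpoint, bucket, and credentials to change. The endpoint URL, bucket, and object key below are hypothetical placeholders, not the actual production values.

    # Sketch only: fetch a model binary over an S3-compatible API with boto3.
    # Endpoint/bucket/key names are hypothetical; only the endpoint URL
    # (thanos-swift vs. apus/ceph) would differ after a migration.
    import os
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url=os.environ.get("STORAGE_ENDPOINT", "https://apus.example.wmnet"),  # hypothetical
        aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    )

    # Download the model file to the path the model server reads it from.
    s3.download_file(
        Bucket="wmf-ml-models",                    # hypothetical bucket
        Key="revscoring/editquality/model.bin",    # hypothetical object key
        Filename="/mnt/models/model.bin",
    )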
[09:59:34] o/ aiko
[10:40:50] Machine-Learning-Team: Run unit tests for the inference-services repo in CI - https://phabricator.wikimedia.org/T360120#10287964 (kevinbazira) Open→In progress a: kevinbazira
[10:44:25] (PS1) Kevin Bazira: test: automate running unit tests in the LW repo [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1087146 (https://phabricator.wikimedia.org/T360120)
[10:45:30] (CR) CI reject: [V: -1] test: automate running unit tests in the LW repo [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1087146 (https://phabricator.wikimedia.org/T360120) (owner: Kevin Bazira)
[10:45:42] Morning!
[10:48:19] Morning!
[11:09:29] Machine-Learning-Team: Update output schema for reference risk model - https://phabricator.wikimedia.org/T378939 (achou) NEW
[11:13:24] (CR) Kevin Bazira: "recheck" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1087146 (https://phabricator.wikimedia.org/T360120) (owner: Kevin Bazira)
[11:19:58] (PS2) Kevin Bazira: test: automate running unit tests in the LW repo [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1087146 (https://phabricator.wikimedia.org/T360120)
[11:20:53] Machine-Learning-Team: Update output schema for reference risk model - https://phabricator.wikimedia.org/T378939#10288085 (achou) Hi @FNavas-foundation, I would like to confirm if the enterprise team is fine with this option. Thank you!
[11:27:14] (PS1) Nik Gkountas: Use random sorting only for topic-based recommendations [research/recommendation-api] - https://gerrit.wikimedia.org/r/1087156 (https://phabricator.wikimedia.org/T377124)
[11:28:14] (PS2) Nik Gkountas: Use random sorting only for topic-based recommendations [research/recommendation-api] - https://gerrit.wikimedia.org/r/1087156 (https://phabricator.wikimedia.org/T377124)
[11:35:26] (CR) Kevin Bazira: "I tested this locally by running:" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1087146 (https://phabricator.wikimedia.org/T360120) (owner: Kevin Bazira)
[12:05:31] (CR) Ilias Sarantopoulos: "I think it would be better that we figure out a solution for all model servers and have one command that can be enabled/disabled and take " [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1087146 (https://phabricator.wikimedia.org/T360120) (owner: Kevin Bazira)
[12:21:27] articlequality on ml-staging overrides the prod image, so I filed another patch to update it: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1087157
[12:21:42] I'm deploying all the revscoring services in staging to test them
[13:44:30] Machine-Learning-Team: Update output schema for reference risk model - https://phabricator.wikimedia.org/T378939#10288525 (FNavas-foundation) @achou yes that's sensible! thank you for alerting me
[14:01:44] Machine-Learning-Team, Data-Engineering, Research, Event-Platform: Expose revision revert risk scores in EventStreams - https://phabricator.wikimedia.org/T326179#10288798 (Ottomata)
[14:48:17] good morning all
[14:52:33] hi Chris!
[15:30:24] kevinbazira: thanks for working on automating the CI tests!
I'm here if you want to discuss my review above so that we can figure out a way to do this, if possible
[15:39:00] (CR) Eamedina: [C: +1] Use random sorting only for topic-based recommendations [research/recommendation-api] - https://gerrit.wikimedia.org/r/1087156 (https://phabricator.wikimedia.org/T377124) (owner: Nik Gkountas)
[15:39:44] isaranto: thanks for the review. happy to discuss this. I've added my thoughts in a comment: https://gerrit.wikimedia.org/r/c/machinelearning/liftwing/inference-services/+/1087146/comments/b1a52242_837aa900
[15:40:37] thanks, I missed that!
[15:42:01] I'm trying to see if there is a way to avoid having different configuration per model server, in order to make onboarding new models easier
[15:42:34] yeah, that would be great. I thought about it too.
[15:44:06] given that each model-server has its own dependencies and likely its own tests, it might help to configure separate tox test envs for each model-server.
[15:47:11] yes, but the requirements file is defined in blubber, so we could use a common structure in blubber to define one command for running tests in all envs
[15:47:46] then we could add specific tox configuration on a per-need basis, so this would be the exception
[15:54:31] ok, in the blubber test variant we would have to:
[15:54:31] 1. install a specific model-server's dependencies
[15:54:31] 2. use entrypoint.sh to run tests, e.g. tox -e new_model_server
[15:54:31] wdyt?
[15:58:43] 1. yes
[15:58:43] 2. it could just be one command for all model servers, e.g. tox -e run-ci-with-tests, that runs pre-commit along with the tests
[16:00:12] since pytest searches directories, we won't need to specify dirs, and any customization can happen via blubber (copy the required files to the required dir)
[16:01:10] it has really been a while since I tried this on this patch https://gerrit.wikimedia.org/r/c/machinelearning/liftwing/inference-services/+/982396 and I don't remember the steps needed for it to work
[16:02:38] if you also agree on the approach, you could try something out and let me know if you encounter any issues.
[16:15:29] this approach would be great.
[16:15:29] 1. ok
[16:15:29] 2.1. if we don't specify dirs, won't tests for other model-servers fail, or do we plan to copy only the tests for a specific model-server into the test variant?
[16:15:29] 2.2. in case we continue with the path of specifying dirs, we could achieve it in blubber by passing a dir_name to tox, e.g. pytest test/unit/{env:DIR_NAME}
[16:17:42] (PS21) Nik Gkountas: Support Default collections [research/recommendation-api] - https://gerrit.wikimedia.org/r/1072175 (https://phabricator.wikimedia.org/T374597) (owner: Santhosh)
[16:18:25] (CR) CI reject: [V: -1] Support Default collections [research/recommendation-api] - https://gerrit.wikimedia.org/r/1072175 (https://phabricator.wikimedia.org/T374597) (owner: Santhosh)
[16:20:23] (CR) Sbisson: [C: +2] Use random sorting only for topic-based recommendations [research/recommendation-api] - https://gerrit.wikimedia.org/r/1087156 (https://phabricator.wikimedia.org/T377124) (owner: Nik Gkountas)
[16:20:35] for 2.1, we only copy specific stuff into our blubber images, so that would be ok.
[16:21:03] (Merged) jenkins-bot: Use random sorting only for topic-based recommendations [research/recommendation-api] - https://gerrit.wikimedia.org/r/1087156 (https://phabricator.wikimedia.org/T377124) (owner: Nik Gkountas)
[16:21:53] ok, I'll implement 2.1 tomorrow.
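A minimal sketch of the single tox entry point discussed above (15:54-16:15), assuming hypothetical env-variable names (REQUIREMENTS_FILE, DIR_NAME) that the blubber test variant would set per model server; this is not the repo's actual configuration.

    # Hypothetical tox.ini sketch: one env that runs pre-commit plus whatever
    # tests blubber copied into the image. DIR_NAME and REQUIREMENTS_FILE are
    # placeholder names; with DIR_NAME unset, pytest discovers everything
    # present under test/unit/.
    [testenv:run-ci-with-tests]
    skip_install = true
    deps =
        pre-commit
        pytest
        -r {env:REQUIREMENTS_FILE:requirements-test.txt}
    commands =
        pre-commit run --all-files
        pytest test/unit/{env:DIR_NAME:}

With something like this in place, every image's entrypoint could invoke the same command, e.g. tox -e run-ci-with-tests, regardless of which model server it was built for.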
[16:21:56] there is the issue of what happens with common libraries, e.g. if you change something in the python utils.
[16:22:07] 🙌
[16:22:43] sure, let's iterate on the idea, I'm pretty sure some issues will appear.
[16:22:53] have a nice evening o/
[16:23:13] yeah ... the python utils will have to be copied to the test variant as well
[16:23:29] np! have a good evening too!
[16:31:23] on another topic: I noticed huggingface has updated the quantization page https://huggingface.co/docs/transformers/v4.46.0/quantization/overview
[16:31:46] it has a nice table that includes rocm and apple silicon availability for each library
[16:34:23] I stumbled into some issues trying to use them on the ml-lab machines today. I'll try to do it in a more organized way in the following days so that I can report concrete findings
[16:34:40] going afk, have a nice evening folks o/
[17:30:14] Lift-Wing, Machine-Learning-Team, OKR-Work: Request to host article-country model on Lift Wing - https://phabricator.wikimedia.org/T371897#10290006 (Isaac) > We have: removed support for QID input, initialized claims as a dict, added support for async API calls @kevinbazira thanks! Schemas are looki...
[18:32:04] (PS22) Nik Gkountas: Support Default collections [research/recommendation-api] - https://gerrit.wikimedia.org/r/1072175 (https://phabricator.wikimedia.org/T374597) (owner: Santhosh)