[07:02:54] (CR) Kevin Bazira: [C: +1] "Tested RRML locally and it works fine both before and after the upgrade." [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/998946 (https://phabricator.wikimedia.org/T347551) (owner: Ilias Sarantopoulos)
[07:31:31] Machine-Learning-Team: Maintain models directory structure for model-server make builds to remain consistent with the analytics repo - https://phabricator.wikimedia.org/T356985 (kevinbazira) In progress→Resolved Configurations were added to the Makefile and now it maintains the models directory struc...
[07:31:33] Machine-Learning-Team: Add a script for running the Revert Risk model server locally - https://phabricator.wikimedia.org/T352689 (kevinbazira)
[07:41:33] Good morning!
[07:42:31] (CR) Ilias Sarantopoulos: [C: +2] rrml: upgrade kserve to 0.11.2 [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/998946 (https://phabricator.wikimedia.org/T347551) (owner: Ilias Sarantopoulos)
[07:47:46] (Merged) jenkins-bot: rrml: upgrade kserve to 0.11.2 [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/998946 (https://phabricator.wikimedia.org/T347551) (owner: Ilias Sarantopoulos)
[08:46:07] new catboost update - the new release is expected this week :) https://github.com/catboost/catboost/discussions/2592#discussioncomment-8410418
[10:22:58] Nice!
[10:23:11] * klausman early lunch
[11:02:49] Lift-Wing, Machine-Learning-Team: Debug GPU deployments on ml-staging - https://phabricator.wikimedia.org/T356038 (isarantopoulos) I have had no luck using dumb-init at the moment. Instead I opened an issue on kserve to add the ability to hot reload the model server when new changes are made https://gith...
[11:12:00] I opened this feature request for kserve. https://github.com/kserve/kserve/issues/3420
[11:12:13] I'd be happy to work on it as well!
[11:14:34] when someone has some time - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/999570
[11:21:15] * isaranto lunch
[11:53:13] isaranto: o/ the RRML image update LGTM! I've +1'd.
[12:32:02] thanks Kevin!
[13:18:56] isaranto: https://huggingface.co/allenai/OLMo-7B this is really cool
[13:19:41] https://arxiv.org/pdf/2402.00838.pdf
[13:20:24] IIUC they also release the training dataset, all open
[13:20:45] grazie signore!
[13:21:24] I love the naming of the dolma dataset
[13:22:31] Also just checked and dolma uses Wikipedia and Wikibooks
[13:23:51] isaranto: Parakalò (is that right??)
[13:24:44] exactly!
[13:24:57] παρακαλώ (with gr characters)
[13:25:36] :D
[14:29:56] btw kserve is moving comms from the kubeflow slack to CNCF slack
[14:35:09] this is just part of the Kubeflow CNCF onboarding https://github.com/cncf/toc/issues/1139
[15:17:03] * klausman out for a quick errand (back in 20m or so)
[15:48:08] Machine-Learning-Team, Patch-For-Review: Upgrade Revert Risk Multilingual docker images to KServe 0.11 - https://phabricator.wikimedia.org/T347551 (isarantopoulos) Some load test results with wrk on staging: I ran the same ones we previously did. - Run for 10s ` isaranto@deploy2002:~/inference-services/...
[15:51:36] morning all
[15:51:43] I'm exhausted
[15:52:51] heyo Chris.
[15:53:00] I guess traveling and end-to-end meetings will do that
[15:53:09] true
[16:04:57] Hey Chris!
[16:05:01] yo!
[16:05:10] hope you have time to relax this weekend!
[16:05:15] same
[17:22:01] going afk folks o/ wish you all a great weekend <3
[17:25:00] \o
[17:25:08] have a great weekend, Ilias
[17:55:48] Welp, that's five hours and a bit of me ruining cassandra databases (here, locally, don't worry :)). so I'm gonna head as well