[06:50:25] Machine-Learning-Team, Language-Team, Epic: Migrate Content Translation Recommendation API to Lift Wing - https://phabricator.wikimedia.org/T308164 (elukey) We are working on T338471 to figure out if the old recommendation-api service can be deprecated.
[08:17:58] hi folks!
[08:18:08] one interesting thing that I got from upstream - https://kserve.github.io/website/0.10/modelserving/data_plane/v1_protocol/
[08:18:17] we are not really using the v1 protocol
[08:19:09] the "instances" field is not used, and that is the one that the batcher aggregates
[09:10:45] Do you think it's worth the disruption to change all the request schemas?
[09:10:57] (also: Morning :))
[09:13:58] morning! No idea, in theory we can do anything in the custom predictor, for example checking if "instances" is present or not etc..
[09:14:19] I asked more questions about how people use the batcher on the upstream slack, hopefully I'll get some info
[09:14:23] I'll report back :)
[09:14:54] need to go to a doc appt, I hope to be back in an hour but it may be more due to queues etc.. (if there is a delay my appointment will be pushed down the line)
[09:16:31] Machine-Learning-Team, serviceops: Replace the current recommendation-api service with a newer version - https://phabricator.wikimedia.org/T338471 (kevinbazira)
[09:19:11] elukey: ttyl
[09:28:26] o/
[09:30:39] it is nice to have some standards regarding input, but I think we can discuss if we want to do this for the model servers we already have
[09:45:11] Machine-Learning-Team: Containerize Content Translation Recommendation API - https://phabricator.wikimedia.org/T338805 (kevinbazira) We've been able to wrangle the dependencies ([[ https://github.com/wikimedia/research-recommendation-api/blob/master/setup.py | 1 ]], [[ https://github.com/wikimedia/research-r...
[09:52:33] morning :)
[10:30:41] \o
[10:30:49] going for lunch, will bbiab
[10:41:40] (CR) Ilias Sarantopoulos: [C: +1] revert-risk: change output schema and add model version (3 comments) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929175 (owner: AikoChou)
[11:16:18] back, as I suspected there was a huge queue :D
[11:16:53] isaranto: definitely yes, there is also the v2 api that they are rolling out (IIUC not compatible with batching yet though)
[11:17:08] maybe at some point in the future we'll also need to migrate to it
[11:19:41] when we migrate to v2 it would be nice to consider supporting gRPC as well, as it may speed things up in some cases
[11:21:23] could be interesting yes, I am wondering if users will like to have it or not
[11:21:36] (never really worked with grpc-based apis etc..)
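For context on the 08:18–09:13 exchange above: the KServe v1 protocol linked there expects request bodies of the form {"instances": [...]}, and "instances" is the field the built-in batcher aggregates across requests. Below is a minimal sketch (not the actual Lift Wing code) of the kind of check mentioned at [09:13:58], where a custom predictor would accept both shapes; the field names ("rev_id", "lang") are illustrative, not the real schema.

```python
# Sketch only: accept both the KServe v1 "instances" envelope and the flat
# payload the current model servers use. Field names are illustrative.
from typing import Any, Dict, List


def extract_inputs(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return a list of input dicts regardless of which request shape was sent."""
    if "instances" in payload:
        # v1 protocol: {"instances": [{...}, ...]} -- the field the KServe
        # batcher knows how to aggregate and split again per response.
        return payload["instances"]
    # Flat payload, e.g. {"rev_id": 12345, "lang": "en"}: treat as one instance.
    return [payload]


# extract_inputs({"instances": [{"rev_id": 12345}]}) -> [{"rev_id": 12345}]
# extract_inputs({"rev_id": 12345})                  -> [{"rev_id": 12345}]
```

Whether a shim like this is worth the disruption for the existing model servers is the open question raised at [09:10:45] and [09:30:39].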
[11:26:29] * elukey quick luch
[11:26:32] * elukey quick lunch
[11:28:47] artificial-intelligence, Technical-Tool-Request: Automatic summarization of articles - https://phabricator.wikimedia.org/T127038 (01tonythomas) Sorry to ping on this very old one, but we have something going on in: T336692
[11:33:24] (PS12) Ilias Sarantopoulos: feat: use Lift Wing instead of ORES [extensions/ORES] - https://gerrit.wikimedia.org/r/926420 (https://phabricator.wikimedia.org/T319170)
[11:40:27] (CR) Ilias Sarantopoulos: [C: +2] feat: use Lift Wing instead of ORES [extensions/ORES] - https://gerrit.wikimedia.org/r/926420 (https://phabricator.wikimedia.org/T319170) (owner: Ilias Sarantopoulos)
[11:40:51] (CR) Ilias Sarantopoulos: feat: use Lift Wing instead of ORES [extensions/ORES] - https://gerrit.wikimedia.org/r/926420 (https://phabricator.wikimedia.org/T319170) (owner: Ilias Sarantopoulos)
[11:41:10] (PS13) Ilias Sarantopoulos: feat: use Lift Wing instead of ORES [extensions/ORES] - https://gerrit.wikimedia.org/r/926420 (https://phabricator.wikimedia.org/T319170)
[11:41:39] (PS43) Ilias Sarantopoulos: feat: hardcode threshold calls to switch to Lift Wing [extensions/ORES] - https://gerrit.wikimedia.org/r/915541 (https://phabricator.wikimedia.org/T319170)
[12:18:07] * elukey commutes to the office
[12:19:03] (PS1) DCausse: Unify the meta subfield in events [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929690
[12:27:48] (CR) DCausse: "In a followup patch I'd like to introduce the top-level dt field as suggested in T267648 and propagate the event-time of the source events" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929690 (owner: DCausse)
[12:44:05] isaranto, aiko, kevinbazira_ - I was checking perms to +2 on deployment-charts, and IIUC all members of 'deployers' posix group (in puppet) should be able to do it
[12:44:12] and you are in the group
[12:44:17] have you tried recently to +2
[12:44:18] ?
[12:44:55] ah no maybe you need to be added to the gerrit group
[12:50:02] just checked and I don't have +2
[12:50:03] Machine-Learning-Team, Gerrit-Privilege-Requests: Add isaranto, kevinbazira and aikochou shell users to the wmf-deployment - https://phabricator.wikimedia.org/T338947 (elukey)
[12:50:08] created --^
[12:50:19] in theory after it you should be unblocked
[12:50:31] so SREs will +1 and you'll be free to +2/merge as always
[12:50:54] thank uuu
[12:51:56] Machine-Learning-Team, Gerrit-Privilege-Requests: Add isaranto, kevinbazira and aikochou shell users to the wmf-deployment - https://phabricator.wikimedia.org/T338947 (taavi) Open→Resolved a:taavi Done.
[12:52:37] already done wow
[12:52:43] verified that I now do have +2
[12:57:19] With great +2 comes great responsibility :)
[12:57:43] klausman: from now on we just +1 on deployment-charts ok?
[13:00:35] I'll also write to slack
[13:08:26] Ack, sgtm
[13:17:42] Machine-Learning-Team, serviceops, Patch-For-Review: Replace the current recommendation-api service with a newer version - https://phabricator.wikimedia.org/T338471 (akosiaris) >>! In T338471#8914650, @elukey wrote: > Thanks for the info! > >>>! In T338471#8914520, @akosiaris wrote: >> Oh I forgot t...
[13:26:58] Machine-Learning-Team, serviceops, Patch-For-Review: Replace the current recommendation-api service with a newer version - https://phabricator.wikimedia.org/T338471 (elukey) >>! In T338471#8927496, @akosiaris wrote: >>>! In T338471#8914650, @elukey wrote: >> Thanks for the info! >> >>>>! In T338471#...
[13:31:19] (CR) Elukey: "Wow first commit outside ML! \o/" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929690 (owner: DCausse)
[13:31:51] (CR) Elukey: Unify the meta subfield in events (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929690 (owner: DCausse)
[14:01:59] this is nice! https://github.com/bentoml/OpenLLM
[14:02:41] bentoML is a platform/software like kserve. It would be great to examine something similar for kserve
[14:11:13] how does it differ?
[14:18:58] (CR) DCausse: "thanks for the review Luca!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929690 (owner: DCausse)
[14:19:08] (PS2) DCausse: Unify the meta subfield in events [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929690
[14:20:15] (CR) Ottomata: Unify the meta subfield in events (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929690 (owner: DCausse)
[14:20:49] (CR) Ottomata: Unify the meta subfield in events (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929690 (owner: DCausse)
[14:23:25] Machine-Learning-Team, serviceops, Patch-For-Review: Replace the current recommendation-api service with a newer version - https://phabricator.wikimedia.org/T338471 (elukey) a:elukey
[14:26:41] Machine-Learning-Team, ORES, Patch-For-Review, Platform Team Initiatives (New Hook System): Update ORES to use the new HookContainer/HookRunner system - https://phabricator.wikimedia.org/T338444 (elukey) a:isarantopoulos
[14:52:14] bentoML is just a different tool - I mean it is doing the same job
[14:53:22] the openLLM library is a nice approach and it is similar to what we do with our model servers, they just have put everything nicely in a python package that you can easily use and then integrated with bentoML
[15:16:44] Ah, so just a different flavor. But diversity is good!
[15:37:56] (PS1) DCausse: events: propagate the event time with the dt field [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929735 (https://phabricator.wikimedia.org/T267648)
[15:38:55] (CR) Elukey: Unify the meta subfield in events (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929690 (owner: DCausse)
[15:44:20] (CR) DCausse: Unify the meta subfield in events (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929690 (owner: DCausse)
[15:52:08] isaranto: didn't check before but https://grafana.wikimedia.org/d/ZAX3zaIWz/amd-rocm-gpu?orgId=1&var-source=eqiad%20prometheus%2Fops&var-instance=ml-serve1001:9100&from=now-2d&to=now
[15:52:54] bloom-3b seems to use ~75% of the VRAM available
[15:53:15] so I guess that falcon likely exceeds the total available..
[15:53:29] I am wondering though if in the pytorch code we can see an error from the GPU
[15:53:38] I didn't find anything in the pytorch docs
[15:53:55] like, if you fail to load a model to the gpu, it would be nice to get an exception or similar
[16:01:09] Heading out for the day, folks \o
[16:01:37] o/
[16:03:36] afk as well, have a nice rest of the day folks!
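On the [15:53] question about seeing a GPU error from PyTorch: in recent releases a failed allocation during model.to(device) does raise torch.cuda.OutOfMemoryError (a RuntimeError subclass, available from PyTorch 1.13 onwards), and the same torch.cuda namespace is reused on ROCm builds for AMD GPUs, so the server could catch it and surface a clear error instead of failing opaquely. A rough sketch under those assumptions; the device string and log message are purely illustrative.

```python
# Sketch only: turn a failed GPU load into an explicit, logged error.
# Assumes PyTorch >= 1.13 (torch.cuda.OutOfMemoryError); on ROCm builds the
# torch.cuda namespace is used for AMD GPUs as well.
import logging

import torch

logger = logging.getLogger(__name__)


def load_model_to_gpu(model: torch.nn.Module, device: str = "cuda:0") -> torch.nn.Module:
    try:
        return model.to(device)
    except torch.cuda.OutOfMemoryError:
        # mem_get_info() reports (free, total) VRAM in bytes on the current device.
        free, total = torch.cuda.mem_get_info()
        logger.exception(
            "Model does not fit on %s (%.1f GB free of %.1f GB)",
            device, free / 1e9, total / 1e9,
        )
        raise
```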
[16:13:44] o/
[16:19:34] (PS1) Ilias Sarantopoulos: feat: add Response Models in ores-legacy API [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929743 (https://phabricator.wikimedia.org/T330414)
[16:19:42] (CR) CI reject: [V: -1] feat: add Response Models in ores-legacy API [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929743 (https://phabricator.wikimedia.org/T330414) (owner: Ilias Sarantopoulos)
[16:21:37] (PS2) Ilias Sarantopoulos: feat: add Response Models in ores-legacy API [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929743 (https://phabricator.wikimedia.org/T330414)
[16:22:19] going afk as well o/
[16:23:14] (CR) Ottomata: Unify the meta subfield in events (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/929690 (owner: DCausse)
[19:47:21] Machine-Learning-Team, ORES, Advanced-Search, All-and-every-Wikisource, and 64 others: Remove unnecessary targets definitions - https://phabricator.wikimedia.org/T328497 (SBisson)