[06:28:06] good morning o/
[07:26:08] good morning o/
[07:46:12] hey Aiko!
[07:48:41] I've found something regarding the GPU VRAM. I'm running some basic load tests on gemma2-8b from statbox (various tests with 1-2 users for 3-10 minutes) and VRAM usage explodes https://grafana.wikimedia.org/goto/g7t5UQuSg?orgId=1
[07:49:08] there is an already unexpected bump from 17GB to 20GB, but then it goes to 50+GB
[07:49:24] I'll have a bit more info in the standup
[07:54:00] isaranto: I slept well :)
[07:54:03] good morning
[07:54:17] aiko: good morning
[07:54:25] :D
[08:18:11] kevinbazira: morning! :D
[08:35:02] 10Lift-Wing, 06Machine-Learning-Team, 13Patch-For-Review: [LLM] add locust entry for huggingfaceserver - https://phabricator.wikimedia.org/T370992#10017218 (10isarantopoulos) I ran a load test with the following setup: duration: 10 minutes users: 2 output_size(max_tokens): 10-200 prompt_input_size (# words):...
[08:49:28] Morning!
[08:50:23] Good morning!
[10:14:35] this is nice https://openai.com/index/searchgpt-prototype/
[10:14:42] kevinbazira: seems like wikigpt :)
[10:15:04] * isaranto afk lunch
[10:25:22] https://www.irccloud.com/pastebin/DBihuFDl
[10:26:00] * klausman lunch
[10:26:40] isaranto: yep it does :)
[10:26:47] more competition for https://www.perplexity.ai/
[12:52:48] 10Lift-Wing, 06Machine-Learning-Team, 13Patch-For-Review: [LLM] add locust entry for huggingfaceserver - https://phabricator.wikimedia.org/T370992#10017759 (10isarantopoulos) It seems that the above behavior with the increased memory usage is a standard thing. I redeployed the service and was using 18GB of...
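The load-test setup described in the Phabricator comment above (2 users, max_tokens between 10 and 200, variable prompt sizes) could be parameterized roughly as follows. This is a hypothetical sketch, not the actual locust entry from the patch: the payload field names and the prompt-generation helper are illustrative assumptions.

```python
import json
import random

def make_payload(rng: random.Random, model: str = "gemma2-8b") -> dict:
    """Build one request body with randomized input/output sizes,
    mirroring the ranges mentioned in the load-test setup above."""
    n_words = rng.randint(50, 500)           # prompt_input_size (# words), illustrative range
    prompt = " ".join(["word"] * n_words)    # placeholder prompt text
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": rng.randint(10, 200),  # output_size(max_tokens): 10-200
    }

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so test runs are reproducible
    for _ in range(5):
        p = make_payload(rng)
        # print everything except the bulky prompt body
        print(json.dumps({k: v for k, v in p.items() if k != "prompt"}))
```

A real locust entry would wrap `make_payload` in a task that POSTs to the inference endpoint; varying `max_tokens` per request is what exercises the VRAM growth being investigated.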
[12:59:57] I'm going to deploy another model, aya-23-8B, and see if this memory behavior persists - https://huggingface.co/CohereForAI/aya-23-8B
[13:00:22] (03CR) 10Ilias Sarantopoulos: [C:03+2] docs: add info how to use a newly released hf model [machinelearning/liftwing/inference-services] - 10https://gerrit.wikimedia.org/r/1056504 (owner: 10Ilias Sarantopoulos)
[13:00:42] (03CR) 10Ilias Sarantopoulos: [V:03+2 C:03+2] docs: add info how to use a newly released hf model [machinelearning/liftwing/inference-services] - 10https://gerrit.wikimedia.org/r/1056504 (owner: 10Ilias Sarantopoulos)
[13:07:58] klausman: o/ as an FYI, the reimage cookbook has a new flag called "--force-dhcp-tftp" that forces pxelinux.0 and tftp
[13:08:08] it could be useful for the new Supermicro nodes
[13:08:11] oh, nice, ty!
[13:08:21] I was wondering about that :)
[13:11:19] If we ever _need_ a version of pxelinux.0 that works with the broken FW, I still have my hacky patch
[13:11:48] I think that the best long-term action is probably to test EFI
[13:12:12] Ack.
[13:12:26] buuut a lot of work :)
[13:12:37] I considered giving that a go with the new ML hosts, but I think it's better to try it with the test machine :)
[13:13:19] Basically, a time-limited test with the SMC machine, and if it doesn't work after an afternoon of work (even just as a POC), abandon it
[13:14:02] mmm, not sure if it could work with the current tftp setup that we have
[13:14:12] we'd need the EFI binaries, right? Expose them, etc.
[13:14:18] Yeah.
[13:14:36] plus the right dhcp config etc. (now spicerack offers a way to select filename and options)
[13:14:40] (so it should be easier)
[13:14:42] I have plans to try EFI-boot-from-net in my homelab sometime soon.
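The "right dhcp config" piece of the EFI-netboot discussion above boils down to handing BIOS and EFI clients different boot filenames. A minimal sketch of that selection logic, keyed on the PXE client-architecture codes from DHCP option 93 (RFC 4578) — the EFI filenames here are Debian-style examples, not the actual WMF tftp layout:

```python
# Map DHCP option 93 (client system architecture) to a boot file.
# Codes per RFC 4578; filenames are illustrative examples.
ARCH_TO_BOOTFILE = {
    0: "pxelinux.0",    # x86 BIOS (legacy PXE)
    7: "bootx64.efi",   # EFI byte code
    9: "bootx64.efi",   # EFI x86-64
}

def boot_filename(client_arch: int) -> str:
    """Pick the boot file to hand out based on the client's PXE arch code."""
    try:
        return ARCH_TO_BOOTFILE[client_arch]
    except KeyError:
        raise ValueError(f"unsupported PXE client architecture: {client_arch}")
```

Legacy BIOS clients keep getting pxelinux.0, EFI clients get an EFI binary exposed over tftp; spicerack's ability to select filename and options (mentioned above) is what would make wiring this in easier.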
At least to get familiar with all the components
[13:22:00] (03PS3) 10Ilias Sarantopoulos: (WIP) locust entry for hf [machinelearning/liftwing/inference-services] - 10https://gerrit.wikimedia.org/r/1056872 (https://phabricator.wikimedia.org/T370992)
[13:26:03] 06Machine-Learning-Team, 10Structured-Data-Backlog (Current Work): Estimate the logo detection service's expected load - https://phabricator.wikimedia.org/T370756#10017802 (10mfossati)
[13:30:49] 06Machine-Learning-Team, 10Structured-Data-Backlog (Current Work): Estimate the logo detection service's expected load - https://phabricator.wikimedia.org/T370756#10017805 (10mfossati) 05In progress→03Resolved @isarantopoulos @kevinbazira @klausman , please feel free to re-open if these numbers aren't...
[13:33:47] (03PS4) 10Ilias Sarantopoulos: locust: entry for hf [machinelearning/liftwing/inference-services] - 10https://gerrit.wikimedia.org/r/1056872 (https://phabricator.wikimedia.org/T370992)
[13:44:46] 06Machine-Learning-Team, 10MW-1.43-notes (1.43.0-wmf.12; 2024-07-02), 13Patch-For-Review, 10Structured-Data-Backlog (Current Work): [SPIKE] Send an image thumbnail to the logo detection service within Upload Wizard - https://phabricator.wikimedia.org/T364551#10017840 (10isarantopoulos) @mfossati after our...
[13:46:14] if anyone has time I'd like a review to deploy the aya23 model. thanks!
https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1057207
[13:47:11] actually I'll do it by manually editing the isvc in experimental, but I opened the patch so the dep-charts are up to date
[14:11:09] Morning all
[14:11:38] good morning o/
[14:16:28] \o
[14:22:19] (03CR) 10Kevin Bazira: [C:03+1] locust: entry for hf [machinelearning/liftwing/inference-services] - 10https://gerrit.wikimedia.org/r/1056872 (https://phabricator.wikimedia.org/T370992) (owner: 10Ilias Sarantopoulos)
[14:35:27] thanks for the reviews :D
[14:36:03] aiko: the thing I mentioned about creating a script to simply run a model on liftwing with a different entrypoint seems like it would be tricky
[14:36:56] I can do it for one model, but if we want to be able to load any hf model then we would need to read the transformers model class from config.json and load the specified class.
[14:37:31] more or less we would end up doing what the huggingfaceserver does, so not a great idea from my side
[14:38:03] but I'm going to open a task so that we can find the easiest way to play around easily and fast
[14:38:30] I do believe a jupyter notebook would still be the best way, but not on LW ofc
[15:23:30] the memory increase doesn't seem to happen with aya23. memory usage is quite stable
[15:26:33] 06Machine-Learning-Team, 10Structured-Data-Backlog (Current Work): [M] Create the logo detection model card - https://phabricator.wikimedia.org/T370759#10018325 (10MarkTraceur)
[15:37:53] going afk folks, have a nice weekend!
[16:06:14] isaranto: ack!
[16:37:12] logging off as well! have a nice weekend folks :)
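The "read the transformers model class from config.json" idea from the 14:36:56 message above could be sketched roughly as follows. This is an assumption-laden sketch, not the huggingfaceserver's actual logic: it relies on the fact that HF model repos ship a config.json whose "architectures" field names a class exported by the transformers package (e.g. "Gemma2ForCausalLM"), and defers the transformers import so the parsing part stands alone.

```python
import json

def architecture_from_config(config_text: str) -> str:
    """Return the first architecture class name listed in a config.json."""
    config = json.loads(config_text)
    architectures = config.get("architectures", [])
    if not architectures:
        raise ValueError("config.json lists no architectures")
    return architectures[0]

def load_model_class(config_text: str):
    """Resolve the named class from the transformers package (if installed)."""
    import transformers  # deferred: only needed when actually loading a model
    return getattr(transformers, architecture_from_config(config_text))

if __name__ == "__main__":
    sample = '{"architectures": ["Gemma2ForCausalLM"], "model_type": "gemma2"}'
    print(architecture_from_config(sample))  # Gemma2ForCausalLM
```

As noted in the chat, generalizing this to any hf model ends up re-implementing what huggingfaceserver already does, which is why the idea was dropped in favor of opening a task.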