[07:12:42] <wikibugs>	 (CR) Kevin Bazira: "Thank you for structuring the code to accommodate running tests for a specified model-server." [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/993078 (https://phabricator.wikimedia.org/T355394) (owner: Ilias Sarantopoulos)
[07:56:32] <isaranto>	 Good morning \o/
[08:55:27] <wikibugs>	 Machine-Learning-Team, Wikipedia-Android-App-Backlog (Android Release - FY2023-24): Migrate Machine-generated Article Descriptions from toolforge to liftwing. - https://phabricator.wikimedia.org/T343123 (kevinbazira) @Seddon, in T353127 we were able to make significant improvements in response latency. F...
[09:30:18] <klausman>	 Morning!
[09:30:39] <klausman>	 ml-serve2004 fell over on the weekend, but is now back. Currently investigating if I can find a cause
[09:35:51] <isaranto>	 hey Tobias!
[11:33:26] <wikibugs>	 Machine-Learning-Team: Debug GPU deployments on ml-staging - https://phabricator.wikimedia.org/T356038 (isarantopoulos)
[11:34:01] <wikibugs>	 Lift-Wing, Machine-Learning-Team: Debug GPU deployments on ml-staging - https://phabricator.wikimedia.org/T356038 (isarantopoulos)
[11:50:21] <aiko>	 o/
[11:52:11] <isaranto>	 hey aiko!
[12:02:02] * isaranto lunch!
[12:04:15] <klausman>	 ditto
[12:44:21] <wikibugs>	 Machine-Learning-Team: Test revertrisk-multilingual with GPU - https://phabricator.wikimedia.org/T356045 (achou)
[14:05:42] <wikibugs>	 (CR) Ilias Sarantopoulos: "Both of your comments are totally valid! However at the moment we just setup the example of how to work and these details will be finalize" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/993078 (https://phabricator.wikimedia.org/T355394) (owner: Ilias Sarantopoulos)
[14:16:15] <wikibugs>	 (PS5) Ilias Sarantopoulos: locust: save separate results file per model [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/993078 (https://phabricator.wikimedia.org/T355394)
[14:17:05] <wikibugs>	 (CR) Ilias Sarantopoulos: "I have updated the patch and set run time to 60s and modified the endpoints to be compatible with both internal and external endpoints." [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/993078 (https://phabricator.wikimedia.org/T355394) (owner: Ilias Sarantopoulos)
[14:22:29] <wikibugs>	 Machine-Learning-Team, Add-Link, Growth-Team, Chinese-Sites, CommRel-Specialists-Support (Oct-Dec-2023): Support languages whose add-a-link models were not published - https://phabricator.wikimedia.org/T309263 (AKhatun_WMF)
[14:35:21] <chrisalbon>	 Good morning all
[14:35:40] <klausman>	 heyo chris
[15:09:59] <isaranto>	 o\
[15:16:13] <isaranto>	 I hope the attempt to add a GPU to article desc works 🤞 
[15:16:30] <isaranto>	 if any of you have a moment https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/993707
[15:19:29] <klausman>	 Looking
[15:20:16] <klausman>	 LGTMd
[15:21:11] <isaranto>	 Danke! hope it works
[15:28:55] <isaranto>	 ah, it isn't much faster so I suspect that the GPU is not utilized properly
[15:29:26] <klausman>	 let me open radeontop and then you make some queries, that should answer it
[15:29:38] <klausman>	 ok, go
[15:30:38] <klausman>	 fwiw, I see only 7M of VRAM used, so it's likely not using the GPU, unless it loads the model on demand
[15:36:31] <isaranto>	 lol yes
[15:38:26] <isaranto>	 I can check from grafana as well https://grafana.wikimedia.org/d/ZAX3zaIWz/amd-rocm-gpu
[15:38:40] <isaranto>	 well, I'll need to go through the code to check
[15:38:48] <isaranto>	 thanks Tobias!
[15:46:01] <klausman>	 With your new attach powers, you could also try and start a python interpreter, see if there's any permission problem or similar
[15:51:07] <isaranto>	 aha! correct. although since we already use it in other isvcs there shouldn't be any such scenario
[15:55:32] <isaranto>	 taking your advice however I did that and it seems that torch can't find the GPU. I attached a shell, ran the Python interpreter and then `import torch;torch.cuda.is_available()` which should be True but I got False :(
[15:58:56] <isaranto>	 ok I figured it out! it is because the image is using the cpu version of pytorch
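A minimal sketch of the check described above, for running inside the attached shell. It assumes nothing about the inference-services code: `describe_torch_build` is a hypothetical helper name used only for illustration. It relies on the fact that ROCm builds of PyTorch expose AMD GPUs through the `torch.cuda` API and set `torch.version.hip`, whereas a CPU-only build leaves both `torch.version.hip` and `torch.version.cuda` unset, which would explain `is_available()` returning False.

```python
import importlib.util


def describe_torch_build():
    """Summarize the installed PyTorch build and whether it can see a GPU.

    Hypothetical helper for illustration; not part of inference-services.
    """
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment at all.
        return {"installed": False, "version": None,
                "backend": None, "gpu_available": False}

    import torch

    # ROCm builds set torch.version.hip, CUDA builds set torch.version.cuda;
    # a CPU-only build leaves both as None.
    hip = getattr(torch.version, "hip", None)
    cuda = getattr(torch.version, "cuda", None)
    backend = "rocm" if hip else "cuda" if cuda else "cpu"

    return {
        "installed": True,
        "version": torch.__version__,
        "backend": backend,
        # On ROCm builds this call also reports AMD GPUs.
        "gpu_available": bool(torch.cuda.is_available()),
    }


if __name__ == "__main__":
    print(describe_torch_build())
```

If `backend` comes back as `cpu`, the image ships a CPU-only wheel and no amount of device configuration will make the GPU visible to torch.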
[16:01:18] <wikibugs>	 (PS1) Ilias Sarantopoulos: article-descriptions: add GPu version of pytorch [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/993715
[16:01:43] <isaranto>	 this will do it --^
[16:15:51] <wikibugs>	 (CR) Klausman: [C: +1] article-descriptions: add GPu version of pytorch [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/993715 (owner: Ilias Sarantopoulos)
[16:28:15] <wikibugs>	 (CR) Ilias Sarantopoulos: [C: +2] article-descriptions: add GPu version of pytorch [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/993715 (owner: Ilias Sarantopoulos)
[16:29:03] <wikibugs>	 (Merged) jenkins-bot: article-descriptions: add GPu version of pytorch [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/993715 (owner: Ilias Sarantopoulos)
[16:30:40] <wikibugs>	 (CR) Kevin Bazira: "Ack" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/993078 (https://phabricator.wikimedia.org/T355394) (owner: Ilias Sarantopoulos)
[16:55:49] <isaranto>	 kevinbazira: I'll check the above and let you know. However it seems normal if you tried to use the internal API from your localhost
[16:56:02] <isaranto>	 I have one more patch for today https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/993729
[16:56:24] <isaranto>	 going afk for the day. will join a bit later just to test the GPU. cu folks!
[16:57:43] <kevinbazira>	 isaranto: sure sure, I've +1'd.
[16:57:54] <kevinbazira>	 Enjoy your evening. o/
[17:22:36] <klausman>	 heading out now as well, \o
[18:02:04] <chrisalbon>	 night!
[18:21:17] <wikibugs>	 (PS6) Ilias Sarantopoulos: locust: save separate results file per model [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/993078 (https://phabricator.wikimedia.org/T355394)
[18:30:55] <isaranto>	 hmm I can't use the GPU because it is being used by the previous revision of article-descriptions model. So I figured out I also need to have permissions to delete revisions in ml-staging. I'll deal with it tomorrow then. o/
[18:32:34] <chrisalbon>	 ruh roh
[18:34:21] <isaranto>	 chrisalbon: TIL about "ruh roh" 😛
[18:34:27] <isaranto>	 I bet kids still love it
[18:34:58] <chrisalbon>	 ha, true
[19:49:39] <wikibugs>	 Machine-Learning-Team, Research: Explore using revertrisk language agnostic API in a pre-save context - https://phabricator.wikimedia.org/T356102 (kostajh)
[19:57:03] <wikibugs>	 Machine-Learning-Team, Research: Explore using revertrisk language agnostic API in a pre-save context - https://phabricator.wikimedia.org/T356102 (kostajh)
[22:28:10] <wikibugs>	 Machine-Learning-Team, artificial-intelligence, Research ideas: [Epic] Paid editing (COI) detection model - https://phabricator.wikimedia.org/T120170 (Harej)