[07:19:47] o/ good morning!
[09:49:17] Machine-Learning-Team, Content-Transform-Team, Research: Add Article Quality Model to LiftWing - https://phabricator.wikimedia.org/T360455#9948695 (isarantopoulos) Great @Isaac ! I was wondering if you'd be open to use a gradient boosting regressor model (xgboost, catboost, lightgbm) so that we don'...
[10:52:54] kevinbazira: o/ thank you for the reviews :) I'll be merging the branches one by one - each of them requires a merge/rebase as they all touch the same line of code in __init__.py so I'll be re-requesting reviews
[10:54:32] isaranto: o/ sure, I noticed rebases would be required. especially if you merge them in the order 11 to 20 :)
[10:58:06] I started with this one https://github.com/wikimedia/liftwing-python/pull/20 . I'll leave the articletopic last because it requires some PR changes to add additional features (the optional fields you mentioned)
[10:58:13] thank you :D
[11:19:05] * isaranto lunch!
[12:08:52] (PS12) Rockingpenny4: Adds article topic model to ORES [extensions/ORES] - https://gerrit.wikimedia.org/r/1035044 (https://phabricator.wikimedia.org/T218132)
[12:30:00] Good morning all
[12:33:13] o/ Chris
[12:47:59] it is wildfire season already :(
[12:54:18] In early July? oh my
[12:55:08] Machine-Learning-Team, Education-Program-Dashboard: Program & Events dashboard is using the old ORES service - https://phabricator.wikimedia.org/T352934#9949390 (Aklapper) Declined→Invalid
[12:56:13] Machine-Learning-Team, Education-Program-Dashboard: Program & Events dashboard is using the old ORES service - https://phabricator.wikimedia.org/T352934#9949377 (Aklapper) Open→Declined a:calbon→None Unfortunately closing this Phabricator task as no further information has been provid...
[12:59:54] unfortunately yes...
[13:01:05] klausman: is there a dashboard I can use to check GPU VRAM usage for ml-staging?
[13:01:32] I know this one https://grafana-rw.wikimedia.org/d/ZAX3zaIWz/amd-rocm-gpu?orgId=1&var-instance=&var-source=codfw%20prometheus%2Fops and also found this https://grafana-rw.wikimedia.org/d/d10408b0-518d-47d5-a879-81884b73d7dc/klausman-ml-amd-rocm-gpu?orgId=1&var-instance=&var-source=codfw%20prometheus%2Fk8s-mlstaging
[13:01:52] iirc you mentioned that we need to change some things for the new GPU, right?
[13:02:04] sec
[13:02:42] ah, there is something funky with those dashboards not listing the ml-staging machine
[13:04:04] https://thanos.wikimedia.org/graph?g0.expr=amd_rocm_gpu_memory_used_bytes%7Bcluster%3D%22ml_staging%22%2C%20memtype%3D%22vram%22%7D&g0.tab=0&g0.stacked=0&g0.range_input=1m&g0.max_source_resolution=0s&g0.deduplicate=1&g0.partial_response=0&g0.store_matches=%5B%5D should work in a pinch
[13:06:58] The dashboard with my name in the title should now also work
[13:09:36] ack, thanks!
[13:23:56] also a heads up: at the same time that our meeting starts, the network folks will upgrade a switch, so I will drain&cordon ml-serve1005 just before that. There'll be a bit of shuffling of pods, but otherwise it should be invisible.
[13:25:14] ack
[14:07:05] (PS2) Kosta Harlan: revertrisk: Clarify to use -1 if revision doesn't exist [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1051347
[15:11:01] (CR) Ilias Sarantopoulos: [V:+2 C:+2] "LGTM!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1051347 (owner: Kosta Harlan)
[15:31:18] thanks for the review isaranto.
[15:31:19] going to deploy articlequality with the `MAX_FEATURE_VALS` env var
[15:32:57] <3
[15:35:05] predictor is running into `CrashLoopBackOff` issue
[15:36:07] any other error msg?
try kubectl describe pod
[15:37:50] I checked the pod logs they say `FileNotFoundError: [Errno 2] No such file or directory: 'data/max-values-html-dumps-ar-en-fr-hu-tr-zh.tsv'`
[15:38:27] yes found the error: https://phabricator.wikimedia.org/P65745
[15:42:39] working on fixing this path
[15:45:53] ack! it is not urgent though
[15:52:51] I mean if you wanna do it tomorrow that's fine, there is no service depending on it. otherwise I'm here if you need any help
[16:00:05] sure, I've pushed a fix here: https://gerrit.wikimedia.org/r/1051799
[16:01:11] nice, thanks! +1'ed
[16:36:15] thanks it's now up and running: https://phabricator.wikimedia.org/P65745#263380 \o/
[16:38:09] niice
[16:38:33] I just saw that the filename was wrong (values instead of vals)
[16:38:57] yep :)
[16:39:00] did you fix it on the isvc directly?
[16:39:29] I missed that in the review. sorryyy:)
[16:40:04] no worries. it's getting towards the end of my day ... I had to fix it directly
[16:40:24] cool, we can follow up tomorrow! thanks Kevin!
[16:40:27] night kevin!
[16:40:37] sure sure. thanks for the help :)
[16:40:38] have a great evening
[16:40:49] have a good evening o/
[16:41:10] just figured out why gemma-27b wasn't giving any output. it now works 🎉
[16:41:19] oh yeah?
[16:41:56] https://phabricator.wikimedia.org/P65753
[16:42:25] awesome!
[16:42:54] it had to do with the data types. model needs bfloat16 otherwise it defaults to float32 and for some reason I haven't debugged yet was returning an empty response
[16:43:35] in the example above it is still llama3 blah blah cause I was testing directly on staging. I'll write a patch to make the proper deployment
[16:43:51] this is all good news!
[16:44:25] definitely!
[16:50:01] game on!
[16:50:38] added the patch -> https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1051806 the model is already deployed so we can deal with this patch tomorrow
[16:50:48] I'm logging off as well folks, have a nice rest of day evening!
[16:56:20] night isaranto!
[19:27:08] (CR) Sohom Datta: "@isarantopoulos@wikimedia.org Is there a mechanism to provide a revision ID to this new ORES model ? Using only the article name seems lik" [extensions/ORES] - https://gerrit.wikimedia.org/r/1035044 (https://phabricator.wikimedia.org/T218132) (owner: Rockingpenny4)
[20:26:26] (PS1) Kosta Harlan: [WIP] Add AbuseFilter variable for revertrisk score [extensions/ORES] - https://gerrit.wikimedia.org/r/1051837 (https://phabricator.wikimedia.org/T364705)
[20:28:53] (CR) CI reject: [V:-1] [WIP] Add AbuseFilter variable for revertrisk score [extensions/ORES] - https://gerrit.wikimedia.org/r/1051837 (https://phabricator.wikimedia.org/T364705) (owner: Kosta Harlan)
[20:30:28] Machine-Learning-Team: Allow calling revertrisk language agnostic and revert risk multilingual APIs in a pre-save context - https://phabricator.wikimedia.org/T356102#9951640 (kostajh) @achou I tested out the language-agnostic endpoint via [WIP Add AbuseFilter variable for revertrisk score](https://gerrit.wik...
[20:43:28] (PS2) Kosta Harlan: [WIP] Add AbuseFilter variable for revertrisk score [extensions/ORES] - https://gerrit.wikimedia.org/r/1051837 (https://phabricator.wikimedia.org/T364705)
[20:45:56] (CR) CI reject: [V:-1] [WIP] Add AbuseFilter variable for revertrisk score [extensions/ORES] - https://gerrit.wikimedia.org/r/1051837 (https://phabricator.wikimedia.org/T364705) (owner: Kosta Harlan)
[20:46:30] (PS3) Kosta Harlan: [WIP] Add AbuseFilter variable for revertrisk score [extensions/ORES] - https://gerrit.wikimedia.org/r/1051837 (https://phabricator.wikimedia.org/T364705)
[20:49:02] (CR) CI reject: [V:-1] [WIP] Add AbuseFilter variable for revertrisk score [extensions/ORES] - https://gerrit.wikimedia.org/r/1051837 (https://phabricator.wikimedia.org/T364705) (owner: Kosta Harlan)
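
The Thanos link shared at 13:04:04 carries its PromQL expression percent-encoded in the `g0.expr` query parameter. A minimal sketch, using only the Python standard library, of how that expression can be recovered for reuse in another dashboard; the helper name `extract_promql` is hypothetical and not part of any Wikimedia tooling, and the URL is copied verbatim from the log.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied verbatim from the 13:04:04 message in the log.
THANOS_URL = (
    "https://thanos.wikimedia.org/graph"
    "?g0.expr=amd_rocm_gpu_memory_used_bytes%7Bcluster%3D%22ml_staging%22%2C%20memtype%3D%22vram%22%7D"
    "&g0.tab=0&g0.stacked=0&g0.range_input=1m&g0.max_source_resolution=0s"
    "&g0.deduplicate=1&g0.partial_response=0&g0.store_matches=%5B%5D"
)


def extract_promql(url: str) -> str:
    """Return the decoded PromQL expression of the first graph panel (g0).

    Hypothetical helper for illustration: parse_qs percent-decodes the
    query string, so the value comes back as plain PromQL.
    """
    params = parse_qs(urlsplit(url).query)
    return params["g0.expr"][0]


promql = extract_promql(THANOS_URL)
print(promql)
# prints: amd_rocm_gpu_memory_used_bytes{cluster="ml_staging", memtype="vram"}
```

The decoded expression selects the `amd_rocm_gpu_memory_used_bytes` metric with the label matchers `cluster="ml_staging"` and `memtype="vram"`, which is the VRAM usage asked about at 13:01:05.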