[06:30:06] hello!
[06:38:11] Good morning ☀️
[06:49:47] Machine-Learning-Team, Goal: Q4 24-25 Goal: Investigate Add-a-link model training and deployment - https://phabricator.wikimedia.org/T393474#10803273 (OKarakaya-WMF)
[06:50:25] Machine-Learning-Team, Goal: Q4 24-25 Goal: Investigate Add-a-link model training and deployment - https://phabricator.wikimedia.org/T393474#10803274 (OKarakaya-WMF) a:OKarakaya-WMF
[07:31:02] kevinbazira: o/
[07:31:19] isaranto: o/
[07:31:43] for the vllm image I meant to create an MR with the whole dockerfile so that ppl can review add comments etc.
[07:31:48] is it ok if I create it?
[07:32:33] sure sure no problem.
[07:40:02] https://gitlab.wikimedia.org/repos/machine-learning/wmf-debian-vllm/-/merge_requests/2
[07:40:50] sorry for messing around with your work, I just want to provide an easy way to add and resolve comments inline
[07:57:55] np! your MR is indeed easier to review :)
[08:03:57] if you can update the MR description that would be great, thank you!
[08:10:17] Machine-Learning-Team, Goal: Q4 24-25 Goal: Investigate Add-a-link model training and deployment - https://phabricator.wikimedia.org/T393474#10803358 (kevinbazira) We have the context below in the [[ https://wikimedia.slack.com/archives/G01A0FNPLG4/p1746010463511419 | ML team slack channel ]] but sharing...
[08:33:26] * isaranto afk for a bit
[09:11:29] * isaranto back
[09:20:18] Machine-Learning-Team, Goal: Q4 24-25 Goal: Investigate Add-a-link model training and deployment - https://phabricator.wikimedia.org/T393474#10803610 (OKarakaya-WMF) Hey, @fkaelin. We have recently started investigation about the add-a-link model. Please feel free to add more folks from your team if rel...
[10:23:37] Machine-Learning-Team, Goal: Q4 24-25 Goal: Investigate Add-a-link model training and deployment - https://phabricator.wikimedia.org/T393474#10803757 (OKarakaya-WMF) Hey, @Michael.
We have recently started investigation about the add-a-link model. Please feel free to add more folks from your team if rel...
[10:29:14] Hey, I’ve added some questions/folks to the add-a-link goal above on Phabricator. Please feel free to add more folks/questions. We will follow this up in the meetings today and Monday.
[10:30:05] Related to the teams we collaborate.
[11:16:39] ack, thanks Ozge!
[11:50:09] Machine-Learning-Team, Editing-team (Tracking): Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10804000 (gkyziridis) >>! In T393154#10798989, @isarantopoulos wrote: > Let's create a plan and test some things in order to debug this. I'm star...
[12:04:08] Machine-Learning-Team, Goal: Q4 24-25 Goal: Operational Excellence - LiftWing Platform Updates & Improvements - https://phabricator.wikimedia.org/T391943#10804044 (isarantopoulos)
[12:04:57] Machine-Learning-Team, Goal: Q4 24-25 Goal: Operational Excellence - LiftWing Platform Updates & Improvements - https://phabricator.wikimedia.org/T391943#10804049 (isarantopoulos) @elukey sorry for the late response. I have changed the description and accepted your proposal. Thanks!
[12:07:51] Machine-Learning-Team, Goal: Q4 24-25 Goal: Operational Excellence - LiftWing Platform Updates & Improvements - https://phabricator.wikimedia.org/T391943#10804071 (isarantopoulos)
[12:07:55] Machine-Learning-Team: ML Services causing log spam - https://phabricator.wikimedia.org/T393475#10804072 (isarantopoulos)
[12:07:58] Machine-Learning-Team, Patch-For-Review: Migrate all Lift Wing k8s workers to Bookworm and containerd - https://phabricator.wikimedia.org/T387854#10804073 (isarantopoulos)
[12:08:00] Machine-Learning-Team, Data-Platform-SRE, Prod-Kubernetes, serviceops, Kubernetes: Update kserve to v0.13.0 on ML clusters - https://phabricator.wikimedia.org/T380722#10804075 (isarantopoulos)
[12:08:07] Machine-Learning-Team, Data-Platform-SRE, Prod-Kubernetes, serviceops, and 2 others: Update knative-serving+net-istio to v1.12.x on ML clusters - https://phabricator.wikimedia.org/T380723#10804074 (isarantopoulos)
[12:08:13] Machine-Learning-Team, Kubernetes, Patch-For-Review: Migrate ml-staging/ml-serve clusters off of Pod Security Policies - https://phabricator.wikimedia.org/T369493#10804076 (isarantopoulos)
[12:11:13] klausman: elukey: https://phabricator.wikimedia.org/T391958#10794576 - is this good to go?
[12:48:12] kart_: LGTM, somebody from ML now needs to push this to Thanos/S3 via the automation tool (that will also publish it to analytics.wikimedia.org)
[12:48:50] then it should be a matter of configuring the user/pass as env variables for mint's deployment config
[12:49:03] and you'll be good to test it in staging etc...
[12:49:11] not sure if you already have the code ready
[12:54:16] No code yet. Can you also point me to any similar services doing the same? Since MinT isn't on 'Lift Wing' - things will be different.
[13:52:26] isaranto: klausman: o/ thanks for the reviews.
I've addressed all the comments on the wmf-debian-vllm image MR: https://gitlab.wikimedia.org/repos/machine-learning/wmf-debian-vllm/-/merge_requests/2
[13:52:26] please take a look and let me know if there are any additional suggestions or if it's ready to be approved. thanks!
[14:00:21] kart_: I am not sure if there is a good example for this use case, it is one of the first. One thing that could be done is a simple bash script that acts as entrypoint for mint, uses s3cmd or similar to pull the models from S3 and then starts the python uWSGI server. Or maybe something directly in Python that fetches from S3 run only once when starting up.
[14:17:32] elukey: https://s3tools.org/ - right?
[14:18:55] kart_: yes, it is packaged in debian already - https://packages.debian.org/bookworm/s3cmd
[14:19:03] or you could do it in python using boto for example
[14:19:16] both will read AWS_ env variables for credentials
[14:19:24] Noted.
[14:19:27] those will be exposed via private config to the pod, etc..
[14:19:34] Tobias will take care of it
[14:20:18] Thanks!
[14:20:54] I'm having some downtime (backpain!) since last two days, so I didn't followup much. I should be back on track from tomorrow.
[14:28:23] please take care! :)
[14:49:35] (CR) Nik Gkountas: [C:+1] "The code looks good and works well. The patch also drops support for space-separated topics (e.g.
`topic=business and economics`), which i" [research/recommendation-api] - https://gerrit.wikimedia.org/r/1137342 (https://phabricator.wikimedia.org/T391230) (owner: Sbisson)
[15:28:30] (PS7) Sbisson: Support for articlecountry [research/recommendation-api] - https://gerrit.wikimedia.org/r/1137342 (https://phabricator.wikimedia.org/T391230)
[15:28:39] (CR) Sbisson: "> The patch also drops support for space-separated topics" [research/recommendation-api] - https://gerrit.wikimedia.org/r/1137342 (https://phabricator.wikimedia.org/T391230) (owner: Sbisson)
[15:29:58] (CR) CI reject: [V:-1] Support for articlecountry [research/recommendation-api] - https://gerrit.wikimedia.org/r/1137342 (https://phabricator.wikimedia.org/T391230) (owner: Sbisson)
[15:44:52] (CR) Sbisson: "recheck" [research/recommendation-api] - https://gerrit.wikimedia.org/r/1137342 (https://phabricator.wikimedia.org/T391230) (owner: Sbisson)
[15:52:38] (PS1) Sbisson: Popular/search recommander: use domain code in lllang parameter [research/recommendation-api] - https://gerrit.wikimedia.org/r/1143605 (https://phabricator.wikimedia.org/T306508)
[15:53:21] (CR) CI reject: [V:-1] Popular/search recommander: use domain code in lllang parameter [research/recommendation-api] - https://gerrit.wikimedia.org/r/1143605 (https://phabricator.wikimedia.org/T306508) (owner: Sbisson)
[15:55:05] hey folks, as FYI we are testing the perfs of https://slo.wikimedia.org/?search=revertrisk
[15:56:00] (PS2) Sbisson: Popular/search recommander: use domain code in lllang parameter [research/recommendation-api] - https://gerrit.wikimedia.org/r/1143605 (https://phabricator.wikimedia.org/T306508)
[16:30:39] ack, thanks Luca, this looks really nice!
[17:12:41] Machine-Learning-Team, MediaWiki-extensions-ORES, DBA, MediaWiki-Recent-changes, Moderator-Tools-Team: DBA Review of Tables that ORES Extension will create - https://phabricator.wikimedia.org/T391103#10805366 (jsn.sherman)
[18:42:43] Machine-Learning-Team, MediaWiki-extensions-ORES, DBA, MediaWiki-Recent-changes, and 2 others: DBA Review of Tables that ORES Extension will create - https://phabricator.wikimedia.org/T391103#10805630 (jsn.sherman) >>! In T391103#10713983, @Ladsgroup wrote: > On the side of table catalog. I sugge...
[19:42:46] Machine-Learning-Team, LDAP-Access-Requests, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users & Kerberos identity & deployment POSIX group & ml-team-admins for Bartosz Wójtowicz - https://phabricator.wikimedia.org/T393595#10805795 (Eevans)
[19:45:08] Machine-Learning-Team, LDAP-Access-Requests, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users & Kerberos identity & deployment POSIX group & ml-team-admins for Bartosz Wójtowicz - https://phabricator.wikimedia.org/T393595#10805798 (Eevans) @BWojtowicz-WMF can I get you t...
[19:50:05] Machine-Learning-Team, LDAP-Access-Requests, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users & Kerberos identity & deployment POSIX group & ml-team-admins for Bartosz Wójtowicz - https://phabricator.wikimedia.org/T393595#10805808 (Eevans) @thcipriani Ok to add to deploy...
[19:59:38] Machine-Learning-Team, LDAP-Access-Requests, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users & Kerberos identity & deployment POSIX group & ml-team-admins for Bartosz Wójtowicz - https://phabricator.wikimedia.org/T393595#10805870 (thcipriani) >>! In T393595#10805807, @E...
[20:24:35] Machine-Learning-Team, LDAP-Access-Requests, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users & Kerberos identity & deployment POSIX group & ml-team-admins for Bartosz Wójtowicz - https://phabricator.wikimedia.org/T393595#10805914 (Eevans)
[20:24:45] Machine-Learning-Team, LDAP-Access-Requests, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users & Kerberos identity & deployment POSIX group & ml-team-admins for Bartosz Wójtowicz - https://phabricator.wikimedia.org/T393595#10805915 (Eevans) Open→In progress
[20:24:58] Machine-Learning-Team, LDAP-Access-Requests, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users & Kerberos identity & deployment POSIX group & ml-team-admins for Bartosz Wójtowicz - https://phabricator.wikimedia.org/T393595#10805916 (Eevans) p:Triage→Medium