[08:47:43] (CR) AikoChou: revert-risk: fix session host for the wikidata model (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/919153 (https://phabricator.wikimedia.org/T333125) (owner: AikoChou)
[08:57:49] * elukey errand for a bit
[09:25:38] Machine-Learning-Team, Data-Engineering-Planning, Event-Platform Value Stream: Add a new outlink topic stream for EventGate main - https://phabricator.wikimedia.org/T328899 (achou) @pfischer As far as I know, the Outlink topic model does not use redirect information to predict the topic of an article...
[09:25:54] (CR) Ilias Sarantopoulos: [C: +1] "LGTM! Just added a nit, feel free to do whatever you want" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/919153 (https://phabricator.wikimedia.org/T333125) (owner: AikoChou)
[09:36:46] (PS3) AikoChou: revert-risk: fix session host for the wikidata model [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/919153 (https://phabricator.wikimedia.org/T333125)
[09:38:59] (CR) AikoChou: [C: +2] "Thanks for the review :)" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/919153 (https://phabricator.wikimedia.org/T333125) (owner: AikoChou)
[09:44:44] (Merged) jenkins-bot: revert-risk: fix session host for the wikidata model [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/919153 (https://phabricator.wikimedia.org/T333125) (owner: AikoChou)
[10:46:53] (PS1) Ilias Sarantopoulos: LLM: model server example with bloom [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/919293 (https://phabricator.wikimedia.org/T333861)
[10:48:42] I added the patch for the LLM/bloom model server.
Nothing to do for now just uploaded it to keep it as a reference
[10:56:04] nice :)
[10:59:22] when I find some time I'll jump back to this as the issue is that on M1 host with CPU request takes 5 seconds. while dockerized it takes 3.5 minutes and I'm definitely missing something
[10:59:41] back to working on ores extension for now!
[11:46:26] * isaranto lunch
[13:32:40] I figured out the issue with the 3.5 minutes response from dockerized model server: the docker I was running it from was not running natively on aarch64. Solved by adding arm64v8/python:3.9-bullseye as a base image and response time is down to 5s
[14:11:25] ah interesting!!
[14:49:47] I will be adding a deployment in the experimental namespace
[14:50:34] if that is ok with everyone. in terms of resources it must be pretty close to the revertrisk model in the same namespace so I'll use the same
[14:55:40] (CR) Ilias Sarantopoulos: "Will wait for the CI pipelines to be merged first" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/919293 (https://phabricator.wikimedia.org/T333861) (owner: Ilias Sarantopoulos)
[14:59:27] (PS4) Ilias Sarantopoulos: LLM: model server example with bloom [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/919293 (https://phabricator.wikimedia.org/T333861)
[14:59:49] isaranto: is bloom fully open?
[15:02:24] my only reservation for staging/experimental would be the model's license
[15:04:36] define fully :)
[15:05:31] license wise :) do we have restrictions in uploading it in our infra?
[15:05:37] this is the license https://huggingface.co/spaces/bigscience/license
[15:05:37] it has some use restrictions
[15:05:40] aa no
[15:06:00] mmmmm
[15:06:18] this is another description of the license https://bigscience.huggingface.co/blog/the-bigscience-rail-license
[15:06:18] and quoting from the FAQ : ```Is this an open source license?
This is not an open source license according to the Open Source Initiative definition, because it has some restrictions on the use of the model. That said, it does not impose any restrictions on reuse, distribution, commercialization, adaptation as long as the model is not being applied towards use-cases that have been restricted.```
[15:07:28] we'd need to follow up with somebody on this, maybe Moritz
[15:07:30] to be sure
[15:08:35] the use restrictions are in Attachment A on the license and mostly refer to unethical usage of the model/API (e.g. using it to impersonate others, discriminate, harass etc)
[15:10:08] cool cool. there is no hurry in this. wish I had worked on this earlier to have it for the hackathon :)
[15:10:38] lemme do some follow up and research on this
[15:10:41] before merging
[15:10:57] if we find any issue with this though we can focus our efforts to open source models like the one Isaac suggested - https://huggingface.co/bigscience/mt0-base
[15:10:57] which comes with an Apache2.0 license
[15:13:34] (PS22) Ilias Sarantopoulos: feat: use Lift Wing instead of ORES [extensions/ORES] - https://gerrit.wikimedia.org/r/910439 (https://phabricator.wikimedia.org/T319170)
[15:14:44] I switched the ORes extension to use the external endpoint so we can test it
[15:15:45] will w8 until Monday to see if we can test it through patchdemo, otherwise I will suggest to merge it and test it on the beta cluster
[15:18:34] \o/
[15:18:40] great work
[15:54:10] isaranto: sorry just read the bottom of https://huggingface.co/spaces/bigscience/license, the restrictions are very general sometimes and they smell a bit weird. I get their intent but it may not be compatible with our policies
[15:54:28] I'll ping Moritz in the task to kick off the conversation
[15:54:47] Cool cool cool!
[15:54:57] Apache 2.0 solves everything
[15:55:33] yes definitely, but is mt0-base similar to bloom? I mean from performance / quality / etc...
[15:56:12] Machine-Learning-Team, Data-Engineering, Research, Event-Platform Value Stream (Sprint 12): Design event schema for ML scores/recommendations on current page state - https://phabricator.wikimedia.org/T331401 (JArguello-WMF) @achou is it ok if I close this ticket?
[16:00:04] ok done!
[16:00:38] Machine-Learning-Team, Patch-For-Review: Host open source LLM (bloom, etc.) on Lift Wing - https://phabricator.wikimedia.org/T333861 (elukey) @MoritzMuehlenhoff Hi! We are trying to host a LLM model on our infrastructure, and one of the candidates is [[ https://huggingface.co/docs/transformers/model_doc/...
[16:26:42] heading out for the weekend folks!
[16:26:49] have a nice end of the week o/
[16:35:02] Lift-Wing, Machine-Learning-Team: Deploy Revert-risk wikidata model to ml-staging - https://phabricator.wikimedia.org/T333125 (achou) The model has been uploaded: ` aikochou@stat1005:~$ s3cmd -c /etc/s3cmd/cfg.d/ml-team.cfg ls s3://wmf-ml-models/experimental/revertrisk-wikidata/20230512162400/ 2023-05-1...
[16:57:43] performance and quality is not an issue at this point, we are deploying "smallish" models and not the big versions just to figure out any issues we may face
[16:58:40] Lift-Wing, Machine-Learning-Team: Move Revert-risk language agnostic model from staging to production - https://phabricator.wikimedia.org/T332998 (achou) a:achou
[16:59:34] logging off as well! cu folks o/
[16:59:39] <3
[17:02:17] bye Ilias :) have a nice weekend!
[17:27:56] Machine-Learning-Team, Data-Engineering, Research, Event-Platform Value Stream (Sprint 12): Design event schema for ML scores/recommendations on current page state - https://phabricator.wikimedia.org/T331401 (achou) @JArguello-WMF Yep, it's finished. Thank you!
:)
[17:28:59] Machine-Learning-Team, Data-Engineering, Research, Event-Platform Value Stream (Sprint 12): Design event schema for ML scores/recommendations on current page state - https://phabricator.wikimedia.org/T331401 (JArguello-WMF) Open→Resolved
[17:29:02] Machine-Learning-Team, Platform Team Workboards (Platform Engineering Reliability): Implement new mediawiki.revision-score streams with Lift Wing - https://phabricator.wikimedia.org/T328576 (JArguello-WMF)
[17:29:05] Machine-Learning-Team, Data-Engineering, Event-Platform Value Stream, Research: Proposal: Create a stream end point for Revision Risk Model - https://phabricator.wikimedia.org/T326179 (JArguello-WMF)
[17:49:26] Lift-Wing, Machine-Learning-Team: Move Revert-risk multilingual model from staging to production - https://phabricator.wikimedia.org/T333124 (achou) The model has been moved to a new bucket: ` aikochou@stat1005:~$ s3cmd -c /etc/s3cmd/cfg.d/ml-team.cfg ls s3://wmf-ml-models/revertrisk/multilingual/202303...
[17:50:24] Lift-Wing, Machine-Learning-Team: Move Revert-risk language agnostic model from staging to production - https://phabricator.wikimedia.org/T332998 (achou) The model has been moved to a new bucket: ` aikochou@stat1005:~$ s3cmd -c /etc/s3cmd/cfg.d/ml-team.cfg ls s3://wmf-ml-models/revertrisk/language-agnos...
[19:42:33] Machine-Learning-Team, DBA, Data-Platform-SRE, Infrastructure-Foundations, and 9 others: codfw row D switches upgrade - https://phabricator.wikimedia.org/T335042 (colewhite)