[07:00:12] Morning folks!
[07:30:57] morning Ilias o/
[07:42:54] (PS1) Kevin Bazira: Update pageviews external endpoint uri used on LiftWing [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607)
[08:16:02] morning!
[08:16:41] I was helping the telco technician to get FTTH to my house but there were problems in the pipes, sigh
[08:16:45] third time that they try
[08:23:49] (CR) Elukey: Update pageviews external endpoint uri used on LiftWing (1 comment) [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607) (owner: Kevin Bazira)
[08:27:16] sounds like sth that could easily happen in athens too :)
[08:32:26] Machine-Learning-Team: Upgrade Revert Risk Multilingual docker images to KServe 0.11 - https://phabricator.wikimedia.org/T347551 (isarantopoulos) Actually there are differences compared to old load tests even when I run the tests for 10s like the ones @achou ran in the link above. ` wrk -c 1 -t 1 --timeout 5...
[08:32:42] Machine-Learning-Team, Patch-For-Review: Configure envoy settings to enable rec-api-ng container to access endpoints external to k8s/LiftWing - https://phabricator.wikimedia.org/T348607 (kevinbazira) As reported in T347475#9269007, the pageviews endpoint is failing. I reproduced this issue on staging and...
[08:34:52] (CR) Kevin Bazira: Update pageviews external endpoint uri used on LiftWing (1 comment) [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607) (owner: Kevin Bazira)
[08:37:05] (CR) Elukey: Update pageviews external endpoint uri used on LiftWing (1 comment) [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607) (owner: Kevin Bazira)
[08:38:51] (CR) Elukey: Update pageviews external endpoint uri used on LiftWing (1 comment) [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607) (owner: Kevin Bazira)
[08:39:22] (CR) Elukey: Update pageviews external endpoint uri used on LiftWing (1 comment) [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607) (owner: Kevin Bazira)
[08:39:33] kevinbazira: o/ lemme know if it makes sense --^
[08:46:03] (CR) Kevin Bazira: Update pageviews external endpoint uri used on LiftWing (1 comment) [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607) (owner: Kevin Bazira)
[08:46:25] elukey: yes, it does make sense. thanks!
[08:47:48] (CR) Elukey: Update pageviews external endpoint uri used on LiftWing (1 comment) [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607) (owner: Kevin Bazira)
[08:48:16] (PS2) Kevin Bazira: Update pageviews external endpoint uri used on LiftWing [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607)
[08:50:29] (CR) Elukey: [C: +1] Update pageviews external endpoint uri used on LiftWing [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607) (owner: Kevin Bazira)
[08:51:54] (CR) Kevin Bazira: [C: +2] "Thanks for the reviews :)" [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607) (owner: Kevin Bazira)
[08:53:15] (Merged) jenkins-bot: Update pageviews external endpoint uri used on LiftWing [research/recommendation-api] - https://gerrit.wikimedia.org/r/967916 (https://phabricator.wikimedia.org/T348607) (owner: Kevin Bazira)
[09:42:33] (CR) Ilias Sarantopoulos: "This change is ready for review." [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/967455 (https://phabricator.wikimedia.org/T349371) (owner: Ilias Sarantopoulos)
[09:43:23] Finally managed to test llm image with nllb. Docker builds were giving me a hard time, since these images are big. You can now run it locally!
[09:45:00] wow nice!
[09:50:22] Machine-Learning-Team, Patch-For-Review: Refactor LLM class and model server to run locally - https://phabricator.wikimedia.org/T349371 (isarantopoulos) a:isarantopoulos
[09:52:57] Machine-Learning-Team, Patch-For-Review: Refactor LLM class and model server to run locally - https://phabricator.wikimedia.org/T349371 (isarantopoulos) I've refactored the llm model server and now one can run it locally as well (without using docker).
In case someone wants to build the docker image loca...
[10:02:25] Machine-Learning-Team, Patch-For-Review: Configure envoy settings to enable rec-api-ng container to access endpoints external to k8s/LiftWing - https://phabricator.wikimedia.org/T348607 (kevinbazira) Since we couldn't experiment and check the correct pageviews uri from k8s, an SRE had to log into one of...
[10:14:48] Machine-Learning-Team: Update to KServe 0.11 - https://phabricator.wikimedia.org/T337213 (klausman)
[10:24:29] kevinbazira: so all working now?
[10:24:48] Machine-Learning-Team: Investigate recommendation-api-ng internal endpoint failure - https://phabricator.wikimedia.org/T347475 (kevinbazira) Thank you for sharing the notes, @isaac. We were able to reproduce this issue in T348607#9275075 and fixed it in T348607#9275348. The rec-api-ng now returns results wh...
[10:26:24] elukey: yes, it is. I deployed it on staging, tested that it works, and then deployed it to prod and have updated Isaac in: https://phabricator.wikimedia.org/T347475#9275384
[10:32:00] nice!
[10:32:21] * isaranto lunch!
[10:33:26] * elukey lunch!
[11:33:57] Machine-Learning-Team, Patch-For-Review: Upgrade Revert Risk Language-agnostic docker images to KServe 0.11 - https://phabricator.wikimedia.org/T347550 (isarantopoulos) cgroup v2 support is here in the latest xgboost patch release https://github.com/dmlc/xgboost/releases/tag/v2.0.1 !
[13:04:27] Hello
[13:05:00] o/
[13:06:50] \o
[13:12:42] I was in the office all day yesterday for the first time in forever
[13:14:12] what is an "office"? 😛
[13:27:15] chrisalbon: o/
[13:27:26] in https://meta.wikimedia.org/wiki/Anti-Disinformation_Repository I see some mentions of ORES but not Lift Wing :(
[13:27:59] I’ll talk to them
[13:45:12] wow! https://diff.wikimedia.org/2023/10/24/open-language-identification-api-for-200-languages/
[13:46:27] stuff is happening!
[13:46:40] folks I need to run an errand (pick up my parents from the doctor, regular visit, nothing heavy) but I may be a little late to the team meeting
[13:54:40] Lift-Wing, Machine-Learning-Team, I18n, NewFunctionality-Worktype, Patch-For-Review: Create a language detection service in LiftWing - https://phabricator.wikimedia.org/T340507 (isarantopoulos) I ran some final load testing on localhost to test the limits of the current implementation. ` wrk...
[13:54:52] cool, no worries!
[14:09:38] Machine-Learning-Team, Goal: Goal: Increase the number of models hosted on Lift Wing - https://phabricator.wikimedia.org/T348156 (calbon) Update: Language ID model deployed. Diff blog posted. Recommendation API work continues.
[14:11:09] elukey: we might have to update the pageviews envoy settings, folks from #wikimedia-analytics reached out as shown in the screenshot below: https://usercontent.irccloud-cdn.com/file/L1EdHESz/Screenshot%20from%202023-10-24%2018-06-29.png
[14:12:32] Machine-Learning-Team, Goal: Goal: Decide on an optional Lift Wing caching strategy for model servers - https://phabricator.wikimedia.org/T348155 (calbon) Doc created. Two major options.
[14:13:36] kevinbazira: answered in the chan.. we bypass restbase, so it should be ok. Let's see what they prefer.
[14:13:52] thanks!
[14:32:03] Machine-Learning-Team, Patch-For-Review: [CI] Update pre-commit versions in inf-services repo - https://phabricator.wikimedia.org/T349382 (calbon) a:isarantopoulos
[14:34:24] Lift-Wing, Machine-Learning-Team: Discuss caching strategies for Lift Wing - https://phabricator.wikimedia.org/T349180 (calbon) a:klausman
[14:38:08] Machine-Learning-Team, MediaWiki-extensions-ORES, Documentation: Update docs for ORES Extension - https://phabricator.wikimedia.org/T346761 (calbon) a:calbon
[14:41:17] Machine-Learning-Team, Wikipedia-Android-App-Backlog (Android Release - FY2023-24): Migrate Machine-generated Article Descriptions from toolforge to liftwing. - https://phabricator.wikimedia.org/T343123 (calbon) a:calbon→kevinbazira
[14:45:53] Machine-Learning-Team, Wikilabels: Update wikilabel's dependencies - https://phabricator.wikimedia.org/T325367 (calbon) a:calbon
[14:54:33] Machine-Learning-Team: Add deprecation warnings to ORES-related repositories on Github - https://phabricator.wikimedia.org/T349632 (klausman)
[14:58:39] kevinbazira: ok so for rec-api-ng and pageviews:
[14:58:54] 1) We should add the rest-gateway listener and remove the AQS one (deployment-charts)
[14:59:21] 2) Change the Liftwing .ini file with the previous URI scheme (/v1/metrics/etc..) using the new port
[15:00:47] yep yep, I plan to pick this up tomorrow.
[15:00:53] I'm going afk as of now.
[15:00:54] have a good evening.
[15:01:00] you too
[15:08:37] good evening kevin!
[15:16:34] Machine-Learning-Team, ORES, Beta-Cluster-Infrastructure, PageTriage: Special:NewPagesFeed broken on beta cluster testwiki - https://phabricator.wikimedia.org/T349635 (Novem_Linguae)
[15:31:20] I'll investigate the above tomorrow --^
[15:39:37] Machine-Learning-Team, ORES, Beta-Cluster-Infrastructure, PageTriage: Special:NewPagesFeed broken on beta cluster testwiki - https://phabricator.wikimedia.org/T349635 (Soda) Based on a look at `/wmf-config/InitialiseSettings-labs.php:1453-1473` `lang=php 'wgOresModels' => [ 'default' => [...
[15:53:30] Machine-Learning-Team, ORES, Beta-Cluster-Infrastructure, PageTriage: Special:NewPagesFeed broken on beta cluster testwiki - https://phabricator.wikimedia.org/T349635 (TheresNoTime) hate to be //that person//, but is there a reason why y'all can't just use https://en.wikipedia.beta.wmflabs.org/wi...
[15:57:11] Machine-Learning-Team, ORES, Beta-Cluster-Infrastructure, PageTriage: Special:NewPagesFeed broken on beta cluster testwiki - https://phabricator.wikimedia.org/T349635 (jsn.sherman) >>! In T349635#9276952, @TheresNoTime wrote: > hate to be //that person//, but is there a reason why y'all can't jus...
[16:43:52] going afk folks, have a nice rest of the day!
[16:57:44] Machine-Learning-Team, ORES, Beta-Cluster-Infrastructure, PageTriage: Special:NewPagesFeed broken on beta cluster testwiki - https://phabricator.wikimedia.org/T349635 (jsn.sherman) >>! In T349635#9276879, @Soda wrote: > Based on a look at `/wmf-config/InitialiseSettings-labs.php:1453-1473` > `lan...
[20:35:57] Machine-Learning-Team, ORES: Add deprecation warnings to ORES-related repositories on Github - https://phabricator.wikimedia.org/T349632 (Aklapper)
[21:23:12] Machine-Learning-Team, ORES, Beta-Cluster-Infrastructure, PageTriage: Special:NewPagesFeed broken on beta cluster testwiki - https://phabricator.wikimedia.org/T349635 (Soda) >>!
In T349635#9277187, @jsn.sherman wrote: >>>! In T349635#9276879, @Soda wrote: >> Based on a look at `/wmf-config/Initia...