[07:39:15] (PS1) AikoChou: outlink: allow accessing MediaWiki API through internal endpoint [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818052 (https://phabricator.wikimedia.org/T311043)
[08:04:52] (CR) Elukey: outlink: allow accessing MediaWiki API through internal endpoint (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818052 (https://phabricator.wikimedia.org/T311043) (owner: AikoChou)
[08:05:40] good morning folks :)
[08:17:13] (CR) Elukey: outlink: allow accessing MediaWiki API through internal endpoint (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818052 (https://phabricator.wikimedia.org/T311043) (owner: AikoChou)
[08:24:02] good morning o/
[08:24:33] thanks for the merge elukey, I am going to deploy to prod codfw.
[08:25:54] super
[08:28:55] codfw prod deployments have been completed successfully.
[08:28:55] checking pods now ...
[08:31:45] all new pods are up and running.
[08:31:45] NAME                                                              READY   STATUS    RESTARTS   AGE
[08:31:45] kowiki-articletopic-predictor-default-9fkj8-deployment-547q9l6c   3/3     Running   0          2m25s
[08:31:45] srwiki-articletopic-predictor-default-nfsrt-deployment-57d6kpnf   3/3     Running   0          2m23s
[08:31:45] ukwiki-articletopic-predictor-default-hdptv-deployment-c846v8qx   3/3     Running   0          2m22s
[08:32:56] very nice
[08:33:10] I think that we can safely complete kserve's rollout
[08:33:44] (CR) AikoChou: outlink: allow accessing MediaWiki API through internal endpoint (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818052 (https://phabricator.wikimedia.org/T311043) (owner: AikoChou)
[08:36:29] (CR) Elukey: outlink: allow accessing MediaWiki API through internal endpoint (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818052 (https://phabricator.wikimedia.org/T311043) (owner: AikoChou)
[08:44:01] (CR) AikoChou: outlink: allow accessing MediaWiki API through internal endpoint (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818052 (https://phabricator.wikimedia.org/T311043) (owner: AikoChou)
[08:53:09] (CR) Elukey: [C: +1] outlink: allow accessing MediaWiki API through internal endpoint [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818052 (https://phabricator.wikimedia.org/T311043) (owner: AikoChou)
[08:55:10] (PS2) AikoChou: outlink: allow accessing MediaWiki API through internal endpoint [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818052 (https://phabricator.wikimedia.org/T311043)
[08:55:56] (CR) AikoChou: outlink: allow accessing MediaWiki API through internal endpoint (2 comments) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818052 (https://phabricator.wikimedia.org/T311043) (owner: AikoChou)
[08:57:41] elukey: o/ thanks for reviewing the change. I updated the patch :)
[08:59:16] aiko: o/ good to merge!
[09:05:56] Machine-Learning-Team, ORES, MediaWiki-Core-Preferences, Moderator-Tools-Team: 'Highlight likely problem edits' preference doesn't work in mobile web - https://phabricator.wikimedia.org/T314026 (Samwalton9)
[09:26:06] (PS1) Elukey: [WIP] editquality - add MWAPICache to preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915)
[09:26:16] aiko: --^ first draft of what we discussed
[09:26:53] ah no ok it is clearly wrong, "s" needs to be AsyncSession
[09:27:01] I'll fix :)
[09:28:55] (PS2) Elukey: [WIP] editquality - add MWAPICache to preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915)
[09:31:55] --
[09:32:06] moving the first docker images of eqiad to kserve 0.8
[09:33:32] (CR) CI reject: [V: -1] [WIP] editquality - add MWAPICache to preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915) (owner: Elukey)
[09:43:48] (CR) AikoChou: [C: +1] outlink: allow accessing MediaWiki API through internal endpoint [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818052 (https://phabricator.wikimedia.org/T311043) (owner: AikoChou)
[09:50:09] The conflict is caused by: The user requested mwapi==0.6.1 revscoring 2.11.5 depends on mwapi<0.5.999 and >=0.5.0
[09:50:14] aiko: --^
[09:50:18] * elukey cries in a corner
[09:51:10] oh nooooo
[09:53:41] https://github.com/wikimedia/revscoring/pull/525
[09:53:43] :)
[09:53:49] I'll do another release this afternoon
[09:53:51] that dependency hell demon just keeps returning :-/
[09:54:04] Also, morning!
[09:54:19] morning!
[09:54:39] Approved that GH change for 2.11.6 just now
[09:54:57] thanks!
[09:55:14] elukey: can you give +1 again to the outlink patch?
[09:56:18] aiko: yes yes feel free to merge!
[09:56:36] the +1 was "go ahead when the comment is added etc.." :)
[09:57:23] elukey: I can only give +1, so I can't merge it.
[09:59:41] aiko: ah no this is not good
[09:59:51] (CR) Elukey: [C: +2] outlink: allow accessing MediaWiki API through internal endpoint [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818052 (https://phabricator.wikimedia.org/T311043) (owner: AikoChou)
[10:00:09] aiko: merged, I think you are missing some access perms
[10:00:18] me or Tobias will sort it out :)
[10:01:20] nice! thanks :)
[10:01:40] going afk for errands + lunch!
[10:01:42] ttl
[10:03:11] Ah, aiko is not a member of machine-learning-wmf in gerrit. Will fix (if I can)
[10:03:24] And done.
[10:26:41] klausman: thank you Tobias :) I can give +2 in inference-services repo now!
[10:26:48] Excellent.
[10:32:19] Lift-Wing, Machine-Learning-Team (Active Tasks): Upload outlink topic model to storage - https://phabricator.wikimedia.org/T313887 (achou)
[10:32:21] Lift-Wing, Machine-Learning-Team (Active Tasks): Deploy Outlinks topic model to production - https://phabricator.wikimedia.org/T287056 (achou)
[10:33:10] Lift-Wing, Machine-Learning-Team (Active Tasks): Deploy Outlinks topic model to production - https://phabricator.wikimedia.org/T287056 (achou)
[10:33:12] Lift-Wing, Machine-Learning-Team (Active Tasks), Patch-For-Review: Create outlink topic model inference service - https://phabricator.wikimedia.org/T313888 (achou)
[10:34:23] Lift-Wing, Machine-Learning-Team (Active Tasks): Add a new helmfile config for outlinks topic model - https://phabricator.wikimedia.org/T307895 (achou)
[10:35:16] Lift-Wing, Machine-Learning-Team (Active Tasks), Patch-For-Review: Create outlink topic model inference service - https://phabricator.wikimedia.org/T313888 (achou)
[10:38:05] aiko: I only +1'd the isvc change (818076) for now, since I don't know whether you want Luca, Chris or Kevin to also have a look at them. If my +1 is enough, I can +2 it and have it automerged
[10:39:41] klausman: yes please
[10:39:48] ok, sec
[10:40:13] Now waiting for Jenkins to do its thing
[10:44:03] And merged.
[10:47:03] thanks :) I'm gonna deploy it again to staging (fingers crossed)
[11:01:53] nice, it works \o/
[11:06:20] Excellent!
[11:06:28] Then I shall go seek lunch and some groceries :D
[12:20:38] nice aiko :)
[12:28:20] https://pypi.org/project/revscoring/2.11.6/
[12:28:23] new revscoring published :)
[12:41:17] I wonder how some revscoring users see these recently-frequent updates :)
[12:43:42] I doubt anybody notices :)
[12:44:14] Ever the optimist :)
[12:47:03] we are also adding features and fixing things, so probably people are happy
[12:56:46] (PS3) Elukey: [WIP] editquality - add MWAPICache to preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915)
[12:58:21] aiko: I am checking perms for inference-services, and people in 'machine-learning-wmf' can merge
[12:58:24] https://gerrit.wikimedia.org/r/admin/groups/cddbf2315647ba438f5741826fffaeedfdcdfe8a,members
[12:58:34] you are in there, so you can definitely +2 in theory
[12:58:40] what is the error that you see?
[12:58:46] Yes, because I added her earlier :)
[12:59:21] I fixed it just after you went to lunch
[13:00:23] ahhh snap I lost that part in the backscroll, super
[13:01:13] klausman: can you add it to https://wikitech.wikimedia.org/wiki/Machine_Learning/Onboarding ?
[13:01:16] so we don't forget
[13:01:24] ack
[13:05:31] Kept it simple since it's likely going to change as things evolve and/or move to Gitlab
[13:06:31] super
[13:09:22] elukey: thanks for the merge.
I'm going to deploy outlink model to codfw
[13:09:37] aiko: doing it now, I am upgrading eqiad :)
[13:10:10] aiko: you can check the eqiad endpoint, all pods are up
[13:10:18] codfw in progress
[13:11:37] great!!
[13:13:20] pods are up in codfw as well :)
[13:14:08] I am completing the upgrade to kserve 0.8 in eqiad
[13:14:55] done! Kserve upgraded to 0.8 :)
[13:32:50] That was a lot more painless than I had feared.
[13:32:59] Must be Luca's magic touch <3
[13:33:42] (PS4) Elukey: editquality - add MWAPICache to preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915)
[13:34:31] (CR) Elukey: "Tested locally with Docker, all good. I've set debug output to force revscoring to emit urllib calls, and there were none (hence the api e" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915) (owner: Elukey)
[13:35:56] (PS5) Elukey: editquality - add MWAPICache to preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915)
[13:38:12] ok ready for a review --^ :)
[13:40:27] (taking a break)
[13:46:22] taking a look :)
[13:52:33] (CR) Klausman: editquality - add MWAPICache to preprocess (2 comments) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915) (owner: Elukey)
[14:06:12] Morning all!
[14:06:45] I am always sad I miss all the chatting that happens, then I realize it started at 0:12 my time
[14:08:46] Maybe we could code an IRC bot that records everything and then when you join, we can have it replay things in realtime. :) Wouldn't be very interactive, tho :)
[14:16:15] Ha, I could just reply but then nobody would read it
[14:18:02] IRC also doesn't really have threading or notification for thread replies, Slack works better that way
[14:36:06] (PS6) Elukey: editquality - add MWAPICache to preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915)
[14:36:09] (CR) Elukey: editquality - add MWAPICache to preprocess (2 comments) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915) (owner: Elukey)
[14:39:56] (PS7) Elukey: editquality - add MWAPICache to preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915)
[14:43:54] kserve 0.9 is already out https://github.com/kserve/kserve/releases/tag/v0.9.0
[14:44:13] the new release was cut during these days
[14:44:19] we can wait some weeks and upgrade as well
[14:44:26] (0.10 seems to be already in the making)
[14:49:03] aiko: qq - from your tests with Ray workers, were you able to run predict() only to a separate worker, or was it always the whole model class?
[14:49:39] Lift-Wing, Machine-Learning-Team (Active Tasks), Patch-For-Review: Move revscoring isvcs to async architecture - https://phabricator.wikimedia.org/T313915 (elukey) a:elukey
[14:55:25] (CR) CI reject: [V: -1] editquality - add MWAPICache to preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915) (owner: Elukey)
[14:57:02] (CR) AikoChou: "Looks good. There is only a small mistake :)" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915) (owner: Elukey)
[15:03:31] elukey: I think it is always for the whole model class.
[15:19:59] super random question
[15:20:41] If a team brought us a model that worked with Lift Wing, how long would it take to deploy it?
[15:20:53] Assuming we worked on it right away and there weren't any problems
[15:31:28] (PS8) Elukey: editquality - add MWAPICache to preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915)
[15:31:44] (CR) Elukey: editquality - add MWAPICache to preprocess (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/818067 (https://phabricator.wikimedia.org/T313915) (owner: Elukey)
[15:32:39] chrisalbon: the "k8s paperwork" is not big, a couple of hours probably (assuming that code doesn't need to be changed etc..)
[15:33:01] Awesome thanks!
[15:33:04] there is also the import in our inference-service repository, CI automation to build the docker images etc..
[15:33:11] so it may take a little more, but not too much
[15:33:53] how long do you think to tweak a model
[15:34:00] like, it's already in production
[15:34:08] but the person changed the model slightly
[15:34:14] and wants us to redeploy
[15:34:45] it should be basically a docker image update, so a deployment and that's it
[15:35:08] so like 5 minutes
[15:35:50] no no a little more, with the docker image update etc.. I'd say one hour
[15:36:14] okay thanks
[15:37:31] np :)
[15:41:53] going afk folks, tomorrow I'll be off so have a good weekend all :)
[19:22:17] Machine-Learning-Team, SRE, ops-codfw: codfw: ml-serve2001 memmory issue DIMM A2 - https://phabricator.wikimedia.org/T313822 (Papaul) Open→Resolved a:Papaul