[07:01:06] o/
[08:56:45] I was checking the eswiki damaging alert. Same thing...
[08:57:41] I'm thinking of taking Luca's suggestion and increasing min replicas to 2 for the services that have issues, and discussing how to tackle the issue as well
[08:57:56] * isaranto early lunch and errand
[09:26:48] Morning!
[09:44:45] * klausman lunch as well
[09:52:32] finally have a version for handling redirects in mwapi.. https://github.com/AikoChou/python-mwapi/commit/cdf6eabc99c2e2d136ef54514f65ec7353e95a35
[09:52:53] I added **request_params in case we want to pass other params to the request in the future
[10:02:46] thinking of testing it in revscoring models first because they use session.get directly. the revertrisk model makes requests via knowledge integrity..
[10:24:27] https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1038736
[10:24:33] ---^ if anyone has time
[10:39:29] o/ aiko! I'll review and test it!
[10:41:04] isaranto: +1 on increasing min replicas to 2 for now. the errors have been happening quite often recently.. and we could discuss the issue in today's meeting
[10:47:04] isaranto: ohh regarding the redirect patch, I need to test it first and will open an MR afterwards and ask for your review :)
[10:47:21] ok!
[11:18:04] * aiko lunch!
[11:22:41] klausman: FYI, there's a new round of kernel reboots; the list of ML hosts is at https://phabricator.wikimedia.org/T366555 (all kernels/microcode updated, fleet-wide, "only" needs the reboots)
[11:38:54] Good morning all
[11:41:21] moritzm: roger!
[11:41:36] hey Chris!
[11:42:24] o/ Chris
[12:24:15] aiko: o/ re: python-mwapi - another option is to avoid "force_http" in mwapi and just set allow_redirects=False. Then the retry logic etc. would live in the inference-services repo, to separate concerns. Both approaches are good, but if you add "force_http" to mwapi, be sure to add docs about it, since without context the parameter may be confusing (like people asking "Why do we need to force http?")
[12:27:59] isaranto: re eswiki - one thing we could do is set revscoring-mp for problematic model servers (like eswiki, viwiki) and offload preprocess() to a separate Python process. We could also think about doing it selectively, when the "size" returned by the MW API is above a certain threshold (so we don't have to pay the serialization/deserialization penalty for "quick" rev-ids)
[12:32:19] elukey: ack, I can do that in staging and test the results
[12:32:38] I mean test with the current problematic rev ids
[12:33:36] makes sense yes, but the key I think is doing it selectively, otherwise we'll pay a lot of latency for the regular workload
[12:40:23] I was thinking of starting with viwiki
[12:49:19] this is what I mean https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1038765
[12:51:12] (PS1) Elukey: revscoring_model: inspect mw-api-cache for MP preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1038766 (https://phabricator.wikimedia.org/T363336)
[12:51:30] isaranto: --^ this is the extra bit that I meant above
[12:51:37] needs to be tested of course
[12:53:08] the main issue with PREPROCESS_MP (that we observed in the past) was that serialization/deserialization sometimes took longer than preprocess() and process() themselves
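
(Editor's note: a minimal sketch of the selective offload elukey describes above — run preprocess() in a separate process only when the revision is large. MWAPI_REVID_CONTENT_THRESHOLD_BYTES is the name used in the patch discussion, but its default value, the executor setup, and the preprocess() body here are hypothetical, not the actual revscoring_model code.)

```python
import asyncio
import concurrent.futures
import os

# Threshold named in the Gerrit patch; the default here is illustrative.
MWAPI_REVID_CONTENT_THRESHOLD_BYTES = int(
    os.environ.get("MWAPI_REVID_CONTENT_THRESHOLD_BYTES", 200_000)
)

# Separate worker process: arguments and results cross the process
# boundary via pickle, which is the serialization cost discussed in the chat.
_pool = concurrent.futures.ProcessPoolExecutor(max_workers=1)


def preprocess(rev_data: dict) -> dict:
    # Stand-in for the CPU-heavy revscoring feature extraction.
    return {"features": len(str(rev_data))}


async def maybe_offloaded_preprocess(rev_data: dict) -> dict:
    # "size" is the revision size in bytes as reported by the MW API.
    if rev_data.get("size", 0) > MWAPI_REVID_CONTENT_THRESHOLD_BYTES:
        # Large revision: offload so the event loop stays responsive,
        # accepting the pickle round-trip penalty.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(_pool, preprocess, rev_data)
    # Quick rev-ids run in the main loop and skip serialization entirely.
    return preprocess(rev_data)
```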
[12:53:36] what's the serialization format? pickle? Or JSON?
[12:53:39] so we could offload preprocess selectively, to allow quick rev-ids to be executed in the main loop
[12:53:42] pickle
[12:54:05] process-to-process message passing, basically
[12:54:11] Hm, unlikely to find something much faster unless we went superfancy, like protobuf
[12:56:14] elukey: okk, now I got it. this is much better
[12:56:41] isaranto: does it make sense? I am not 100% sure that "size" can be accessed that way, but it should be available
[12:57:12] we could tune MWAPI_REVID_CONTENT_THRESHOLD_BYTES based on the use cases
[12:57:50] klausman: ray does something like that; IIRC kserve supports it natively (sort of), but when Aiko checked, it needed a separate server process to handle all the heavy lifting
[12:58:03] not sure if things are better today
[12:58:28] we use Python's built-in message passing rather than ray
[13:03:43] yeah, it's probably worth it just for the smaller maintenance overhead
[13:08:43] it makes sense; I too don't know if size can be accessed that way, but I can check
[13:32:51] I'm building the above patch locally to test it
[13:35:53] <3
[14:28:21] Machine-Learning-Team: Test Revert Risk model with the transparent config - https://phabricator.wikimedia.org/T366250#9859905 (isarantopoulos) a: achou
[14:47:36] Machine-Learning-Team, Patch-For-Review: Add Istio (and related) config to allow LW isvcs to talk to ML Cassandra machines - https://phabricator.wikimedia.org/T360428#9859995 (klausman) Open→Resolved
[14:56:15] elukey: yeah it's a bit weird to have force_http in mwapi, probably we are the only ones who need it lol. I'll look into the option you mentioned!
[15:05:17] aiko: nono, the solution is good as well, it encapsulates all the logic in there; maybe we could check if the logic is needed only in mwapi or elsewhere too
[15:05:35] if so, it may be better to have a separate util in inference-services
[15:05:41] so we'll reuse/DRY more code etc..
[15:14:52] ok sounds good :)
[15:20:44] Machine-Learning-Team, Goal: 2024 Q4: Users can "pip install liftwing" and access 20% of models - https://phabricator.wikimedia.org/T359140#9860217 (isarantopoulos) We have added request payload validation with pydantic and are currently adding more models to the package.
[15:23:31] aiko: this is the error I was getting in HF for Mistral https://phabricator.wikimedia.org/T365246#9835940
[15:26:43] I haven't had time to look into it, but imo there is not much to debug on our side since we are using third-party code. So I would just check whether a) the new Mistral version works, or b) there is a fix in the kserve or transformers package (or fastapi) that would come with an update
[15:26:55] for now I'm just focusing on using a 7B model that would work
[15:27:00] out of the box, I mean
[15:58:17] isaranto: ack! thank uuu
[16:22:33] elukey: I tested the patch you provided earlier and made it work with some modifications so that we can access the "size" field properly
[16:22:38] this is the change https://phabricator.wikimedia.org/P64026
[16:22:47] shall I modify the patch directly?
[16:22:58] ah nice, yes please!
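
(Editor's note: a sketch of how the "size" field can be pulled out of a MediaWiki API response, which is what the fix in P64026 is about. It assumes a standard action=query&prop=revisions&rvprop=size&formatversion=2 response shape; the helper name is hypothetical and this is not the code from the paste.)

```python
def get_rev_size(mwapi_response: dict) -> int:
    """Return the revision size in bytes from a MW API query response
    shaped like: {"query": {"pages": [{"revisions": [{"size": N}]}]}}."""
    pages = mwapi_response.get("query", {}).get("pages", [])
    if not pages:
        return 0
    revisions = pages[0].get("revisions", [])
    return revisions[0].get("size", 0) if revisions else 0


# Example: a size above the threshold would route the request to the
# multiprocessing preprocess path.
resp = {"query": {"pages": [{"revisions": [{"size": 523_114}]}]}}
print(get_rev_size(resp))  # 523114
```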
[16:35:56] (PS2) Ilias Sarantopoulos: revscoring_model: inspect mw-api-cache for MP preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1038766 (https://phabricator.wikimedia.org/T363336) (owner: Elukey)
[16:36:16] (PS3) Ilias Sarantopoulos: revscoring_model: inspect mw-api-cache for MP preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1038766 (https://phabricator.wikimedia.org/T363336) (owner: Elukey)
[16:36:56] (CR) CI reject: [V:-1] revscoring_model: inspect mw-api-cache for MP preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1038766 (https://phabricator.wikimedia.org/T363336) (owner: Elukey)
[16:36:59] I submitted the above --^ but need to test it a bit more to make sure no errors fall through the cracks
[16:37:48] (PS4) Ilias Sarantopoulos: revscoring_model: inspect mw-api-cache for MP preprocess [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1038766 (https://phabricator.wikimedia.org/T363336) (owner: Elukey)
[16:38:01] going afk folks o/
[16:40:07] o/
[16:40:19] lemme know if you want to brainbounce about testing over the next few days
[16:48:11] Ok, thank you!
[17:07:35] o/
[21:32:09] Machine-Learning-Team: Using LiftWing on non wikimedia wikis - https://phabricator.wikimedia.org/T366654 (Nicolas_NALLET) NEW
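
(Editor's note: to close the loop on the redirect discussion from earlier ([12:24:15], [14:56:15]), here is a sketch of the alternative elukey suggested: leave python-mwapi untouched, set allow_redirects=False, and keep the retry logic in inference-services. The helper name and the aiohttp-based shape are assumptions; aiko's actual change is the python-mwapi commit linked at [09:52:32].)

```python
import aiohttp


async def get_with_manual_redirects(session: aiohttp.ClientSession,
                                    url: str,
                                    max_redirects: int = 2,
                                    **request_params) -> dict:
    # Disable automatic redirect following so the caller controls the
    # retry behaviour (e.g. deciding whether to follow an https->http hop).
    for _ in range(max_redirects + 1):
        async with session.get(url, allow_redirects=False,
                               **request_params) as resp:
            if resp.status in (301, 302, 303, 307, 308):
                url = resp.headers["Location"]
                continue
            resp.raise_for_status()
            return await resp.json()
    raise RuntimeError(f"too many redirects for {url}")
```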