[05:52:27] klausman: o/
[05:52:27] I've pushed a patch to fix the `InfServiceHighMemoryUsage` alerts fired by the article-descriptions model server:
[05:52:28] https://gerrit.wikimedia.org/r/1006870
[05:52:28] please review whenever you get a minute. thanks!
[06:37:27] (PS7) MPGuy2824: Replace addQuotes() with expression builder [extensions/ORES] - https://gerrit.wikimedia.org/r/1004348 (https://phabricator.wikimedia.org/T350986)
[06:39:12] (CR) CI reject: [V: -1] Replace addQuotes() with expression builder [extensions/ORES] - https://gerrit.wikimedia.org/r/1004348 (https://phabricator.wikimedia.org/T350986) (owner: MPGuy2824)
[06:42:30] (PS8) MPGuy2824: Replace addQuotes() with expression builder [extensions/ORES] - https://gerrit.wikimedia.org/r/1004348 (https://phabricator.wikimedia.org/T350986)
[06:44:34] (CR) CI reject: [V: -1] Replace addQuotes() with expression builder [extensions/ORES] - https://gerrit.wikimedia.org/r/1004348 (https://phabricator.wikimedia.org/T350986) (owner: MPGuy2824)
[06:50:47] (PS9) MPGuy2824: Replace addQuotes() with expression builder [extensions/ORES] - https://gerrit.wikimedia.org/r/1004348 (https://phabricator.wikimedia.org/T350986)
[06:52:32] (CR) CI reject: [V: -1] Replace addQuotes() with expression builder [extensions/ORES] - https://gerrit.wikimedia.org/r/1004348 (https://phabricator.wikimedia.org/T350986) (owner: MPGuy2824)
[06:53:50] (PS10) MPGuy2824: Replace addQuotes() with expression builder [extensions/ORES] - https://gerrit.wikimedia.org/r/1004348 (https://phabricator.wikimedia.org/T350986)
[06:57:32] (CR) MPGuy2824: Replace addQuotes() with expression builder (2 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/1004348 (https://phabricator.wikimedia.org/T350986) (owner: MPGuy2824)
[09:24:55] Morning!
[09:24:59] kevinbazira: +1'd
[09:26:50] thanks klausman. going to merge and deploy ...
[10:17:50] (CR) Ladsgroup: [C: +2] Replace addQuotes() with expression builder [extensions/ORES] - https://gerrit.wikimedia.org/r/1004348 (https://phabricator.wikimedia.org/T350986) (owner: MPGuy2824)
[10:21:08] (Merged) jenkins-bot: Replace addQuotes() with expression builder [extensions/ORES] - https://gerrit.wikimedia.org/r/1004348 (https://phabricator.wikimedia.org/T350986) (owner: MPGuy2824)
[11:09:14] Machine-Learning-Team, Patch-For-Review: Move the article-descriptions model server from staging to production - https://phabricator.wikimedia.org/T358467#9582988 (kevinbazira) The article-descriptions model server was firing `InfServiceHighMemoryUsage` alerts. This usually happens when an isvc uses >90%...
[11:25:57] Machine-Learning-Team: Create external endpoint for article-descriptions isvc hosted on LiftWing - https://phabricator.wikimedia.org/T358654 (kevinbazira)
[11:31:03] * klausman lunch
[11:33:54] Machine-Learning-Team: Set SLO for the article-descriptions isvc hosted on LiftWing - https://phabricator.wikimedia.org/T358655 (kevinbazira)
[12:14:56] Machine-Learning-Team: Create external endpoint for article-descriptions isvc hosted on LiftWing - https://phabricator.wikimedia.org/T358654#9583178 (klausman) a:klausman
[12:40:12] hello folks :)
[13:44:29] Machine-Learning-Team, Epic: Epic: Implement prototype inference service that uses Cassandra for request caching - https://phabricator.wikimedia.org/T356256#9583531 (klausman) >>! In T356256#9583456, @elukey wrote: > * What is the schema selected for the data stored in Cassandra? We should document it in...
[13:44:37] hey luca, welcome back!
[13:44:50] thanks!
[13:45:48] I've tried to answer your questions re: the Cass code, but I think a proper overview is probably best done via VC
[13:48:46] nono it is fine, I wanted to get the idea.. for the schema I am also curious to know the primary/partition key, and the replication scheme (no problem if we don't have them now, but it is good to keep them in mind since it will matter when we evaluate performance)
[13:49:14] The primary/part key I used so far is lang/revid/model_version
[13:49:35] And since I used only my home server for Cass, the replication factor was 1 :)
[13:50:35] re: cache nodes - it is fine to have everything hosted in our servers, but IIRC we had some chats with Data Persistence since SRE tried in the past to consolidate all Cassandra workloads. If we manage to experiment and then reach a common ground that they handle, it is better long term (namely we don't need to care about Cassandra etc..)
[13:51:36] True. Doing it on our own infra first lets us iterate faster, but longterm, "not having to care" is probably preferable (and it frees up the Hw to use them for something like Feast)
[13:51:48] exactly yes
[13:52:09] nothing really urgent, but let's keep it in mind
[13:52:29] Good morning!
[13:52:34] Elukey!
[13:52:39] o/
[13:53:49] Welcome back!
[13:55:32] thanks!
[13:56:13] What are we talking about?
[13:56:35] Using someone else’s Cassandra nodes?
[13:58:55] As a longer-term thing, yes
[13:59:10] It'd be one less system to maintain, once we have figured out how to best use it
[14:02:09] IIRC the Data Persistence team (SRE) consolidated all cassandra usages in clusters that they manage, including AQS that moved from analytics/DE specific to general purpose (kind-of). At the time they were not very fond of a revision cache, and we didn't really agree on a way to go, but it is worth following up
[14:03:46] I see that we are getting MI210 wooww
[14:03:50] Okay cool
[14:03:54] Yeah we are
[14:05:10] I had to strip the lift wing expansion budget to do it.
[14:05:51] But data center ops really does not want to order only GPUs and put them in old boxes
[14:06:18] So they are all U2 new boxes
[14:07:06] sigh
[14:07:32] It’s been a long conversation
[14:07:44] chrisalbon: I saw https://huggingface.co/allenai/OLMo-7B a while ago, was there anybody from Allen AI in Bellagio?
[14:07:53] seems really interesting
[14:08:14] No, but I agree I want to see what it can do
[14:09:01] on paper it looks really amazing
[14:10:00] The community would love a truly open source llm
[14:11:28] yes definitely
[14:12:48] welcome back Luca! <3
[14:12:55] hello Aikoooo /
[14:12:56] o/
[14:12:57] :)
[14:33:26] Machine-Learning-Team, Patch-For-Review: Update to KServe 0.11 - https://phabricator.wikimedia.org/T337213#9583791 (elukey) a:klausman→elukey
[14:34:13] Machine-Learning-Team, Patch-For-Review: Update to KServe 0.11 - https://phabricator.wikimedia.org/T337213#9583792 (elukey) Taking over from Tobias to free some tasks from his queue since I am just getting back to work (need something technical to do :D)
[15:03:55] heads up: I'll be draining ml-serve2006 for some super short network stuff on the hour (in ~ an hour, that is), downtime should be only a few minutes
[15:12:04] Machine-Learning-Team, Structured-Data-Backlog: Host a logo detection model for Commons images - https://phabricator.wikimedia.org/T358676 (mfossati)
[16:05:02] folks I am going to deploy the kserve 0.11.2 control plane in staging
[16:05:59] :+1:
[16:06:17] thanks elukey
[16:11:17] of course I missed https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1007391
[16:11:24] ml-serve2006 is back in service
[16:11:44] super
[16:11:50] +1'd the docker image change
[16:11:54] <3
[16:12:05] waiting for CI and then I'll deploy
[16:22:25] ah lovely
[16:22:26] /usr/bin/manager: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /usr/bin/manager)
[16:23:31] ah I see, we build on golang1.21 that is bookworm
[16:25:03] kubernetes still doesn't like me
[16:26:36] That is how k8s welcomes you back
[16:27:01] definitely
[17:01:33] golang1.21 is my bad, I just bumped ahead because golang 1.x code will work on any golang 1.y where y>=x, but I completely forgot the glibc dep
[17:01:50] I think the new kserve needs 1.20 or 1.19.
[17:02:27] alternative is to build it with 1.21 on bullseye
[17:03:07] There probably isn't a deb, but a golang toolchain can be installed relatively easily. If you want help with that, lmk.
[17:04:16] yes yes I recall, I filed https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/1007400/ but I need to test it
[17:04:16] https://go.dev/doc/install has instructions that should work if we want to do that
[17:04:46] basically all bumped to bookworm anyway, plus CGO_ENABLED=0
[17:04:51] does it make sense?
[17:05:09] ah, that is the other option, if none of the kserve stuff links to C code.
[17:05:16] So yes.
[17:06:22] Mh. Actually I am not 100% sure about the CGO env var disabling the glibc linkage, but I think it does
[17:06:22] in theory no, but shouldn't it force to have glibc statically linked?
[17:06:30] yes yes it should
[17:07:16] May bite us if they ever get a C library dep, but we can cross that bridge then
[17:07:49] yes yes
[17:18:25] elukey: I know 1007400 is not in review yet, but I'm gonna +1 it since I need to run an errand. bbiab :)
[17:26:28] thanks! Will test it tomorrow in a better way, not sure about pipx
[17:27:51] going afk as well, o/
[17:53:21] have a nice evening folks o/
[18:14:48] (InfServiceHighMemoryUsage) firing: (2) High Memory usage detected in Inference Service - https://wikitech.wikimedia.org/w/index.php?title=Machine_Learning/LiftWing/Alerts#Inference_Services_High_Memory_Usage_-_InfServiceHighMemoryUsage_alert - https://alerts.wikimedia.org/?q=alertname%3DInfServiceHighMemoryUsage
[19:45:04] Machine-Learning-Team, Wikipedia-Android-App-Backlog: Investigate increased preprocessing latencies on LW of article-descriptions model - https://phabricator.wikimedia.org/T358195#9585334 (Isaac) > Can we investigation reducing the computational need to just the language requested? The model definitely b...
[22:14:48] (InfServiceHighMemoryUsage) firing: (2) High Memory usage detected in Inference Service - https://wikitech.wikimedia.org/w/index.php?title=Machine_Learning/LiftWing/Alerts#Inference_Services_High_Memory_Usage_-_InfServiceHighMemoryUsage_alert - https://alerts.wikimedia.org/?q=alertname%3DInfServiceHighMemoryUsage
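
Note on the `GLIBC_2.32' not found error from 16:22:26: a minimal Python sketch of the version check that the dynamic loader is effectively doing, and why a binary built on bookworm fails on an older base image. The glibc versions below are assumptions, not taken from the log (Debian bullseye ships glibc 2.31, bookworm ships 2.36), and it is only inferred from the 17:02:27 message that the runtime image is bullseye. The helper names `parse_version` and `glibc_satisfies` are hypothetical, for illustration only.

```python
import platform
import re


def parse_version(text):
    """Turn '2.32' or 'GLIBC_2.32' into a comparable tuple such as (2, 32)."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def glibc_satisfies(required, available):
    """True if the available glibc is at least the required symbol version."""
    return parse_version(available) >= parse_version(required)


# Symbol version from the error message in the log.
REQUIRED = "GLIBC_2.32"
# Assumed distro glibc versions (not stated in the log).
BULLSEYE_GLIBC = "2.31"
BOOKWORM_GLIBC = "2.36"

if __name__ == "__main__":
    # Runtime image assumed to be bullseye: the loader cannot satisfy the symbol.
    print("bullseye ok:", glibc_satisfies(REQUIRED, BULLSEYE_GLIBC))
    # Build image is bookworm: the same binary would load there.
    print("bookworm ok:", glibc_satisfies(REQUIRED, BOOKWORM_GLIBC))
    # Check the glibc of the machine running this script, if it reports one.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print("local ok:", glibc_satisfies(REQUIRED, version))
```

Building with CGO_ENABLED=0, as proposed at 17:04:46, sidesteps the check altogether for pure-Go code: the resulting binary does not link against libc.so.6, so no `GLIBC_x.y` symbol version is required at load time.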