[05:41:09] Morning!
[06:26:12] Need to run an errand, be back in a bit
[08:01:51] Machine-Learning-Team: Increased latencies with Kserve 0.11.1 (cgroups v2) - https://phabricator.wikimedia.org/T349844 (isarantopoulos)
[08:29:38] o/
[08:51:53] I am wondering if I can send a PR to catboost, the xgboost code for cgroups v2 seems to be easy enough
[09:18:22] that would be great I think! lemme know if I may help somehow
[10:21:19] it is not super easy, the catboost folks seem to use a completely different set of libs/style for cpp
[10:25:04] This is https://github.com/catboost/catboost, right?
[10:26:58] If so, the relevant code is probably this function: https://github.com/catboost/catboost/blob/a17aed2b2c01c343cea1bb2a1908ee9e6792efd2/util/system/info.cpp#L83
[10:29:49] I think the way the ifdefs work is that lines 85-87 are trumping the branches at 97 and 99-146. Whether the block at 97 or the one at 99-146 is preferable I am not sure. I don't have a good grasp of whether 85-87 is Cgroups2-compatible or not.
[10:33:09] klausman: we already have https://github.com/catboost/catboost/issues/2518
[10:33:31] it is not cgroups v2 compatible, xgboost had a similar problem and they fixed it
[10:33:39] Ah, I see
[10:33:47] my point was that their patch may need to be adapted for catboost
[10:33:55] because of different libs/style
[10:40:23] I'm such a noob!
[10:40:32] * klausman lunch and an errand
[10:50:11] aiko: o/ - IIUC we don't have a way anymore to test sending events from Lift Wing to eventgate in staging, right?
[10:50:41] from what I can see if we try to send any test event in staging it will generate an event in the same eventgate stream as prod :(
[10:51:01] not a big one but we should add a test event stream, wdyt?
[10:51:50] * elukey lunch
[10:54:46] iirc aiko isn't here today. On the stream topic I agree. Would it have to be in the prod infra or is there a separate staging area for streams?
[11:31:59] fixed the memory alert! 🤞
[11:36:22] https://gerrit.wikimedia.org/r/c/operations/alerts/+/963724/
[11:38:18] Now I can have lunch!
[11:38:20] * isaranto lunch
[12:28:27] elukey: you can send an event to staging, you just need to the staging eventgate instance. they produce to 'staging.' prefixed topics
[12:29:11] need to hit*
[12:30:42] there's no external route to the pod though, hmm, you'd have to hit the current pod IP? I think?
[12:32:16] e.g. for eventgate-main staging
[12:32:17] curl -k 'https://10.64.75.71:4492/_info'
[12:51:04] Morning all!
[12:51:12] Hey ottomata
[12:54:59] ottomata: ah nice TIL! So we'd need an LVS service in front of it?
[12:55:08] (we hit the eventgate endpoint from another cluster)
[12:58:57] mmm in theory there should be the staging lvs endpoint
[12:58:58] lemme check
[13:04:58] elukey@deploy2002:~$ curl https://staging.svc.eqiad.wmnet:4492/_info
[13:04:59] {"name":"eventgate","version":"1.8.3","description":"Event intake service - POST JSONSchemaed events, validate, and produce.","home":"https://github.com/wikimedia/eventgate"}
[13:05:02] ottomata: --^
[13:05:14] it is a simple CNAME but it should work
[13:05:36] eventgate is still not using the istio gateway, but for the moment it seems fine
[13:05:45] (the ingress module in deployment-charts basically)
[13:05:54] ok sending a patch, thanks!
[13:06:47] morning Chris!
[13:06:54] but I need to follow up with serviceops, without discovery endpoint it may be difficult
[13:07:18] curl https://staging.svc.codfw.wmnet:4492/_info hangs sigh
[13:08:22] thanks Andrew! TIL for me as well
[13:11:06] going to open a task
[13:59:31] bbiab, doing some suitcase packing and stuff
[13:59:54] Bring shoes you can walk in!
[14:01:50] I always do :)
[14:16:35] Machine-Learning-Team: Increased latencies with Kserve 0.11.1 (cgroups v2) - https://phabricator.wikimedia.org/T349844 (isarantopoulos) I started doing some prep-work on knowledge_integrity to update xgboost https://gitlab.wikimedia.org/repos/research/knowledge_integrity/-/tree/update-xgboost One thing that...
[14:44:10] logging off folks, lots of stuff to do! have a great weekend!
[14:52:32] night isaranto
[15:43:11] Machine-Learning-Team: Apply common settings to publish events from Lift Wing staging to EventGate - https://phabricator.wikimedia.org/T349919 (elukey)
[15:43:33] tried to summarize the events questions/doubts in --^
[15:43:38] isaranto: have a safe flight!
[15:43:52] thanks elukey
[15:48:03] aiko, klausman have a safe flight! Enjoy the offsite :)
[16:01:19] going afk folks, have a nice weekend!
[16:58:31] nice elukey!
[18:23:21] Machine-Learning-Team: Apply common settings to publish events from Lift Wing staging to EventGate - https://phabricator.wikimedia.org/T349919 (Ottomata) > no discovery endpoint for wikikube staging Perhaps it is possible to make one?
[18:24:21] Machine-Learning-Team: Apply common settings to publish events from Lift Wing staging to EventGate - https://phabricator.wikimedia.org/T349919 (Ottomata) Or, we could deploy a new eventgate-staging|eventgate-dev|eventgate-test instance to eqiad and codfw that produces to kafka test or something!
[22:11:11] (PS3) Jdlrobson: Don't use live configuration [extensions/ORES] - https://gerrit.wikimedia.org/r/957970 (https://phabricator.wikimedia.org/T345922) (owner: Jsn.sherman)
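
Note on the cgroups v2 thread (T349844, catboost issue 2518): a minimal Python sketch of how a container's CPU limit can be read under cgroups v2 versus v1. This is not the xgboost patch and not catboost's info.cpp logic; the function names and the default mount path are assumptions for illustration, and it assumes the process sees its own cgroup at the root of the cgroup mount, which is typical inside a container but not guaranteed.

```python
import math
import os
from pathlib import Path
from typing import Optional


def cgroup_cpu_limit(root: str = "/sys/fs/cgroup") -> Optional[float]:
    """Return the cgroup CPU limit in cores, or None if unlimited or unknown.

    Hypothetical helper: `root` is assumed to be the cgroup mount point
    as seen from inside the container.
    """
    base = Path(root)

    # cgroups v2: a single file "cpu.max" holding "<quota> <period>",
    # where quota is the literal string "max" when no limit is set.
    v2_file = base / "cpu.max"
    if v2_file.is_file():
        quota, period = v2_file.read_text().split()
        if quota == "max":
            return None
        return int(quota) / int(period)

    # cgroups v1: two files under the cpu controller; quota is -1 when unlimited.
    quota_file = base / "cpu" / "cpu.cfs_quota_us"
    period_file = base / "cpu" / "cpu.cfs_period_us"
    if quota_file.is_file() and period_file.is_file():
        quota_us = int(quota_file.read_text())
        period_us = int(period_file.read_text())
        if quota_us > 0 and period_us > 0:
            return quota_us / period_us

    return None


def effective_cpu_count(root: str = "/sys/fs/cgroup") -> int:
    """Thread-pool size hint: host CPU count capped by the cgroup limit."""
    host_cpus = os.cpu_count() or 1
    limit = cgroup_cpu_limit(root)
    if limit is None:
        return host_cpus
    return max(1, min(host_cpus, math.ceil(limit)))
```

A library that only checks the v1 files finds nothing on a cgroups v2 host and falls back to the host CPU count, so it can start far more threads than the pod's CPU limit allows. That is one plausible way to end up with the kind of latency increase tracked in the task, though the log itself does not confirm the mechanism.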