[05:57:04] (CR) Kevin Bazira: [C: +1] revert-risk: add revert-risk wikidata model server [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/917875 (https://phabricator.wikimedia.org/T333125) (owner: AikoChou)
[07:20:06] (CR) Hashar: "recheck after CI config deployment" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/917875 (https://phabricator.wikimedia.org/T333125) (owner: AikoChou)
[08:19:45] (CR) Elukey: [C: +1] revert-risk: add revert-risk wikidata model server (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/917875 (https://phabricator.wikimedia.org/T333125) (owner: AikoChou)
[08:27:37] Lift-Wing, Machine-Learning-Team: Move Revert-risk multilingual model from staging to production - https://phabricator.wikimedia.org/T333124 (elukey) I think that we could create a new `revertrisk` generic kubernetes namespace and deploy Revert risk model servers / isvcs to it, what do you think? @klaus...
[08:27:42] hello folks :)
[08:28:26] klausman: o/ pinged you in T333124, let me know if you have time during the next days to sync with Aiko about what to do etc..
[08:36:06] elukey: ack, looking
[08:36:27] elukey: did you see the disk space warnings for the orespoolcounter machines? 2003, 1004 and 1003
[08:37:22] nope interesting, I think I don't have the hilight for orespoolcounter
[08:37:27] Lift-Wing, Machine-Learning-Team: Move Revert-risk multilingual model from staging to production - https://phabricator.wikimedia.org/T333124 (klausman) >>! In T333124#8843788, @elukey wrote: > @klausman do you have time to work with Aiko to push this to production during then next days? Yep, can do! I a...
[08:37:37] elukey: I have this pinned in a tab https://alerts.wikimedia.org/?q=%40state%3Dactive&q=instance%3D~%28%5Eml%7C%5Eores%29
[08:37:50] nice :)
[08:38:16] it even autorefreshes all by itself (dots at the top)
[08:38:29] wow 7G of root partition
[08:38:33] /o\
[08:38:52] apt-get clean seems to work, doing it on all nodes
[08:39:07] I remember installing systems with 1G disks and thinking "plenty of space for /, /usr and /home"
[08:39:24] But modern times happened :D
[08:39:54] Hm. Should we (as in WMF SRE) consider auto-running apt clean once a day or so?
[08:42:31] In general it doesn't give a big relief, since we have way more space on root partitions (40+GBs at least), but it could be an idea..
[08:42:56] once a day may be too much though, once a month could be good as well
[08:43:11] not sure if there are downsides, like people relying on old debs etc..
[08:43:35] (it is handy sometimes to find them in emergency situations, IIRC apt-get clean deletes them)
[08:46:28] (CR) Klausman: [C: +1] revert-risk: add revert-risk wikidata model server (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/917875 (https://phabricator.wikimedia.org/T333125) (owner: AikoChou)
[08:48:36] I agree that the "accidental backup" scenario is not unrealistic, but it also indicates that maybe a central backup of every package would be useful. For Debian packages it already exists (snapshot.debian.org), and for our own packages, it _should_ exist (right?)
[09:15:41] maybe yes, but again the practical purposes to justify and maintaining it may not be a lot (it was probably discussed in the past)
[09:15:56] ETOOMANYTHINGS :)
[09:20:42] It clearly doesn't break things often enough to be a priority :)
[09:39:29] exactly yes :)
[09:41:19] Machine-Learning-Team, Epic: Add meaningful access logs to KServe's pods - https://phabricator.wikimedia.org/T333804 (elukey)
[09:41:37] Machine-Learning-Team, Epic: Add meaningful access logs to KServe's pods - https://phabricator.wikimedia.org/T333804 (elukey) https://github.com/kserve/kserve/pull/2782 got merged, it will be released in a few weeks with KServe 0.11
[09:49:31] (CR) AikoChou: [C: +2] revert-risk: add revert-risk wikidata model server (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/917875 (https://phabricator.wikimedia.org/T333125) (owner: AikoChou)
[09:56:12] Lift-Wing, Machine-Learning-Team: Move Revert-risk multilingual model from staging to production - https://phabricator.wikimedia.org/T333124 (achou) a:achou
[09:56:52] (Merged) jenkins-bot: revert-risk: add revert-risk wikidata model server [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/917875 (https://phabricator.wikimedia.org/T333125) (owner: AikoChou)
[10:10:01] Machine-Learning-Team, Epic: Add meaningful access logs to KServe's pods - https://phabricator.wikimedia.org/T333804 (elukey) From https://github.com/benoitc/gunicorn/issues/2457 I created: ` '{"remote_address": "%(h)s", "user_name": "%(u)s", "date": "%(t)s", "status": "%(s)s", "method": "%(m)s", "url_p...
[10:15:41] Machine-Learning-Team, Epic: Add meaningful access logs to KServe's pods - https://phabricator.wikimedia.org/T333804 (elukey) Found a corner case (not a big blocker): https://github.com/Kludex/asgi-logger/issues/40
[10:16:05] going afk for lunch in a bit
[10:17:52] Lift-Wing, Machine-Learning-Team: Move Revert-risk multilingual model from staging to production - https://phabricator.wikimedia.org/T333124 (achou) Yess, let's create a new `revertrisk` generic Kubernetes namespace and deploy the model to it! @klausman, please let me know when you finish adding the new...
[10:30:06] TIL -> https://wikitech.wikimedia.org/wiki/Scap
[10:56:56] (CR) Ilias Sarantopoulos: feat: use Lift Wing instead of ORES (1 comment) [extensions/ORES] - https://gerrit.wikimedia.org/r/910439 (https://phabricator.wikimedia.org/T319170) (owner: Ilias Sarantopoulos)
[12:29:44] Machine-Learning-Team, Add-Link, Growth-Team (Current Sprint), User-notice: Deploy "add a link" to 9th round of wikis - https://phabricator.wikimedia.org/T308134 (Sgs) >>! In T308134#8841112, @Trizek-WMF wrote: > Let's go then with `gor` + all round 9 (except `jbo` and `ik`). Can we deploy next W...
[14:16:22] Morning all
[14:17:08] hello!
[14:31:09] hey!
[14:42:33] Machine-Learning-Team, MediaWiki-extensions-ORES, Patch-For-Review: Move backend of ORES MediaWiki extension to Lift Wing - https://phabricator.wikimedia.org/T319170 (isarantopoulos) Did some research how we can test the changes and there are two options: - 1. Deployment-prep beta cluster. There...
[15:29:56] (PS2) AikoChou: revert-risk: fix session host for the wikidata model [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/919153 (https://phabricator.wikimedia.org/T333125)
[16:01:46] (CR) Ilias Sarantopoulos: revert-risk: fix session host for the wikidata model (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/919153 (https://phabricator.wikimedia.org/T333125) (owner: AikoChou)
[16:13:54] Machine-Learning-Team, MediaWiki-extensions-ORES, Patch-For-Review: Move backend of ORES MediaWiki extension to Lift Wing - https://phabricator.wikimedia.org/T319170 (isarantopoulos) Current status of patchdemo for ORES {F36992433}
[16:18:05] * elukey afk!
[23:54:56] Machine-Learning-Team, Add-Link, Growth-Team (Current Sprint), User-notice: Deploy "add a link" to 8th round of wikis - https://phabricator.wikimedia.org/T308133 (Etonkovidova) Open→Resolved Checked `guwiki`, `galwiki`, `gotwiki`, and `fjwiki` - "add a link" works as expected ( a new acco...