[08:09:01] good morning folks!
[08:09:20] I started https://github.com/kserve/kserve/pull/2782 with upstream to get access logs in our pods, it is not very simple sigh
[08:23:07] good morning!
[08:24:03] Nice work Luca!
[08:24:10] I will review it as well
[08:24:31] kalimera! :)
[08:24:40] thanks a lot, it is the only way I found
[08:24:50] asgi is still a little painful for this kind of thing
[10:30:18] * elukey lunch!
[11:32:13] fyi: wiki-gpt is not working cause I'm currently upgrading it
[11:32:38] for some reason when I access the app via ssh + toolforge the system seems read-only..
[11:34:10] sry not access. I mean I can't update. also webhook fails to push new code for the same reason
[11:42:48] same happens for all tools I try to access from toolforge. asked in wikimedia-cloud (just writing here for context)
[11:44:28] nevermind, seems I just bumped into toolforge maintenance window. all good
[11:45:30] elukey: how may I help? The issue with kserve is probably going to take some time with the PR (I mean reviews etc).
[11:47:51] we could add python 3.7 support for asgi-logger but it doesn't make sense since it reaches EOL soon. So we could remove 3.7 support from AIX
[12:45:24] * isaranto l8 lunch
[13:22:11] Good morning!
[13:26:03] morning Chris!
[13:32:35] morning!
[13:33:16] isaranto: not sure what is best, IIUC the main issue is that AIX needs an ancient version of tensorflow, maybe I can open a gh issue to them explaining that 3.7 is EOLed
[13:33:20] good point, doing it :)
[13:33:25] does the rest look sound
[13:33:26] ?
[13:34:15] (there is no activity since nov 2022, sigh)
[13:35:55] https://github.com/Trusted-AI/AIX360/issues/172
[13:46:03] the rest looks good
[13:46:29] ack thanks a lot for the review
[13:46:32] I was trying to see if there was any other way to do it but it sucks that logging is so hardcoded
[13:46:53] I'll take a closer look at the review as well
[13:47:13] I made a summary in https://github.com/kserve/kserve/pull/2782#issuecomment-1493972637
[13:47:38] https://github.com/encode/uvicorn/pull/947#issuecomment-1283006367 is not great, and IIUC asgi-logger is not that configurable
[13:47:52] but in theory logging to stdout should be fine
[13:48:42] it should be captured by rsyslog rules for pods, and shipped to logstash
[13:50:57] yeah let's w8 for the review. one thing that is not nice/ideal is the addition of an external dependency to such an integral part (asgi-logger)
[13:51:36] ofc I understand the reasoning. just noting
[13:52:19] oh yes I don't like it either, but it seems the only way.. at least until the uvicorn folks do something
[13:52:47] I don't like the fact that the KServe folks shipped 0.10 with a broken logging config
[13:53:12] Machine-Learning-Team, DBA, Data-Engineering, Infrastructure-Foundations, and 9 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (Vgutierrez)
[14:02:50] isaranto: Joe is on holiday for a couple of weeks, so I think we can proceed manually with ores-legacy
[14:02:54] removing what we don't need
[14:05:08] then we can proceed with the other patch -> https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/904178
[14:06:11] I need some help checking if the ingress section is correct (in the values.yaml)
[14:08:51] isaranto: not sure if it is the right way, it is very different from the _scaffold dir example in the deployment-charts repo
[14:09:23] we could use what sextant produced, removing the php-fpm bits etc..
[14:24:30] cool
[14:24:52] I'm checking the helmfile output of both ways in order to compare
[14:25:45] the truth is if we go with sextant we should use a new model e.g. python-web-app/service
[14:27:21] all charts have been migrated to the new format IIRC, that is the one with modules
[14:28:13] the reference should be the _scaffold dir in theory
[14:28:19] as long as we follow that one we should be good
[14:31:57] this is what I produce with `helmfile tempalte -e ml-staging-codfw` using the sextant patch
[14:34:40] template instead of tempalte --^
[14:35:48] afaiu the php-fpm files etc (files under templates/vendor) are part of the model. I could remove them since we def don't need them. same goes for any lamp stack related templates
[14:41:21] yes yes but it is due to a new convention for building helm charts
[14:41:23] lemme find the task
[14:42:34] ok understood
[14:42:35] https://phabricator.wikimedia.org/T292818
[14:42:59] my point is that if we add a chart with an old format SRE may not like us a lot :D
[14:44:00] yes I remember reading this task
[14:44:12] yes, let's proceed with the new one with sextant then
[14:44:25] super, thanks for the patience
[14:44:34] I promise that I'll try to be quick in reviewing the patches :D
[14:44:56] no worries
[14:45:13] I'll just remove everything under template/vendor/lamp then since it will not be used
[14:49:36] yep yep
[14:49:47] if you want we can use php for ores legaby
[14:49:50] *legacy
[14:50:12] :D
[14:53:00] ok it is ready! https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/904777
[14:53:31] sure :D or we could just build liftwing-legacy to have it ready for the future :D
[14:53:59] yes this is material for the next planning
[15:04:27] ok, I used sextant to update dependencies after removing lamp configs and did a helm install on minikube and it still works fine
[15:27:41] elukey: I had just done these changes before u reviewed
[15:29:34] super
[15:31:33] isaranto: do we use any Secret?
[15:31:42] I am checking the helmfile.yaml config
[15:31:58] 👀 nope
[15:32:04] this is bad copy pasting...
[15:32:21] ah okok perfect, I was wondering if we needed any, we can remove it from there
[15:36:35] shall I leave it in the helm chart?
[15:36:37] isaranto: one thing that could be better in terms of reviews from serviceops as well.. could you please split the helmfile configs into another code change? So we have a single chart to review, and possibly a .fixtures directory with a simple test config to use
[15:36:49] isaranto: yeah we can leave it in there
[15:37:51] elukey: sure will split it
[15:41:14] shall I put the files of ores-legacy in the fixtures or some dummy ones?
[15:41:57] isaranto: anything is fine, even unicorn-service :)
[15:42:09] So we'll see what CI tells us etc..
[15:42:13] ack
[15:42:41] unicorn-service is a keeper
[15:42:52] nice wordplay as well
[15:43:46] I love it
[15:46:43] ok, I split the patch. will w8 for the patch with the chart to be reviewed and will then issue a new one with helmfile etc
[16:42:27] logging off folks, cu tomorrow!
[16:43:03] o/
[16:59:50] Machine-Learning-Team: Host open source LLM (bloom, etc.) on Lift Wing - https://phabricator.wikimedia.org/T333861 (calbon)
[17:28:23] * elukey afk!
[17:28:28] have a nice rest of the day folks
[21:12:16] Machine-Learning-Team, DBA, Data-Engineering, Infrastructure-Foundations, and 9 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (bking)
[21:45:51] wiki-gpt is up and running again!