[00:17:32] <wikibugs>	 (CR) CI reject: [V:-1] Remove a duplication of selectors in ext.ores.highlighter [extensions/ORES] - https://gerrit.wikimedia.org/r/1052157 (owner: Ebrahim)
[06:10:01] <isaranto>	 o/ good morning
[06:42:42] * isaranto afk be back in an hour
[06:58:19] <wikibugs>	 Machine-Learning-Team, Patch-For-Review: Support building and running of articlequality model-server locally - https://phabricator.wikimedia.org/T368875#9955703 (kevinbazira) Open→Resolved
[06:58:23] <wikibugs>	 Machine-Learning-Team, Patch-For-Review: Support building and running of articlequality model-server locally - https://phabricator.wikimedia.org/T368875#9955706 (kevinbazira) a:kevinbazira
[07:38:52] <wikibugs>	 Machine-Learning-Team: Reorganize LiftWing isvcs repo structure to improve maintainability - https://phabricator.wikimedia.org/T369344 (kevinbazira) NEW
[07:52:08] * isaranto back
[07:55:33] <elukey>	 hi folks!
[07:55:49] <elukey>	 knative images deployed on staging, let's see how they go
[07:56:49] <elukey>	 a little hiccup that I found - changing the net-istio config-map (removing the example) triggered updates to a lot of pods, and afaics it seems that the net-istio webhook pods were not available to answer TLS calls to validate etc..
[07:57:02] <elukey>	 the first time the deployment failed, the second it succeeded
[07:57:30] <elukey>	 I recall that we had a similar issue with the knative webhook, and IIRC we solved it by relaxing the readiness probe a little
[08:00:10] <elukey>	 from kubectl describe pod I don't see a readiness probe configured for the net-istio webhook, so maybe it is a default veeery quick and string
[08:00:13] <elukey>	 *strict
[08:05:13] <isaranto>	 hey Luca!
[08:05:55] <isaranto>	 ack! thanks for taking care of that
[08:09:00] <isaranto>	 elukey: is there anything we should do to check? run load tests or sth similar?
[08:09:52] <elukey>	 nono I think this is more related to when we change configmaps that are pushed in more places
[08:10:18] <elukey>	 we can test prod, if the deploy doesn't go through we can add some extra readiness tolerance
[08:10:43] <elukey>	 (Basically if the webhook isn't available soon after the deployment then helm considers it failed etc..)
[08:10:56] <isaranto>	 ok, clear!
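The "extra readiness tolerance" idea above amounts to giving a slow-starting component a longer deadline before a deploy is declared failed. A minimal Python sketch of that idea follows. It is not how Helm or Kubernetes implement their checks; `wait_until_ready` and all of its parameters are hypothetical names chosen for illustration.

```python
import time


def wait_until_ready(is_ready, timeout_s, interval_s=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or timeout_s has elapsed.

    Returns True if the component became ready within the deadline,
    False otherwise. clock and sleep are injectable so the function
    can be exercised without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            # Deadline passed without a successful check: report failure.
            return False
        sleep(interval_s)
```

With a short `timeout_s`, a webhook that needs a few extra seconds is reported as failed; raising the timeout lets the same rollout pass, which matches the fail-then-succeed pattern described in the channel.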
[09:17:52] <wikibugs>	 (CR) Matěj Suchánek: [WIP] Add AbuseFilter variable for revertrisk score (3 comments) [extensions/ORES] - https://gerrit.wikimedia.org/r/1051837 (https://phabricator.wikimedia.org/T364705) (owner: Kosta Harlan)
[10:46:03] * isaranto afk lunch
[11:43:46] <wikibugs>	 (CR) Ladsgroup: [C:+2] "try again" [extensions/ORES] - https://gerrit.wikimedia.org/r/1052157 (owner: Ebrahim)
[11:47:12] <wikibugs>	 (Merged) jenkins-bot: Remove a duplication of selectors in ext.ores.highlighter [extensions/ORES] - https://gerrit.wikimedia.org/r/1052157 (owner: Ebrahim)
[12:32:35] <isaranto>	 I found a way to fix the requirements in the hf image and create less of a mess when upgrading package versions: using a pip constraints file https://pip.pypa.io/en/stable/user_guide/#constraints-files
[12:33:19] <isaranto>	 I'm trying to make it work with blubber at the moment cause there is no native support (not the same way that we define requirements.txt)
[12:40:55] <isaranto>	 oh nevermind it won't work
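For context on the constraints files linked above: per the pip user guide, a constraints file only controls which version of a package is installed, not whether it is installed. The toy Python model below illustrates that rule. It is not pip's resolver; `parse_pins` and `apply_constraints` are hypothetical helpers that handle only bare names and `name==version` pins.

```python
def parse_pins(lines):
    """Parse 'name' or 'name==version' lines into {name: version_or_None}."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip() or None
    return pins


def apply_constraints(requirements, constraints):
    """Return the packages to install, pinned by constraints where present.

    Only packages listed in requirements are returned; a constraint on a
    package nobody asked for is ignored. Conflicting pins raise ValueError.
    """
    wanted = parse_pins(requirements)
    limits = parse_pins(constraints)
    resolved = {}
    for name, version in wanted.items():
        limit = limits.get(name)
        if version and limit and version != limit:
            raise ValueError(f"{name}: {version} conflicts with {limit}")
        resolved[name] = limit or version
    return resolved
```

In this model, requirements of `torch` and `transformers` with constraints pinning `torch` and `numpy` yield a pinned `torch`, an unpinned `transformers`, and no `numpy` at all. With real pip, a constraints file is passed as `pip install -r requirements.txt -c constraints.txt`.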
[13:21:44] <wikibugs>	 Machine-Learning-Team: Simplify dependencies in hf image - https://phabricator.wikimedia.org/T369359 (isarantopoulos) NEW
[13:23:10] <wikibugs>	 Machine-Learning-Team: Simplify dependencies in hf image - https://phabricator.wikimedia.org/T369359#9956678 (isarantopoulos)
[13:34:54] <wikibugs>	 Machine-Learning-Team: Simplify dependencies in hf image - https://phabricator.wikimedia.org/T369359#9956694 (isarantopoulos) In a previous [[ https://phabricator.wikimedia.org/T357986#9679664 | iteration ]] I wrongly thought that this behavior was done because of `torch` and `torch-rocm` having different m...
[13:35:10] <isaranto>	 I figured it out -^, it was much simpler than I thought
[13:46:34] <klausman>	 nice :) what a rollercoaster for a Friday :)
[13:48:12] <klausman>	 PYTHONPATH making life confusing was a constant in a previous job. Fortunately, for unrelated reasons, we switched the whole codebase to Go, which solved that problem
[13:49:56] * isaranto nods
[14:11:38] <isaranto>	 I need some help in the production-images repo (iirc I've had this issue in the past but don't remember what I did to solve it).
[14:12:06] <isaranto>	 when I run `docker-pkg -c config.yaml build images/` no images are built, although I haven't built them all
[14:12:37] <isaranto>	 or more specifically the command I remember I was using `docker-pkg -c config.yaml build images/ --select "*pytorch*"`
[14:19:31] <klausman>	 sec, let me access my secondary memory (.bash_history) :)
[14:21:15] <klausman>	 That command should work, AFAICT. But maybe you need to remove the already-built images in your local repo?
[14:21:31] <klausman>	 IIRC, docker-pkg will only build entirely-absent image versions
[14:22:22] <isaranto>	 I'm trying to build amd-pytorch-common image which doesn't exist
[14:22:43] <klausman>	 let me do a local experiment with current HEAD
[14:22:45] <isaranto>	 thanks for the answer Tobias! I'll just delete my local images to be sure
[14:25:58] <isaranto>	 nooo it doesn't work, and now I have to redownload all my docker images :(
[14:26:07] <klausman>	 damn
[14:26:13] <klausman>	 I can't get it to build, either
[14:26:14] <isaranto>	 going to check whether there's anything fancy going on with my docker-pkg installation
[14:32:34] <elukey>	 isaranto: is the changelog correctly bumped to a new version?
[14:33:58] <isaranto>	 I couldn't bump it cause I hadn't built the amd-pytorch-image https://phabricator.wikimedia.org/P65870
[14:34:50] <isaranto>	 now I don't have any image locally. hmm I was puzzled about how this works
[14:35:02] <isaranto>	 maybe if I just download the amd-pytorch-common image, then update the changelog and retry, it would work
[14:35:20] <elukey>	 can you share the full diff?
[14:35:24] <elukey>	 otherwise it is difficult
[14:35:46] <klausman>	 one trick to force a local build is to comment out the docker-registry in config.yaml. But naturally, for an actual update/bump the changelog needs an update and _then_ docker-pkg will DTRT
[14:36:10] <klausman>	 It's just a bit strict about not building images that it sees as already published.
[14:36:44] <isaranto>	 elukey: which diff? the git diff?
[14:37:33] <elukey>	 isaranto: the diff for the production-images repo that you are trying to build
[14:38:40] <elukey>	 from P65870 it seems that you have a change lined up
[14:39:14] <isaranto>	 ack
[14:39:58] <isaranto>	 this is the diff https://phabricator.wikimedia.org/P65870#263864
[14:40:21] <isaranto>	 I managed to solve it by commenting out the docker registry
[14:40:44] <isaranto>	 sorry for the hassle folks! 
[14:41:32] <isaranto>	 I needed the amd-pytorch-common image, which was not built ofc because it had no change, and for some reason downloading it manually didn't work
[14:41:44] <isaranto>	 thanks both!
[14:42:30] <elukey>	 ah interesting trick, nice!
[14:44:11] <isaranto>	 I'm gonna keep some notes, I think I faced the same or similar issue 1-2 months ago
[14:45:04] <isaranto>	 `Fool me once, shame on you. Fool me twice, shame on me`
[14:46:24] <klausman>	 The commented-out registry does mean that everything that the image needs for building needs to already be local, unless it's on the default registry (i.e. not the WMF one)
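The build behaviour described in this exchange can be summarised in a toy Python model. This is only a sketch of what was said in the channel, not docker-pkg's actual code: `needs_build` and its parameters are hypothetical names, and the assumption is that an image is skipped whenever the version from its changelog already exists locally or in the configured registry.

```python
def needs_build(name, changelog_version, local_tags, registry_tags=None):
    """Decide whether an image should be built (illustrative model only).

    local_tags:    set of 'name:version' strings present on this machine.
    registry_tags: set of 'name:version' strings already published, or
                   None to model a config.yaml with the registry
                   commented out (no published images are visible).
    """
    tag = f"{name}:{changelog_version}"
    if tag in local_tags:
        return False  # already built locally
    if registry_tags is not None and tag in registry_tags:
        return False  # seen as already published, so skipped
    return True
```

Under these assumptions, an unchanged image whose version is already published is never rebuilt, which is why a dependency such as amd-pytorch-common was skipped. Bumping the changelog produces a new tag that is absent everywhere, and hiding the registry makes the published tag invisible, so either route triggers a build.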
[14:57:39] <wikibugs>	 Machine-Learning-Team: Simplify dependencies in hf image - https://phabricator.wikimedia.org/T369359#9956896 (isarantopoulos) I'm trying the above change in a new version of base pytorch image (`docker-registry.wikimedia.org/amd-pytorch23:2.3.0rocm6.0-3`) and then use that one for the huggingface image. I'll...
[15:07:36] <wikibugs>	 (CR) Jdlrobson: [C:+1] Remove a duplication of selectors in ext.ores.highlighter [extensions/ORES] - https://gerrit.wikimedia.org/r/1052157 (owner: Ebrahim)
[16:14:07] <isaranto>	 a, I see some nvidia packages installed again 
[16:14:07] <isaranto>	 pff
[16:14:24] <isaranto>	 anyway, will continue, I think this direction will work
[16:14:33] <isaranto>	 logging off folks, have a nice weekend!