[08:51:44] `#16 pushing manifest for docker-registry.discovery.wmnet/repos/data_persistence/zarcillo:devel@sha256:54ca238129d2e76d96b469349cd2df8e16c35f4abf5b172a6be391cfd50dcae7 0.6s done` sounds promising, yet https://docker-registry.wikimedia.org/ is not being updated immediately, it seems
[08:53:14] yes yes the dashboard updates every 30 mins IIRC
[08:54:02] that :devel does not sound good tho
[08:54:12] still see Failed to pull image "docker-registry.discovery.wmnet/wikimedia/zarcillo:2025-03-25-091801-production"
[08:56:15] so possibly there is another tag, like the devel one
[08:57:29] "devel" is the branch name, most likely something automagically detecting the branch and using that instead of "production"
[08:59:57] but it also depends what tag is getting pushed
[09:00:02] checking
[09:00:26] perhaps it wants a git tag?
[09:02:46] so the tags can be seen via curl at https://docker-registry.wikimedia.org/v2/repos/data_persistence/zarcillo/tags/list
[09:03:01] and the only one available seems to be "devel"
[09:03:28] yes, I think it is being autodetected from the branch name by something
[09:03:48] yep yep
[09:03:53] (`docker pull docker-registry.wikimedia.org/repos/data_persistence/zarcillo:devel` works fine)
[09:04:14] would kokkuri need a git tag like "2025-03-25-091801-production"?
[09:04:59] also it seems to push to a path that contains /repos/data_persistence/ while k8s is looking for /wikimedia/
[09:05:13] that too yes
[09:05:30] but what k8s is looking for is established via the values.yaml in deployment-charts
[09:06:02] I'm not sure which side of the documentation/examples is incorrect
[09:06:32] but we can keep the k8s side as it is
[09:08:36] the /repos/data_persistence bit is a convention, if you build images via gitlab
[09:08:41] so that will likely need to be changed
[09:09:00] for the final tag there is surely a way to customize it
[09:10:21] federico3: https://gitlab.wikimedia.org/repos/data-engineering/sync-utils/-/blob/main/.gitlab-ci.yml?ref_type=heads#L29
[09:10:36] this is an example
[09:10:46] should be easy to add to zarcillo's config
[09:10:52] IIRC it's the other way around. We keep the string that kokkuri generates, e.g. "repos/sre/data-gateway" and configure on the k8s (i.e. helm values) side.
[09:10:54] thanks, trying
[09:11:14] that allows us to vary between gerrit and gitlab while keeping the configuration on the helm side.
[09:11:19] +1 yes
[09:11:27] even if there is PUBLISH_IMAGE_NAME etc..
[09:11:29] and yes, it's convention on both sides, to reflect the way the repos as structured
[09:11:37] s/as/are/
[09:12:09] so the doc on the k8s side is incorrect? ok
[09:12:39] what specific docs are you referring to? We can check and review them in case
[09:13:52] the main benefit of having the /repos/etc.. in the docker image name is that you can go directly to its gitlab repo without researching how it was built
[09:25:19] https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1138688 something like this perhaps?
[09:26:28] sure, looks good for a first go. ofc down the line we'd want the image tag to be an immutable thing, not "devel". But this is ok for a start
[09:31:36] yes, for the time being I'm happier to have it correctly describing that it's under development rather than calling it production
[09:33:30] ok, k8s git-pulled the repo, running apply...
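The path and tag mismatch discussed above (pushed under `repos/data_persistence/` with tag `devel`, pulled from `wikimedia/` with a dated production tag) can be checked mechanically. What follows is a minimal Python sketch, not tooling from the conversation: the helper names `split_image_ref`, `tags_list_url` and `describe_mismatch` are hypothetical, the URL layout simply mirrors the `tags/list` address quoted in the chat, and no network access is made.

```python
def split_image_ref(ref: str) -> tuple[str, str, str]:
    """Split 'host/path/name:tag[@digest]' into (host, repository, tag)."""
    name = ref.split("@", 1)[0]            # drop an optional '@sha256:...' digest
    host, _, rest = name.partition("/")    # first path component is the registry host
    repository, sep, tag = rest.rpartition(":")
    if not sep:                            # no ':tag' given, Docker defaults to 'latest'
        repository, tag = rest, "latest"
    return host, repository, tag


def tags_list_url(public_host: str, repository: str) -> str:
    """Build the tag-listing URL in the same shape as the one quoted in the chat."""
    return f"https://{public_host}/v2/{repository}/tags/list"


def describe_mismatch(pushed: str, wanted: str) -> list[str]:
    """List the ways the pushed reference differs from the one k8s wants."""
    _, pushed_repo, pushed_tag = split_image_ref(pushed)
    _, wanted_repo, wanted_tag = split_image_ref(wanted)
    problems = []
    if pushed_repo != wanted_repo:
        problems.append(f"repository: pushed {pushed_repo!r}, wanted {wanted_repo!r}")
    if pushed_tag != wanted_tag:
        problems.append(f"tag: pushed {pushed_tag!r}, wanted {wanted_tag!r}")
    return problems


# Both references are copied from the messages above.
PUSHED = "docker-registry.discovery.wmnet/repos/data_persistence/zarcillo:devel"
WANTED = "docker-registry.discovery.wmnet/wikimedia/zarcillo:2025-03-25-091801-production"

if __name__ == "__main__":
    for line in describe_mismatch(PUSHED, WANTED):
        print(line)
    print(tags_list_url("docker-registry.wikimedia.org", split_image_ref(PUSHED)[1]))
```

Run against the two references from the log, this reports both a repository and a tag difference, which matches the two separate problems raised in the chat.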
[09:46:17] still nothing; I'm trying to find logs from logstash -> Discover
[09:57:45] federico3: Warning Unhealthy 110s kubelet Readiness probe failed: Get "http://10.67.82.139:8080/healtz": dial tcp 10.67.82.139:8080: connect: connection refused
[09:57:56] I found it directly via kubectl describe pod
[09:58:23] (I have an interview but I can help later if needed)
[09:59:12] I found that using `k8s_event.involvedObject.namespace: zarcillo` but I don't see logs from the container itself
[11:03:31] back
[11:03:38] didn't find any when I checked as well
[11:06:39] federico3: one thing that I noticed is that "mesh" config for zarcillo is false, not sure if it is intended or not
[11:07:25] for the python-webapp chart we usually enable it, using 8080 as the port that it proxies to (mesh basically deploys an envoy proxy as a sidecar)
[11:07:46] and by default all the network policies are taken care of (ingress/egress where needed)
[11:09:16] the other bit is also to know if /healtz is an endpoint that works etc..
[12:14:59] the fact that I'm not seeing any logs makes me wonder if the python application is being started at all
[13:15:30] could be a problem yes, if it is supposed to print something
[13:15:57] I'm making some changes and a big cleanup in the repo and adding integ tests
[13:15:58] have you tried `docker run` locally with the image built?
[13:16:17] you mean on my dev workstation?
[13:16:23] exactly yes
[13:16:32] with what dockerfile?
[13:16:32] with the image pulled from the docker registry
[13:16:37] ah
[13:17:31] currently I'm trying to fix the kokkuri build - do you know if it breaks without a requirements.txt?
[13:17:41] no idea
[13:18:00] but running the image should tell you what happens
[13:18:32] I added a Containerfile for local build+test and it was starting
[13:20:30] $ docker run docker-registry.wikimedia.org/repos/data_persistence/zarcillo:devel
[13:20:33] python3: can't open file '/srv/app/index.py': [Errno 2] No such file or directory
[13:20:42] runuser@1c3addbe9201:/srv/app$ ls
[13:20:43] README.adoc pyproject.toml python requirements.txt
[13:21:17] could it be missing the index.py?
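One way to follow up on that last question is to search the image's working directory for the file the entrypoint expects. This is a minimal Python sketch, assuming it is run inside the container with `/srv/app` as the root. The helper name `locate` is hypothetical, and nothing in the log confirms where `index.py` actually lives, or whether it exists at all.

```python
from pathlib import Path


def locate(root: str, filename: str) -> list[str]:
    """Return paths (relative to root) of every regular file named filename."""
    base = Path(root)
    if not base.is_dir():                  # a missing root simply yields no hits
        return []
    return sorted(
        p.relative_to(base).as_posix()
        for p in base.rglob(filename)      # recursive search below root
        if p.is_file()
    )


if __name__ == "__main__":
    # '/srv/app' and 'index.py' are taken from the error message in the log.
    hits = locate("/srv/app", "index.py")
    print(hits or "index.py not found under /srv/app")
```

An empty result would suggest the file was never copied into the image. A hit in a subdirectory would instead point at an entrypoint path that does not match the repository layout.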