[13:52:28] qq: I'm getting a 503 response from envoy with response_code:NC and response_details: cluster_not_found (https://logstash.wikimedia.org/app/dashboards#/view/138271f0-40ce-11ed-bb3e-0bc9ce387d88?_g=h@89714be&_a=h@6468426). I _think_ this is coming from the k8s cluster ingress gateway, but I'm not 100% sure. Is there a way to find out what's happening there? Thanks!
[13:59:26] using `istioctl-1.15.7 proxy-config cluster -n istio-system istio-ingressgateway-2sw4j`, I'm indeed able to see that no cluster is listed for my services. I'll try to understand why
[14:02:30] brouberol: o/ I think the link is not a permalink (see top right share -> etc..), I get to the main istio dashboard if I follow it
[14:03:49] oops, let me find a better link
[14:04:24] https://logstash.wikimedia.org/app/dashboards#/view/138271f0-40ce-11ed-bb3e-0bc9ce387d88?_g=(filters%3A!()%2CrefreshInterval%3A(pause%3A!t%2Cvalue%3A0)%2Ctime%3A(from%3Anow-15m%2Cto%3Anow))
[14:06:41] the nice thing here is that I have two separate helmfiles deploying two different services based on the same chart, and each displays the same issue, so it's probably something I did in the chart. It's consistent, at least
[14:07:34] same issue :)
[14:07:38] did you use a permalink?
[14:08:18] that's weird, I did
[14:08:37] Share > Permalink > Saved object
[14:08:51] ah, ok, it needed to be Snapshot
[14:09:09] https://logstash.wikimedia.org/goto/91f68d24ee44c8274d91a848be7f77e2
[14:19:50] it works yes, not sure what the issue is though
[14:20:17] the log shows that the pod emitting it is an ingress gw
[14:22:35] I'm like you here. It seems somehow istio didn't catch up on the changes in terms of virtualservice, gateways and destinationrules, and didn't regenerate the ingress gateway envoy config
[14:22:46] but I'm not sure where to go from there
[14:23:35] actually, there might be details in the istiod logs?
[14:25:28] there is no service in the spark-history-test namespace
[14:26:07] the cluster is built off of endpoints (e.g. pods) of the specified service. No service, no endpoints, no cluster :)
[14:26:29] which I think is expected? We only create a service when we're not using the mesh. We have a virtualservice for the envoy sidecar, I'm not sure if that counts?
[14:26:41] that's not correct
[14:26:47] a service is created in any case
[14:27:28] the spark-history-analytics-test-hadoop explicitly points to it, see kubectl -n spark-history-test get virtualservices.networking.istio.io -o yaml
[14:28:48] thanks, let me check
[14:30:00] I'd assume "mesh" was not selected as a module at chart creation
[14:31:00] ok thanks, I found what the issue was! You're correct, we were missing the service due to a misconfiguration in the values overrides
[14:32:51] I think you're also missing it because the mesh.service is not included in charts/proper_spark-history/templates/service.yaml
[14:33:10] that would have been the case if the module had been selected during chart creation
[14:33:18] there might be other things missing
[14:34:01] charts/spark-history/templates/service.yaml that is :)
[14:35:30] indeed, that is exactly what was missing: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/983422
[14:35:46] as said...there might be more
[14:36:35] might be worth it to create a new chart with the required modules and merge the spark-history changes on top of that to be sure not to miss anything
[14:36:41] I created a new dummy chart with generic-app, ingress and service-mesh and diffed the templates and vendored macros, and that's what stood out
[14:36:53] ack, okay
[14:38:03] brouberol: you need a chart version bump for that change to take effect
[14:38:49] {{done}} thank you
[14:46:38] much obliged, everything is working perfectly now
[14:54:27] nice
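
The "no service, no endpoints, no cluster" reasoning above can be sketched as a small check: compare the hosts that the VirtualService routes point at with the hosts the gateway's Envoy has clusters for. This is only a toy model, not how Istio computes its config. It assumes Istio's usual `direction|port|subset|host` cluster naming, and the helper names, the sample cluster list, and the `spark-history` service host are all hypothetical.

```python
"""Toy check: which VirtualService destination hosts have no Envoy cluster?"""


def cluster_hosts(cluster_names):
    """Extract hosts from Istio-style names: direction|port|subset|host."""
    hosts = set()
    for name in cluster_names:
        parts = name.split("|")
        # Names without four '|'-separated fields (e.g. BlackHoleCluster)
        # carry no service host, so they are skipped.
        if len(parts) == 4:
            hosts.add(parts[3])
    return hosts


def missing_clusters(destination_hosts, cluster_names):
    """Return destination hosts that no cluster serves, sorted by name."""
    known = cluster_hosts(cluster_names)
    return sorted(h for h in destination_hosts if h not in known)


# Hypothetical sample data, not taken from the real cluster.
clusters = [
    "outbound|8080||other-app.other-ns.svc.cluster.local",
    "BlackHoleCluster",
]
destinations = [
    "other-app.other-ns.svc.cluster.local",
    "spark-history.spark-history-test.svc.cluster.local",
]

# Any host listed here would be answered with cluster_not_found.
print(missing_clusters(destinations, clusters))
```

In practice the cluster names would come from the `istioctl proxy-config cluster` output for the ingress gateway pod, and the destination hosts from the `kubectl ... get virtualservices.networking.istio.io -o yaml` output quoted above.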