[06:22:33] <_joe_> jhathaway: what do you need to do?
[06:27:03] <_joe_> Also, hi everyone who manages a k8s cluster at the WMF! I think it's time we work on our tooling to make creating new charts more flexible - I don't think the scaffolding needs to be the same for a machine-learning model running in kubeflow and a server-side web frontend service. https://phabricator.wikimedia.org/T292818 was the original task. My current implementation proposal
[06:27:06] <_joe_> https://gitlab.wikimedia.org/repos/sre/sextant/-/blob/scaffold/README.md#create-a-new-chart-from-scaffolding-models would welcome your comments
[08:05:33] _joe_ o/ Ilias (in my team) is looking into packaging a simple FastAPI Python app (the ORES legacy service that calls Lift Wing behind the scenes); we can probably be your testers for sextant
[08:06:09] <_joe_> elukey: the patch is still not merged though :)
[08:06:17] (it will be something not managed by the istio/knative/kserve mess ehm wonderful stack)
[08:06:22] <_joe_> I had some doubts about its flexibility
[08:06:56] jayme: elukey: should we remove 1.16 version checks from kubeconform? jhathaway will probably appreciate it ;)
[08:07:02] we can try it somewhere and see how Ilias feels about it, to have non-SRE feedback as well
[08:07:10] akosiaris: +100
[08:07:36] * elukey imagines when Alex will say the same for 1.23 a couple of months from now
[08:14:11] * elukey sees also Janis running away screaming after what I just said
[08:14:17] lol
[08:18:36] <_joe_> akosiaris: yeah, although jesse has added support for partial version testing in kubeconform CI tests
[08:19:11] <_joe_> which is a good thing - I might just at some point add a "minimum accepted version" so that we don't allow people to leave stuff out
[08:19:27] <_joe_> Also, at some point we should discuss the repository structure for deployment-charts
[08:19:54] <_joe_> does it make sense to keep all charts in one directory? or should we create a per-cluster directory for charts/deployments/etc?
[08:20:46] <_joe_> my point being: please people, challenge assumptions. We created that repository when a) we had zero kubernetes clusters in production, instead of 9, and mostly we had no idea what we were doing
[08:21:08] <_joe_> b) Even when we know what we're doing, we still get things wrong
[09:30:30] akosiaris: yeah, we can probably drop support for 1.16 now
[09:31:55] I haven't looked closely at the partial version CR yet, unfortunately, but I think it's a nice addition and can probably bring down CI runtime as well - if we only check particular versions
[09:37:41] <_joe_> ah you mean "reduce"
[09:37:56] <_joe_> I read your sentence as "bring down CI" :D
[09:37:58] lol, yeah. sorry
[09:38:10] <_joe_> I was very confused :D
[09:38:16] btullis: how did spark go? Unfortunately my bouncer caught fire yesterday - I've no backscroll :/
[10:16:26] also would like to know.... :)
[10:36:25] Oh, it went pretty well, thanks. I think we're not far away from having worked out pi to 2 decimal places :-)
[10:37:22] The main gotchas with the deployment were to do with deploying the changes to admin_ng/rbac-rules and admin_ng/namespaces
[10:39:26] One was related to the fact that we had switched the sparkoperator to a system namespace, so it had some trouble when a quota resource had been taken away
[10:41:15] The second was related to the existence of a deploy system account in the spark namespace that couldn't change its RoleBinding.
[10:42:53] So the spark-operator is now running, but there is an issue with its container in that command arguments aren't getting passed through the entrypoint.sh properly to the `tini` command line. nfraison is looking at that now.
[10:43:43] I've noticed a slight botch-up on my part in the chart, whereby the image version was specified incorrectly and therefore a slightly old version has been deployed. I'm fixing that as we speak.
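[Editor's note: the per-version validation being discussed can be sketched roughly as below. The version list (1.16, 1.23) comes from this conversation; the chart path, the loop, and piping `helm template` output into kubeconform are illustrative guesses at the CI setup, not WMF's actual pipeline. The `-strict`, `-kubernetes-version`, and `-summary` flags are from kubeconform's CLI.]

```shell
# Render a chart and validate the manifests against each Kubernetes
# version still supported in CI; dropping 1.16 means removing it from
# this list. Paths and versions are illustrative.
for ver in 1.16.15 1.23.6; do
  helm template charts/my-chart \
    | kubeconform -strict -summary -kubernetes-version "$ver"
done
```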
[10:44:06] btullis: how did you fix the namespace change?
[10:45:45] If memory serves correctly, it was `kubectl delete RoleBinding -n spark deploy` followed by a normal `helmfile -e dse-k8s-eqiad -l name-namespaces sync`
[10:46:47] correction `name=namespaces`
[10:50:47] Here was the error that had us stumped for a while: https://phabricator.wikimedia.org/T318926#8676771 - That `kubectl delete` operation unblocked us.
[10:51:18] ack
[11:32:03] Hello, I hope this is useful. I fixed up my old k8s app logs opensearch (logstash) dashboard: https://logstash.wikimedia.org/app/dashboards#/view/7f883390-fe76-11ea-b848-090a7444f26c
[11:32:20] you can select k8s cluster and namespace (and app label if necessary) and see logs
[12:49:33] Nice, thanks ottomata :)
[17:03:04] are secrets only available inside of chart templates? The jaeger chart wants a secret supplied as a value or an existing secret ref, but neither seems possible with our setup, or am I missing something?
[17:04:29] They are stored as objects in the k8s API, so if you know the name you can use it from any chart
[17:04:45] But maybe I am misunderstanding the question
[17:05:34] We can supply a secret as a chart value btw, that is totally doable
[17:05:42] akosiaris: secrets we add to private repo?
[17:06:29] Ah, now that is more involved, but yes we support that too
[17:06:37] this is the broken code in question, https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/888761/10/helmfile.d/admin_ng/values/aux-k8s-eqiad/jaeger-values.yaml
[17:07:08] Is there another way to inject secrets, other than the private repo?
[17:07:54] the jaeger template takes a password value or a reference to a kubernetes secret
[17:10:03] Yeah you should use the private repo for that.
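[Editor's note: with the in-chat correction applied, the unblocking sequence described above amounts to the two commands below. The namespace, RoleBinding name, and helmfile environment/selector are taken verbatim from the conversation and are specific to this incident.]

```shell
# Delete the RoleBinding the deploy system account couldn't modify,
# then re-sync the namespaces release for the dse-k8s-eqiad environment.
kubectl delete rolebinding deploy -n spark
helmfile -e dse-k8s-eqiad -l name=namespaces sync
```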
It can be used to populate stanzas on the deploy host under an /etc/helmfile hierarchy, and helmfile will feed those to helm charts
[17:11:44] thanks, I got that far, but then I couldn't see a way to feed a value from a secret to a chart
[17:12:04] all the existing charts seem to look up secrets in their templates
[17:12:44] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/admin_ng/cert-manager/helmfile.yaml#57
[17:14:14] Then they are just typical helm chart values
[17:15:56] hmm, ok, so instead of templating a value, I would put the key and value in the private repo, storage.elasticsearch.password: 'foo'?
[17:17:10] May I try to answer in an hour or so? I've got 2 kids crying right now
[17:17:59] sorry, of course, go take care of your children!
[20:40:39] jhathaway: and just managed to put them to sleep.
[20:40:41] So, https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/896177
[20:41:07] That's the structure you need to put in /srv/private/hieradata/role/common/deployment_server/kubernetes.yaml on puppetmaster1001
[20:41:28] I see you already have config.private.logs_api_password in there, so it should be an easy change
[20:43:33] that way you will populate the jaeger-elasticsearch Secret k8s resource that the chart has with the correct password. Then e.g. the query-deploy.yaml template will populate the deployment with an env var named ES_PASSWORD referencing that value
[20:44:12] hope that helps
[20:46:03] btw, general rule chart-wise: you cannot put {{ .foo.bar.baz }} syntax in values.yaml files. Only files under the templates/ directory in the chart support that syntax. helmfile.yaml files also support that syntax, but referenced values files don't. Referenced helmfile.yaml files do.
[20:47:02] Thanks for the detailed explanation akosiaris
[20:59:03] yw. I just hope I helped. For what it's worth, you were already there; that private repo is pretty confusing though. Takes a while to get used to it.
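[Editor's note: a hedged sketch of the secret-plumbing chain akosiaris describes. The `ES_PASSWORD` env var name, the `jaeger-elasticsearch` Secret name, and the hieradata file path come from the conversation; the exact hiera key structure and the Secret's `key` are hypothetical placeholders - the real layout is in the linked Gerrit change.]

```yaml
# /srv/private/hieradata/role/common/deployment_server/kubernetes.yaml
# (private puppet repo). Plain values only - no {{ }} templating here.
# The nesting below is illustrative, not the actual WMF key structure.
jaeger:
  storage:
    elasticsearch:
      password: 'foo'
---
# In the chart, a template under templates/ renders the value into the
# jaeger-elasticsearch Secret, and the query deployment template then
# references it with a standard Kubernetes secretKeyRef:
env:
  - name: ES_PASSWORD
    valueFrom:
      secretKeyRef:
        name: jaeger-elasticsearch
        key: password   # key name assumed for illustration
```

Because the password only ever appears as a regular chart value, the chart itself stays free of any private-repo specifics.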