[14:32:36] jayme: I've seen that helmfile diff --detailed-exitcode exits with 0 if there's no diff and >0 if there is one. We could leverage this to alert on the status of a systemd timer that checks for admin_ng pending changes
[14:33:22] we could have one such script per k8s cluster, and map each k8s cluster to an IRC channel to alert in, or something like that. WDYT?
[15:31:40] brouberol: janis is on PTO this week
[15:32:50] what does "pending" changes mean in this context though?
[15:36:47] I think he was asking about the case of live-on-cluster not matching what's committed at HEAD
[15:49:15] ah
[15:49:33] so diffing every now and then all releases for admin_ng
[15:49:58] hmmm, it's not a bad idea
[15:55:22] I wonder if that's the same thing flux & argocd do for their gitops reconciliation flow
[15:57:59] I may reuse this systemd timer idea actually, for toolforge k8s
[16:41:19] I meant changes merged to admin_ng but undeployed to the cluster, yes
[17:46:57] akosiaris: the context is https://phabricator.wikimedia.org/T331894. When the infrastructure behind a given cluster/service changes (e.g. a refresh of kafka brokers), the IPs will change, which will ultimately be reflected as a diff in admin_ng. If we don't apply this diff in a timely manner, then we could open ourselves up to some outages
[17:47:17] either because we don't have the right network policies in place, or because the service resolves to old/unused IPs
[17:47:46] it was discussed in a SIG meeting that we could/should introduce a mechanism that would alert on pending diffs, basically
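
For illustration only, here is a rough sketch of the check discussed above: a small script that a systemd timer could run per cluster. It is not an existing tool. The function names, the environment argument and the working directory are all hypothetical. It also assumes that `helmfile diff --detailed-exitcode` returns 2 when a diff is present and some other non-zero code when the command itself fails, which would be worth confirming against the helmfile version in use.

```python
import subprocess

# Assumed exit-code meanings for `helmfile diff --detailed-exitcode`:
# 0 = no diff, 2 = diff present, anything else = the diff command failed.
OK, PENDING, ERROR = "ok", "pending", "error"


def classify(returncode: int) -> str:
    """Map a helmfile exit code to a status string."""
    if returncode == 0:
        return OK
    if returncode == 2:
        return PENDING
    return ERROR


def check_pending(environment: str, workdir: str, run=subprocess.run) -> str:
    """Run `helmfile diff` for one environment and classify the result.

    `environment` and `workdir` are hypothetical inputs: the helmfile
    environment for the cluster and the directory holding its helmfile.
    `run` is injectable so the logic can be exercised without helmfile.
    """
    cmd = ["helmfile", "-e", environment, "diff", "--detailed-exitcode"]
    result = run(cmd, cwd=workdir, capture_output=True, text=True)
    return classify(result.returncode)


def main(argv: list[str]) -> int:
    """Return 0 when nothing is pending, 1 otherwise (or 64 on bad usage)."""
    if len(argv) != 2:
        print("usage: <script> <environment> <workdir>")
        return 64
    status = check_pending(argv[0], argv[1])
    print(f"{argv[0]}: {status}")
    # A non-zero return lets the systemd unit end up in a failed state,
    # which is the thing an alert would watch.
    return 0 if status == OK else 1
```

A service unit triggered by the timer could call `sys.exit(main(sys.argv[1:]))`. Keeping "pending" and "error" as separate statuses would let an alert distinguish an undeployed change from a broken check, even though both fail the unit here.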