[07:56:12] <_joe_> I'm having problems connecting to our apt repo from home
[07:57:14] <_joe_> nevermind I can connect again
[07:58:11] I saw the same with some of our other websites and shells for a minute or so
[08:40:48] jelto, mutante o/ I see an alert for "Puppet CA certificate phabricator.discovery.wmnet is about to expire", but we are already using cfssl so I think the old cert just needs to be cleaned up. Pinging you to get confirmation; not urgent, whenever you have time :)
[08:41:50] uh yes, thanks for spotting this. Let me take a look
[09:53:01] The old non-cfssl phabricator cert which was removed in https://gerrit.wikimedia.org/r/c/operations/puppet/+/1021477 expires on Aug 7 2024. The new cfssl certificate expires on Jul 28 2024.
[09:53:01] The Prometheus metric (which the alert is based on) puppet_ca_cert_expiry{subject="phabricator.discovery.wmnet"} also reports Aug 7 as the expiry date. So we have to remove the old certificate from $somewhere.
[09:53:01] But I was not able to find where to delete the certificate on the puppetmaster.
[10:30:53] <_joe_> jelto: I'm having problems with gitlab ci
[10:31:07] <_joe_> specifically kokkuri's pre stage seems stuck waiting for a pod
[10:31:16] <_joe_> ContainersNotReady: "containers with unready status: [build helper istio-proxy]"
[10:36:09] this is happening on the cloud runners RelEng is maintaining. Let me check what's going on there
[10:36:09] One workaround would be to add tags: [wmcs] to the job to make sure it's scheduled on WMCS and not on Kubernetes
[10:37:45] what's the current support status of profile::pki::get_cert() in WMCS environments?
[10:38:14] do we need to deploy a PKI instance in our cloud environment if we have puppetization that depends on profile::pki::get_cert()?
[10:39:03] <_joe_> I would guess so, pontoon should make it manageable I think?
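[Editor's note] Telling the old non-cfssl cert apart from the new cfssl one on disk can be done with openssl by printing each file's subject and notAfter date. A minimal sketch: the file paths are illustrative (the real location on the puppetmaster is exactly what was still being searched for above), so this demo generates a throwaway self-signed cert just to show the inspection commands.

```shell
# Generate a throwaway self-signed cert as a stand-in for the real file
# (illustrative only; on the puppetmaster you would point -in at the
# actual phabricator.discovery.wmnet certificate files).
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -keyout /tmp/demo.key -out /tmp/demo.crt \
  -subj "/CN=phabricator.discovery.wmnet" 2>/dev/null

# Print who the cert is for and when it expires; comparing notAfter
# (Aug 7 vs Jul 28) identifies which file is the stale one.
openssl x509 -noout -subject -enddate -in /tmp/demo.crt
```

Running the expiry check against both candidate files and matching the dates against the Prometheus metric is a quick way to confirm which copy the alert is actually seeing.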
[10:41:07] vgutierrez: there is a pki project on wmcs, with various instances, not sure about the details though
[10:41:31] <_joe_> jelto: so just add tags: [wmcs] to my pipeline definition in .gitlab-ci.yml?
[10:42:20] yes, that should force scheduling the job on WMCS. I'll look into the Kubernetes cluster, reach out to RelEng, and see what's going on there
[10:42:40] volans: yeah.. I've seen that, just asking because right now profile::cache::purge unconditionally uses pki::get_cert()
[10:43:28] volans: but I don't know if the idea is that the traffic WMCS project uses the PKI instance from the pki project or if we should deploy our own
[10:44:49] vgutierrez: sorry, don't know, 301 to I/F
[10:46:46] volans: ack, asking there, thx <3
[10:47:14] <_joe_> jelto: it got unstuck now, but yeah it seems to be a tad slow
[10:48:31] vgutierrez: o/ deployment-prep uses the intermediate CAs from the project that vol*ans mentioned, but there is a catch: on the target host where you run get_cert(), you need to be able to use the right secret-id in the cfssl client.
[10:48:57] In deployment-prep I added the config to its private repo (as a real private commit, local to the puppet master only)
[10:50:10] elukey: can you point me to a commit ID?
[10:51:09] (need to run an errand, will do in a bit, sorry)
[10:51:40] no problem
[11:14:41] (answered in pvt)
[13:08:05] does anyone know what options are valid for `cfssl_label` in envoy.pp? (ref https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/profile/manifests/tlsproxy/envoy.pp#82 ). I need a cert for an internal service, so I'm guessing I want 'discovery'?
[13:08:34] err..."values" not "options"
[13:11:23] the doc says:
[13:11:23] @param cfssl_label if using cfssl this parameter is mandatory and should specify the CA label sign CSR's
[13:12:32] yeah, I'm having trouble parsing that comment. I take it to mean "specify the CA that would sign your service's CSR"?
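[Editor's note] The tags: [wmcs] workaround confirmed above goes in the job definition in .gitlab-ci.yml. A sketch under assumptions: the job name, stage, and script below are made up; only the `tags:` line is the actual workaround discussed.

```yaml
# Hypothetical job; only the `tags:` line is the workaround from the log.
build-image:
  stage: build
  tags: [wmcs]   # force scheduling on the WMCS cloud runners, not Kubernetes
  script:
    - echo "runs on a WMCS runner"
```

GitLab matches a job's `tags:` list against the tags registered on each runner, so this pins the job to runners carrying the wmcs tag.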
[13:20:20] grepping thru puppet I see 'cfssl_label' set to 'discovery', 'debmonitor', and 'cloud_wmnet_ca'. And the issuer CN for a working cert from wdqs2022 is 'discovery'
[13:20:31] I don't think we specified that anywhere, so maybe it's actually implicit
[14:16:26] arnaudb: hi! the change in the prometheus mariadb exporter broke it for toolsdb
[14:16:45] arnaudb: I checked and we are using the latest version available on bullseye; the problem is that it does not like the new cli flags
[14:16:56] T369722
[14:16:57] T369722: [toolsdb] 2024-07-10 ToolsToolsDBWritableState alert page - prometheus-mysql-exporter down - https://phabricator.wikimedia.org/T369722
[14:17:33] (talking about https://gerrit.wikimedia.org/r/c/operations/puppet/+/1048006)
[14:20:19] dcaro: ack
[14:20:39] hmm, I think backports have a low priority on those nodes
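[Editor's note] On the exporter breakage above ("the installed version does not like the new cli flags"): one way to catch this class of problem before an alert fires is to check the daemon's --help output for each flag puppet is about to pass. A hedged sketch: the flag names and the canned help text below are illustrative, not the actual flags from the gerrit change; in real use you would capture `prometheus-mysqld-exporter --help 2>&1` on the target host.

```shell
# Report whether a given flag appears in a daemon's help text.
check_flag() {
  local binary_help="$1" flag="$2"
  # `--` stops grep option parsing so patterns starting with '-' work.
  if printf '%s' "$binary_help" | grep -q -- "$flag"; then
    echo "supported: $flag"
  else
    echo "missing: $flag"
  fi
}

# Canned help text standing in for `prometheus-mysqld-exporter --help 2>&1`;
# these flag names are illustrative only.
help='--collect.global_status --collect.slave_status'
check_flag "$help" '--collect.global_status'        # supported
check_flag "$help" '--collect.engine_innodb_status' # missing on this version
```

A check like this in CI (or a puppet precondition) would surface "binary too old for the new flags" on bullseye hosts before the unit fails to start.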