[08:00:54] I'm seeking a quick +1 on https://gerrit.wikimedia.org/r/c/operations/puppet/+/946511 (alert hosts short on root filesystem space)
[08:02:36] <_joe_> godog: on it
[08:03:31] thank you _joe_
[08:05:19] also TIL 'docker system df'
[08:06:07] Adding a change as well :) https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/946510
[08:06:31] I'd need to start my week with a mw deployment if people agree, we are breaking RCfilters for cswiki users
[08:06:53] <_joe_> please go
[08:07:17] thanks :)
[08:14:43] _joe_ to double check (if you have a moment) - scap backport told me about https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/945883, I guess it is ready to go but I'd like to double check
[08:15:01] TheresNoTime: --^
[08:15:16] <_joe_> yeah I'd ask TheresNoTime :)
[08:15:42] elukey: please continue, I didn't mean to +2 that and won't be able to properly deploy it for a few more minutes
[08:15:49] It's safe to scap
[08:15:54] super thanks!
[08:27:51] <_joe_> TheresNoTime: wait, I think elukey will deploy your patch too
[08:28:00] <_joe_> err that's too late is it
[08:28:33] _joe_ IIUC it was safe for scap
[08:28:48] I've read it as "yep ok to be deployed"
[08:28:51] Yep, I was going to finish the deploy around now
[08:29:04] But you saved me a job :D
[08:29:10] :D
[08:29:16] almost done
[08:31:35] {{done}}
[08:33:10] (ack from my side, thank you!)
[13:29:55] hello folks
[13:30:04] I am going to roll out another change for mw: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/946546
[13:32:41] jayme, jynus --^
[13:33:04] it is to revert the ORES extension to its prev behavior (namely calling ores sigh)
[13:33:10] we are impacting some users atm
[13:33:35] I think there is a deployment ongoing
[13:33:42] but no blocker from me
[13:34:09] (maybe it is your deployment :-D)
[13:34:22] nono I have to start
[13:34:24] mmmm
[13:34:32] how should I check?
I saw one but it finished
[13:35:16] jynus: -^
[13:35:26] (just to avoid starting multiple things)
[13:35:41] ask urbanecm who's currently deploying stuff?
[13:35:56] Ah well, sorry for the ping :')
[13:36:26] already doing it in #operations :)
[13:36:57] yep
[13:37:02] he is coordinating it
[16:07:36] I've got a ticket for TLS cert expiring https://phabricator.wikimedia.org/T343319 ... trying to figure out how to check the actual cert in production, curling `search.svc.codfw.wmnet` doesn't seem to work anywhere. The cert in the puppet repo doesn't match the expiration date on the alert
[16:07:52] LMK if anyone has suggestions
[16:08:27] Have you tried `openssl s_client -connect hostname:port` ?
[16:08:35] Have you checked the cert in the puppet ca ? https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/PuppetStaleCertificates
[16:08:47] rgh wrong page
[16:09:18] btullis yeah, no response...tried from bastions, cumin, search servers
[16:11:23] claime I have not... is this the right link? https://wikitech.wikimedia.org/wiki/Cergen#Usage_on_Puppet_CA_host
[16:12:19] If that particular cert is defined there yes, I think so
[16:14:02] interesting. The "correct" cert, by which I mean the same cert in the puppet repo that has an expiration date in 2027, is present at `puppetmaster1001:/srv/private/modules/secret/secrets/certificates/search.discovery.wmnet`
[16:14:57] isn't this the cert for search.discovery.wmnet that answers on port 9243?
[16:15:11] So I'm still a bit confused as to where prometheus is getting the expiration date
[16:15:33] volans ah, good catch. Thanks!
[16:17:53] I can connect to the host on port 9243, but I get the "correct" cert...hmm
[16:18:54] I guess it's time to start digging into the prometheus config? Unless someone else has an idea
[16:21:14] inflatador: Did you check the linked runbook ?
https://wikitech.wikimedia.org/wiki/Puppet#Renew_agent_certificate
[16:21:33] You should be able to find where the cert is, it looks like it's an old puppet ca cert imo
[16:23:08] root@puppetmaster1001:~# openssl x509 -enddate -noout -in /var/lib/puppet/server/ssl/ca/signed/search.svc.eqiad.wmnet.pem
[16:23:10] notAfter=Aug 22 10:28:57 2023 GMT
[16:23:12] yup
[16:23:25] If it's not the certificate that is used, it can probably be purged
[16:25:29] claime how do I do that safely? I assume it's a `puppet cert` command?
[16:27:54] probably puppet cert clean search.svc.{eqiad,codfw}.wmnet but I'd be happy for someone to double check me
[16:29:36] `puppet cert print` def shows the old, expiring certs
[16:46:22] I'm gonna go ahead and clean the codfw cert and watch for alerts...hit me up if y'all notice anything
[17:02:01] OK, that cleared the alert and didn't appear to create any new ones...cleaning out the eqiad cert next
[23:16:21] _joe_: effie FYI, memc-onhost-tier is now unused. https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/937197
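
Note on the 16:23 expiry check: the `notAfter=` line printed by `openssl x509 -enddate -noout` can be turned into a "days left" figure with a few lines of Python. This is only a rough sketch, not how the Prometheus alert actually computes it; the function names `parse_not_after` and `days_until_expiry` and the reference date used below are made up for illustration.

```python
from datetime import datetime, timezone


def parse_not_after(line: str) -> datetime:
    """Parse an openssl line like 'notAfter=Aug 22 10:28:57 2023 GMT' into a UTC datetime."""
    prefix = "notAfter="
    line = line.strip()
    if not line.startswith(prefix):
        raise ValueError(f"not a notAfter line: {line!r}")
    # openssl pads single-digit days with an extra space; collapse whitespace first.
    stamp = " ".join(line[len(prefix):].split())
    parsed = datetime.strptime(stamp, "%b %d %H:%M:%S %Y GMT")
    return parsed.replace(tzinfo=timezone.utc)


def days_until_expiry(not_after: datetime, now: datetime) -> float:
    """Return the (possibly negative) number of days between now and the expiry time."""
    return (not_after - now).total_seconds() / 86400


if __name__ == "__main__":
    expiry = parse_not_after("notAfter=Aug 22 10:28:57 2023 GMT")
    # Hypothetical reference time, chosen only for the example.
    reference = datetime(2023, 8, 7, 10, 28, 57, tzinfo=timezone.utc)
    print(f"expires {expiry.isoformat()}, {days_until_expiry(expiry, reference):.1f} days left")
```

Passing the reference time in explicitly, rather than reading the clock inside the function, keeps the calculation easy to check by hand.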