[10:36:04] I'm trying to reimage cloudbackup1002-dev but the cookbook is failing after the first (successful) puppet run
[10:36:22] I tried twice and in both cases it failed with "failed to execute command 'run-puppet-agent --quiet'"
[10:36:33] jbond: ^^ dhinus which OS? we're fixing an existing bug
[10:36:41] bookworm, ganeti
[10:38:08] Spicerack logs: https://phabricator.wikimedia.org/P52999
[10:39:50] dhinus: have you tried install_console? do you know if puppet was working before you did the reimage?
[10:40:59] it was failing before the reimage because the host was running bullseye, and puppet was trying to install some bookworm packages. but it's working if I run it manually after the reimage
[10:41:14] haven't checked install_console
[10:41:27] the host seems mostly fine after the reimage, I can ssh and such
[10:41:58] dhinus: that's unrelated to your host
[10:42:09] the "failed" puppet run was on the cumin1001 host
[10:42:21] xecuting commands [cumin.transports.Command('run-puppet-agent --quiet', timeout=300.0)] on '1' hosts: cumin1001.eqiad.wmnet
[10:42:35] I had disabled cumin to deploy spicerack and I've just re-enabled+run it
[10:42:52] so that was probably it, you can blame me
[10:43:18] it's all back to normal now (with newer spicerack), so you should be able to re-run it
[10:43:28] dhinus: ^^^
[11:00:55] thanks, trying now!
[11:22:16] mirrors.wikimedia.org is timing out for me, anyone seeing the same?
[11:23:50] it seems to end up replying if the timeout is long enough, but apt and browsers bail out before it gets a chance
[11:25:45] oh, back online
[12:27:29] volans: the reimage worked, thanks :)
[12:27:41] great, thanks for closing the loop
[14:41:43] slyngs nice job on https://github.com/ganeti/prometheus-ganeti-exporter , you're getting some love on the Ganeti mailing list ;)
[14:46:01] indeed! on the irc channel too!
[14:46:50] bblack/ sukhe: I'm calling digicert back to verify the ssl order, on hold.
[14:46:56] so they release it and keep it going
[14:47:04] i saw your message last evening sukhe
[14:48:44] robh: awesome, thanks :)
[14:49:25] estimated wait 10 min... ok, I'm not on active hold, they'll call me back with call queuing... either way I'll ping when done so ya know you can proceed
[14:53:05] thanks!
[14:53:25] I think once they unblock on that (assuming it resolves the org verification too), we'll just be waiting on them to issue
[15:00:31] bblack: callback verification done, it's now last verify steps on their end, nothing else from us. should get a notification ready for download in about 45 minutes
[15:02:28] \o/
[15:02:56] thanks robh!
[15:19:23] bblack: i just got the approval email
[15:19:31] well, 10 minutes ago
[15:20:00] yeah we have cert downloads now
[15:20:36] ahh, forwarded to you but didn't need to it seems heh
[15:20:42] woooo \o/
[15:20:53] ok, one monolith order done, next is jcare which is in process
[15:52:36] lmata: if it's ok, could you add cloudservices@ and/or sre-at-large@ as "optional" to the "Incident Review Ritual" meetings? some of us have the invite, but others don't.
[15:56:02] dhinus: of course, sre-at-large should be in there but I'll check. thanks for bringing it up, will do
[16:01:14] dhinus: turns out operations engineers was there and not sre-at-large (added now)
[16:09:57] thanks lmata!