[07:28:18] thanks for the update US-TZ oncallers :D
[07:29:25] they look in better shape right now, we'll keep an eye on it
[08:18:10] fyi I'm going to disable puppet globally to deploy safely https://gerrit.wikimedia.org/r/c/operations/puppet/+/1017047
[08:28:07] k
[08:30:00] and done, re-enabling it
[08:31:19] volans: I just got this https://phabricator.wikimedia.org/P60375 while doing a reimage, should I just start it again? The previous reimage went fine
[08:31:49] checking
[08:32:25] The host is up and accessible
[08:32:34] it might be because of my global puppet disable?
[08:32:42] Ah right!
[08:33:11] if that's the case it should be good to go now
[08:33:38] it failed to run puppet on the cumin host
[08:33:39] Running Puppet with args --quiet on 1 hosts: cumin1002.eqiad.wmnet
[08:33:44] Yeah, I am fine issuing a reimage if that's easier than cleaning up the state the host is in now, I will wait for volans' opinion
[08:34:50] volans: It is probably coming from XioNoX's disablement. I just ran puppet again on the host and it was fine
[08:35:09] could the cookbook show the error that command returned (if any)?
[08:35:26] it skipped a bunch of steps, if you can re-run it with --no-pxe
[08:35:29] it should be quick
[08:35:36] volans: doing, thank you :)
[08:35:48] or try/catch it and ask the user what to do? (eg. retry)
[08:35:50] * volans hopes that mode still works
[08:36:18] volans: looks like it is working yeah
[08:36:37] XioNoX: it runs puppet with quiet=true, run-puppet-agent doesn't output anything I guess in that case
[08:38:00] ah, I guess we need a quiet=truebutnotforerrors
[08:39:44] not sure puppet has it :D
[08:44:08] yes it's the puppet agent that returns 1 and fails with no output...
[09:53:47] fyi, if you're curious, there are 2 test VMs (testvm2006 and 2007) on the routed Ganeti infra. Feel free to play around with them (or try to break stuff), you can even migrate them etc...
[09:53:56] https://www.irccloud.com/pastebin/BrsTh0F5/
[11:13:31] XioNoX: I'm very interested in this. Have you got a ticket or a doc I could read, please?
[11:15:38] btullis: the best entry point is https://wikitech.wikimedia.org/wiki/Ganeti#Routed_Ganeti
[11:15:57] Many thanks.
[11:39:48] Can I get a (hopefully quick) review on https://gerrit.wikimedia.org/r/c/operations/puppet/+/1019005 ?
[11:41:51] thx!
[12:10:32] just a heads-up, I'm about to slowly increase the concurrency on videoscaler jobs after our backlog yesterday
[12:17:14] hnowlan: ack, I wanted to ask you about it as the graphs look much better
[12:22:01] yeah, reducing concurrency as far as we could had a big impact
[12:22:08] I'll be taking it slow
[12:22:17] perfect
[22:03:37] Dear future EMEA on-callers: jobrunner/videoscaler was at it again, so we dropped the concurrency some (5 and 2), and silenced the alert until Monday 08:00Z. The google doc has the added deets. <3<3 -- Americas on-callers
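
Note on the 08:35-08:44 exchange: the "quiet=truebutnotforerrors" idea, combined with the "try/catch it and ask the user what to do" suggestion, could look roughly like the sketch below. This is only an illustration in plain Python, not the cookbook's or Spicerack's actual code: the names run_quiet_unless_error and CommandFailed, the ask callback, and the max_attempts default are all hypothetical. It runs a command silently, and only when the exit code is non-zero does it print whatever was captured and offer a retry.

```python
import subprocess
import sys
from typing import Callable, Sequence


class CommandFailed(Exception):
    """Hypothetical error: the command kept failing and the operator gave up."""


def run_quiet_unless_error(
    cmd: Sequence[str],
    ask: Callable[[str], str] = input,
    max_attempts: int = 3,
) -> subprocess.CompletedProcess:
    """Run cmd silently on success; on failure show its output and offer a retry."""
    for attempt in range(1, max_attempts + 1):
        # Capture output instead of discarding it, so it is available on failure.
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result  # stay quiet on success

        # Surface the exit code and anything printed, even if that is nothing.
        print(
            f"{' '.join(cmd)} exited with {result.returncode} "
            f"(attempt {attempt}/{max_attempts})",
            file=sys.stderr,
        )
        print(result.stdout or "(no stdout)", file=sys.stderr)
        print(result.stderr or "(no stderr)", file=sys.stderr)

        # Stop when attempts are exhausted or the operator declines a retry.
        if attempt == max_attempts or ask("Retry? [y/N] ").strip().lower() != "y":
            break

    raise CommandFailed(f"{cmd[0]} failed with exit code {result.returncode}")
```

A caller would wrap the quiet agent run in this helper and catch CommandFailed to decide whether to abort the remaining steps. For the case described at 08:44:08, where the agent returns 1 with no output at all, the operator would at least see the exit code and the explicit "(no stdout)" / "(no stderr)" markers instead of a silent failure.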