[05:43:35] btullis: did you reboot dbstore1005? staging database isn't up, so we might run into the same issue we had a few months ago
[06:21:52] looking at irc logs it seems to have been rebooted by Amir yesterday evening
[06:22:17] mmm, the automated script I guess
[06:22:29] Amir1: We need to fix that for dbstore then, as it has the staging database
[06:22:31] I am going to start it
[06:22:32] yeah, it seems to have treated it as a generic DP mysqld
[06:23:11] I have started it, Amir1 make sure to address that case in your script
[08:00:04] hello hello! we're going to do a minor Netbox upgrade, please refrain from any changes if possible.
[08:14:29] can you drop a note when you're done, please? I have a ganeti makevm cookbook running, will simply have it retry when Netbox is back up
[08:17:43] moritzm: should be good now
[08:18:08] marostegui: I haven't rebooted the dbstore100* servers yet. I'm scheduled to do so in a little over an hour from now.
[08:18:36] XioNoX: thanks, indeed worked fine
[08:18:51] ...but I guess dbstore1005 doesn't need it now, based on the above.
[08:19:15] btullis: no need :-) 1003/1005/1007 have all been rebooted
[08:20:07] 1005 yesterday, 1003 five days ago and 1007 four days ago
[08:20:31] OK, cool.
[08:22:01] btullis: I think Amir1 did all of them. I thought he talked to you
[08:23:36] I didn't hear anything, but it's no problem. I sent out a maintenance window announcement for this morning, in case they were in use, but nobody seems to have complained anyway. All good.
[09:56:40] the decommission cookbook failed for me with an import error https://phabricator.wikimedia.org/T336236#8836173 in case someone is interested cc volans
[09:57:10] XioNoX: ^^
[09:57:38] arturo: thx, looking, might be related to the deploy of the netbox validators to prod
[09:57:44] that happened few minutes ago
[09:57:49] ok
[09:58:21] checking logs
[10:21:45] arturo: I'm making a fix, I'll ping once fixed so you can re-run the decom cookbook
[10:22:20] volans: ok thanks
[10:28:40] arturo: should be fixed, could you retry please?
[10:28:48] volans: ok, doing so now
[10:29:10] thx, sorry for the trouble
[10:29:20] no problem at all
[10:29:34] the fact that we have automation is a blessing :-)
[10:30:01] volans: now getting:
[10:30:03] [3/4, retrying in 27.00s] Attempt to run 'cookbooks.sre.hosts.decommission.update_netbox' raised: The request failed with code 400 Bad Request: {'__all__': ['Invalid DNS name: must be a valid FQDN']}
[10:30:23] the previous run flushed the DNS apparently
[10:30:29] rotfl
[10:30:42] ofc, we empty it, and that's not "valid", sorry my bad
[10:32:31] will hit CTRL+C to stop it in that step to avoid any more potential "half" steps
[10:32:49] Sleeping for 3 minutes to get netbox caches in sync
[10:32:49] ^CCtrl+c pressed
[10:32:59] ack
[10:38:51] arturo: sorry, retry now :)
[10:39:03] ok, running
[10:40:08] volans: much better this time
[10:40:11] :)