[09:32:22] jynus: https://gerrit.wikimedia.org/r/c/operations/puppet/+/931880
[09:33:27] I don't think that should go in the backup module
[09:34:29] in fact, I think there is that already defined, if not deleted
[09:35:09] ok, I mostly copied from the openldap one
[09:35:34] that's the issue, it should not be copied and pasted, but shared
[09:36:01] I will give it a look later, for now please test the commands by hand
[09:36:21] what is the unix user executing the commands ?
[09:36:39] according to that, root, which I don't like either
[09:36:46] and may not work
[09:38:24] in any case, thanks for sending that
[09:38:43] I will give it a thought in some time to see what is the best way to get that merged
[09:38:45] the scripts work just fine
[09:39:22] as if everybody created a script per service it may not scale well
[09:39:54] you can also address the problem when you have the problem
[09:40:11] but I'm fine adapting this to other puppet layout
[09:40:26] yeah, but it is putting logic on the back module that should be on the pdns one
[09:40:48] don't think of a major refactor, just I need to think how to make that work
[09:41:00] s/back/backup/
[09:41:07] ok, let me relocate that into the pdns module
[09:41:16] don't touch it yet
[09:41:27] maybe the solution is to create a single mysql script
[09:41:30] I need to think
[09:41:52] ok, I'll wait
[09:43:31] the 3 last files I am ok with, I need to think about the first 3
[09:44:49] I could easily move that into pdns_server::backup
[09:46:58] just give me some time, it won't change anything until tonight 0:-D
[09:47:14] I promise you will have an answer today
[09:47:26] and it will get merged
[09:51:33] ok, I was hoping to merge it rather soon-ish given 1) we don't have backups and sounds scary 2) is blocking the reimage of a server that sounds scary because #1
[10:19:38] hey Emperor - do you know much about how swift buckets for private wikis work? Thumbor has different credentials for public and private wikis, but it appears that thumbor does not (and possibly never been able to) PUT the successful thumbnail as it gets a 403. I'm not sure if it's a bucket naming issue or if the wrong credentials are being used
[10:20:21] To start I am wondering where the differences in configuration for the private buckets live as regards swift config
[10:22:26] oh that's why opening some office wiki pages cause 429 to thumbor? It tries to thumbnail every image every time?
[10:22:48] oh my god
[10:23:44] yep :|
[10:24:17] and afaik this might have always been the case, we're just seeing it now as a result of some k8s/xff/poolcounter weirdnesses
[10:25:20] The fact that thumbor still works is impressive
[10:25:34] generally speaking, not just this issue
[10:26:33] yeah... full of lots of good but flawed ideas that never got road tested until it was too late
[10:29:53] arturo: if you have manually created a backup (and are sure it works), I don't see how merging that will improve things as backups only run by night
[10:32:06] jynus: I was hoping to merge the patch and force-run a full backup immediately as a stop gap. I can host the backup in my laptop meanwhile if that's not possible
[10:32:24] I can run the backup now if it is in the right location
[10:32:30] without merging
[10:32:37] is it?
[10:33:00] is the right location the location that would later be merged in the patch?
[10:33:13] i.e, /var/run/pdns-backup/pdns.sql ?
[10:34:28] that's not a very good location, as it may be on memory or wiped on restart (var/run) - see that's why I need a deeper review
[10:34:47] for now you can move it next to the other backup set in the same host and I can run the backup manually
[10:35:06] and then calmly we review the patch
[10:35:20] the other is /var/run/openldap-backup -- again, I just copy-pasted :-P
[10:35:27] :-(
[10:36:18] garbage in garbage out, see why I am worried ? not bikeshedding
[10:36:38] please move it there, tell me the job and we can run a backup
[10:37:00] so that I unblock you, but this needs re-review
[10:40:32] jynus: done, see https://www.irccloud.com/pastebin/2Hd4Nv2X/
[10:45:35] arturo: https://phabricator.wikimedia.org/P49462
[10:46:06] freat
[10:46:08] great*
[10:46:14] is that for the 2 servers?
[10:47:24] yep: https://phabricator.wikimedia.org/P49462#200144
[10:48:12] now, make sure you don't use backups as a first recovery location, eh!
[10:48:56] specially for a reimage, where a puppet key change will require a harder recovery
[10:49:16] and specially in an unreviewed backup script
[10:56:29] jynus: thanks, really appreciated
[10:57:01] I will soon see how to solve the regular backup setup
[11:15:23] so I think this was just a misunderstanding between goals- you wanted a reimage, you asked for a patch review; if you had told me earlier about your blockers we could have done this earlier :-D
[11:17:38] I've shared information as the topic has been unfolding, in the context of T339894
[11:17:38] T339894: cloudservices: codfw1dev: fix backups - https://phabricator.wikimedia.org/T339894
[11:18:08] I definitely did not expect the code review to go in the direction of some refactor
[11:46:37] hnowlan: [sorry, was AFK] no, I'm afraid I don't know anything particular about that at all :-/
[11:47:58] (maybe best make a phab item with what buckets you think it should be using, and we can have a look at what credentials &c those need)
[13:33:56] arturo: I am checking the patch and the issue is that openldap makes sense because it is a backup of openldap
[13:34:16] what doesn't make sense to me is create a separate mysql method for each service
[13:34:35] we already have 2, one for production and another for non-production
[13:48:48] Emperor: ack, will do
[14:10:45] jynus: if the non-production one works for the PDNS db, then that would be great
[14:11:15] jynus: does the puppet code in the non-production case takes care of everything? do I have to create the grants by hand?
[14:11:16] we don't want to continue using that, it is food for today, hunger for tomorrow