[07:30:32] <moritzm>	 I'm going to be rebooting the codfw bastion (bast2002) in 5 minutes
[07:30:47] <moritzm>	 let me know if that's inconvenient for anyone
[07:33:48] <hashar>	 good morning :)  We could use a puppet-merge for the scap script which syncs the deployment hosts. The change makes it use `rsync --new-compress`  https://gerrit.wikimedia.org/r/c/operations/puppet/+/774824
[07:34:11] <hashar>	 we already have that option set in scap itself but I had missed that `scap-master-sync` is provided by Puppet :]
[07:56:33] <_joe_>	 hashar: can look in a few
[07:57:53] <hashar>	 \o/ :]
[08:26:45] <_joe_>	 hashar: DRY pls :P
[08:26:55] <_joe_>	 I know it's not your fault
[08:27:41] <hashar>	 yeah I felt like writing an rsync function or adding an env variable to pass all the options
[08:27:50] <hashar>	 then felt it was way easier to just copy paste :D
[08:28:58] <_joe_>	 hashar: yeah merging now, but yeah :)
[08:39:54] <_joe_>	 hashar: done, btw
[08:47:32] <hashar>	 _joe_: merci beaucoup!
[13:13:07] <kormat>	 volans: for your admiration, the command at the top of https://phabricator.wikimedia.org/T303174
[13:14:02] <volans>	 lol
[13:14:18] <topranks>	 very impressive :)
[13:14:36] * kormat bows
[13:14:47] <volans>	 you know we also have https://debmonitor.wikimedia.org/kernels/ :-P
[13:15:13] <kormat>	 volans: can you estimate how long it would take to get the same result using that?
[13:15:29] <volans>	 5 minutes
[13:15:35] <kormat>	 reeeally. do tell.
[13:15:49] <volans>	 click on kernel, filter with 'db', copy+paste
[13:16:06] <volans>	 repeat for the few kernels you are interested in
[13:16:25] <kormat>	 ah. so entirely manual copy+pasting
[13:16:40] <kormat>	 and then editing it into the phab format
[13:16:49] <kormat>	 i want something i can run multiple times a day
[13:17:08] <btullis>	 n
[13:17:12] <kormat>	 keeping track by hand of which hosts are done/not done is just too error-prone
[13:17:21] <btullis>	 Oops, wrong window.
[13:18:48] <volans>	 yeah agree, I thought it was a one-off
[13:20:04] <volans>	 the other option, just to mess with your sanity, would be to use the json output of cumin and do it in jq :-P
[13:20:41] <kormat>	 oh dear goddess no :P
[13:20:47] <kormat>	 though i do appreciate the 'thought'
[13:21:00] <bblack>	 jq is always hard to wrap your head around :)
[13:22:00] <bblack>	 but I tend to use CLI solutions too, just because they're more-composable into bigger things.  If I start at debmonitor in a browser, I've gotta paste the data back somewhere into a file eventually to do something with it.  The workflow is already there on the CLI
[13:22:07] <Emperor>	 character-building, is jq's UI...
[13:22:22] * volans takes a note for a CLI for debmonitor :D
[13:23:09] <bblack>	 that's the one thing I find most-frustrating about netbox, too.  It's a GUI-first tool, or at least has GUI-first documentation for us in practice.
[13:24:39] <volans>	 netbox APIs allow you to do almost everything, but are a bit painful because pynetbox is not "magic" enough to convert the REST API into something totally useful
[13:24:51] <volans>	 I often end up playing with nbshell (netbox django shell)
[13:24:52] <Emperor>	 jq is a bit like sed/awk in that you read it later and go "what does that do?"
[13:28:03] <bblack>	 yeah, CLI-ness or GUI-ness tends to pervade a design in general.  I can't think of great examples of tools that do both amazingly without some tradeoff in some direction.
[13:28:13] <bblack>	 or API-ness I guess, is another
[13:37:18] <volans>	 just because I was nerdsniped into it... one-liner in the task :D
[13:37:59] <kormat>	 hahaha
[13:38:14] <volans>	 phab-compatible output ;)
[13:38:32] <Emperor>	 nice :)
[13:38:59] * Emperor resists urge to golf
[13:39:17] <kormat>	 volans: impressive :)
[13:40:18] <kormat>	 ... how are you sorting the output?
[13:41:08] <kormat>	 `-o txt` does that automatically?
[13:41:41] <volans>	 yes
[13:41:49] <kormat>	 very cute.
[13:41:50] <volans>	 both txt and json output do sort keys
[13:42:12] <kormat>	 i'll have to remember that
[13:42:19] <volans>	 and I'll have to document it :D
[13:42:26] * kormat grins
[13:58:32] <bblack>	 does cumin and/or cookbook stuff yet have some kind of dc/cluster -aware shuffler?
[13:58:37] <bblack>	 I was thinking about that yesterday
[13:58:57] <bblack>	 something similar in spirit to the cron_splay thing (but hopefully far less ugly heh)
[14:00:45] <volans>	 bblack: for cumin on the general use case we have the very old T164587 where I think an agreement was not found, but could be resumed
[14:00:46] <stashbot>	 T164587: cumin could use randomization/splay options - https://phabricator.wikimedia.org/T164587
[14:01:05] <bblack>	 (the idea is take a given hostlist, break it up into per-dc sublists based on the first digit of the numeric part, shuffle randomly in each sublist, then zipper them back together for a pseudo-random order that rotates between DCs fairly
[14:01:06] <volans>	 for any wmf-specific logic it would be better to put it into spicerack that's not a general purpose tool
[14:01:09] <bblack>	 )
[14:01:24] <volans>	 s/based on the first digit of the numeric part/on netbox's site property/
[14:01:35] <bblack>	 sure :)
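
A minimal sketch of the shuffle bblack describes above: split a host list into per-DC sublists, shuffle each sublist, then zipper them back together. It is not cumin or spicerack code. The names `dc_key` and `dc_aware_shuffle` are hypothetical, and the default key uses the "first digit of the numeric part" rule from the discussion; a key based on netbox's site property, as volans suggests, could be passed in instead.

```python
import random
import re
from collections import defaultdict
from itertools import zip_longest


def dc_key(host):
    """Hypothetical grouping key: first digit in the hostname ('cp2001' -> '2')."""
    match = re.search(r"\d", host)
    return match.group(0) if match else ""


def dc_aware_shuffle(hosts, key=dc_key, rng=None):
    """Shuffle hosts within each group, then interleave the groups round-robin."""
    if rng is None:
        rng = random.Random()

    # Break the host list up into one sublist per group (per DC by default).
    groups = defaultdict(list)
    for host in hosts:
        groups[key(host)].append(host)

    # Shuffle randomly inside each sublist; visit groups in a stable order.
    sublists = []
    for name in sorted(groups):
        sublist = groups[name]
        rng.shuffle(sublist)
        sublists.append(sublist)

    # Zipper the sublists back together, skipping padding for shorter groups.
    padding = object()
    return [
        host
        for row in zip_longest(*sublists, fillvalue=padding)
        for host in row
        if host is not padding
    ]


if __name__ == "__main__":
    example = ["cp1001", "cp1002", "cp2001", "cp2002", "cp3001"]
    print(dc_aware_shuffle(example))
```

When the groups have very different sizes, the tail of the result comes entirely from the largest group, so the rotation between DCs is only fair while every group still has hosts left.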
[14:02:41] <volans>	 I can foresee different services needing different shuffling algorithms though, like do first one dc then another, or do first replicas and then masters, etc...
[14:03:00] <bblack>	 yeah, especially stateful thing
[14:03:02] <bblack>	 s
[14:03:30] <bblack>	 you could have a generic concept of a shuffle strategy and then various common ones or whatever, too
[14:04:00] <volans>	 I guess we could put something into the logic behind the rolling operations
[14:04:14] <volans>	 I guess that is where it would be mostly needed
[14:04:32] <bblack>	 but for stateless services, there's a common pattern of "I want to hit all these servers, but do it in the least-disruptive way by paying attention to redundancies".  Commonly for us, the important subdivisions would be the site, and maybe the cluster or profile name.
[14:04:53] <bblack>	 e.g. if we operate on all cpNNNN, split by text-v-upload and per-dc, to do the rotation/shuffling.
[14:05:17] <bblack>	 trying to bring that back to something more generic and cumin-appropriate, though
[14:05:38] <bblack>	 maybe something based on IP addresses would work, in some sense
[14:05:47] <volans>	 cumin has FQDNs, so just grouping by the way clustershell groups might be already useful
[14:06:17] <bblack>	 (you could think of something sort of like hamming distance that's specific to IPs, that gives a measure of how "close" they are to being on the same network without actually knowing all the subnet info explicitly)
[14:06:46] <bblack>	 (and then sort such that it maximizes the avg distance from each list entry to the next)
[14:08:27] <volans>	 to practically explain my previous statement: https://phabricator.wikimedia.org/P24161
[14:08:29] <bblack>	 hmm that actually works if you just treat IPs like raw numbers in network order and then go for average distances, too
[14:09:03] <volans>	 note the 2 groups in ulsfo because of the missing host in the middle ;)
[14:09:13] <bblack>	 heh yeah
[14:10:45] <bblack>	 I can't think of an efficient way to numerically sort for maximizing the gaps between neighbors, though
[14:10:45] <volans>	 also, technically speaking NodeSet is a set, so has no concept of ordering
[14:10:51] <bblack>	 probably any such sort scales terribly :)
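
A rough sketch of the "treat IPs like raw numbers" idea from above. It does not find an order that truly maximizes the gaps between neighbours; it is a greedy heuristic that, at each step, picks the remaining address numerically farthest from the one just chosen. The function name `spread_by_ip` is hypothetical, the cost is quadratic in the number of addresses, and duplicate addresses collapse into one entry.

```python
import ipaddress


def spread_by_ip(addresses):
    """Greedy ordering: each next address is the farthest from the previous one."""
    # Map each address string to its integer value.
    remaining = {addr: int(ipaddress.ip_address(addr)) for addr in addresses}
    if not remaining:
        return []

    # Start from the numerically lowest address so the result is deterministic.
    current = min(remaining, key=remaining.get)
    ordered = [current]
    current_value = remaining.pop(current)

    while remaining:
        # Pick the farthest remaining address; ties are broken by the address string.
        nxt = max(remaining, key=lambda a: (abs(remaining[a] - current_value), a))
        ordered.append(nxt)
        current_value = remaining.pop(nxt)

    return ordered


if __name__ == "__main__":
    print(spread_by_ip(["10.0.0.1", "10.0.0.2", "10.1.0.1", "10.1.0.2"]))
```

The greedy choice tends to bounce between the two ends of the address range, which is roughly the alternation between networks the discussion is after, but it makes no claim to being optimal.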
[14:43:43] <cdanis>	 volans: we need a hall of shame for awesome* one-liners
[14:43:53] <cdanis>	 (*: in both senses of the word including and especially the original)
[14:44:11] <volans>	 eheheh :)
[15:11:18] <Emperor>	 cdanis: I have some properly awful ones lying around
[15:11:27] <cdanis>	 same!
[15:12:34] <Emperor>	 https://github.com/wtsi-ssg/ceph-disk-utils/blob/dbec53ae865196199c4f15fdc602c83efc6a57ea/ceph_remove_failed_osd.sh#L101-L105 
[15:13:12] <Emperor>	 (in my defence, that is preceded by a lengthy comment explaining it)
[15:18:25] <Emperor>	 the use of jq <<< '{}' to emit JSON output is kind of vile too
[18:37:14] <Krinkle>	 https://github.com/grafana/grafana/pull/35104
[18:37:31] <Krinkle>	 Too bad the redirects are now broken. Our dashboards are linked in quite a few places, including e.g. other people's blog posts externally.