[07:41:46] <elukey>	 hnowlan: o/ I see a lot of alerts for tilerator on maps hosts, expected or something ongoing? 
[07:42:28] <elukey>	 arturo: o/ I see a lot of crits for cloud* nodes, mostly silenced (but popping up in the icinga page), can we ack them?
[07:43:13] <elukey>	 (just to reduce the noise on the page, it seems a Christmas tree at the moment)
[07:44:48] <_joe_>	 elukey: the tilerator stuff I think is expected because we're dismissing cassandra and tilerator on maps
[07:44:57] <_joe_>	 yeah I've given up on that
[07:45:08] <_joe_>	 it's clear people are uninterested
[07:47:19] <taavi>	 elukey: cloud1* or cloud2*-dev nodes?
[07:47:56] <elukey>	 taavi: cloudcontrol*-dev mostly
[07:48:14] <elukey>	 _joe_ ack perfect, I'll follow up with Hugh to ack those alerts then
[07:48:53] <taavi>	 those are known, feel free to ack them (I don't have access for that myself)
[07:49:04] <taavi>	 fallout from our bullseye upgrades
[07:49:37] <elukey>	 taavi: super, is there a task that I can use?
[07:50:29] <taavi>	 https://phabricator.wikimedia.org/T300254
[07:51:06] <_joe_>	 taavi: I don't think it's elukey's duty to ack them, frankly
[07:51:19] <_joe_>	 but rather of the people operating and maintaing them
[07:51:48] <_joe_>	 a tad more responsibility in managing icinga by everyone is needed.
[07:58:51] <elukey>	 taavi: acked thanks :)
[08:00:58] <elukey>	 once in a while we should clean the unhandled alerts, just to reduce the noise
[08:01:07] <elukey>	 it is very annoying but needed :)
[08:02:31] <arturo>	 thanks elukey taavi
[08:02:50] <arturo>	 yes we're in the middle of bullseye upgrades
[08:39:10] <taavi>	 I fixed a few unacked wmcs alerts where I had enough access to do so
[08:48:40] <marostegui>	 I am switching m2 master in 10 minutes
[08:48:55] <marostegui>	 Affected services at: https://phabricator.wikimedia.org/T300329
[08:55:49] * akosiaris around 
[10:31:30] <hnowlan>	 elukey: weird, those were previously acked - thanks for the heads up 
[10:33:47] <volans>	 maybe they flapped? to warning or unknown for example
[10:36:30] <elukey>	 hnowlan: <3
[10:52:44] <jynus>	 BTW, if you see me today doing weird stuff, I have switched only today's clinic duty (but cannot update topic)
[10:54:11] <jynus>	 I read there is channel groups, I wonder if with that we could distribute topic rights more widely than with the flag limitations
[10:58:24] <volans>	 jynus: want me to update the topic? if it's only for 1 day not sure it's worth though :)
[10:58:44] <jynus>	 yeah, only if you promise to change it back tomorrow
[10:58:50] <jynus>	 if not, it is ok
[10:59:07] <jynus>	 but I am more "worried" for updates on an outage or something else
[10:59:57] <jynus>	 and if it is not possible just on irc, maybe we can call our software developer to create a form
[11:01:31] <taavi>	 https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia/irc/ircservserv-config/+/refs/heads/master/channels/wikimedia-operations.toml is the canonical list these days, I think you should be able to just send a patch to that repo
[11:02:37] <jynus>	 I was told there was a limit on the number of people that could be added to operatos since the libera transition
[11:02:44] <jynus>	 *operators
[11:03:33] <taavi>	 I'm not aware of any limits on operators (but there is a limit of 4 "founders")
[11:04:42] <volans>	 jynus: topic changed, I can change it back tonight or tomorrow morning
[11:05:02] <jynus>	 let me see if I can fix it myself for the future
[11:18:57] <jynus>	 https://gerrit.wikimedia.org/r/c/wikimedia/irc/ircservserv-config/+/759312
[15:01:16] <Emperor>	 the lack of alphabetical order makes me sad
[15:01:28] <Emperor>	 (why yes, I do need to get out more. Why do you ask?)
[15:11:12] <Reedy>	 {{sofixit}} ;P
[15:13:20] <Emperor>	 I figured I should let j.nus' CR through first otherwise I'll only have to fix a merge conflict :)
[15:15:24] <volans>	 Emperor: you can just do your patch on top of ja.me's one ;)
[15:18:24] <_joe_>	 volans: when picking a python library for spicerack, should I look at what's available on bullseye or on buster?
[15:21:26] <volans>	 _joe_: we're currently on both buster and bullseye (cumin1001 upgrade is pending some unrelated work)
[15:21:52] <volans>	 but if you need something in bullseye backporting it to buster with python is usually super trivial
[15:22:34] <volans>	 (T276589#7420124 for reference)
[15:22:34] <stashbot>	 T276589: migrate services from cumin2001 to cumin2002 - https://phabricator.wikimedia.org/T276589
[15:24:10] <_joe_>	 volans: heh I'm considering how to tackle writing a spicerack kubernetes module
[15:24:19] <_joe_>	 one option I just shellout to kubectl
[15:25:15] <_joe_>	 another is I choose any of these python libraries that allow talking to kubernetes
[15:25:37] <_joe_>	 debian has one, which I fear is horribly outdated in buster
[15:25:49] <volans>	 I can imagine
[15:26:03] <taavi>	 _joe_: btw, speaking from experience, backporting a newer version of python-kubernetes than 12.0.1 (aka support for kubernetes 1.16) to bullseye is a pain since you need tons of newer dependencies too
[15:26:18] <volans>	 feel free to open a task for detailed discussion following https://wikitech.wikimedia.org/wiki/Spicerack#Adding_new_module_or_change_in_core_behaviour
[15:26:22] <taavi>	 if you decide to do that for whatever reason, https://salsa.debian.org/taavi/python-kubernetes/
[15:26:55] <_joe_>	 taavi: yeah no, I was thinking of packaging pykube_ng in case
[15:28:18] <_joe_>	 volans: I'm a bit confused, I have tons of tasks that need such a library in spicerack; do I need to open another one about what? the implementation strategy? Isn't a CR the right place to discuss such things?
[15:29:06] <_joe_>	 I'm not asking your team to implement it
[15:29:11] <volans>	 I know
[15:30:04] <volans>	 experience has thought us that when starting directly from a CR the friction to contribute to spicerack is higher because is easier to agree on the api and integration into spicerack before hand than going back and re-implement something after the review
[15:30:30] <_joe_>	 so you want to discuss the api in a task?
[15:30:43] <_joe_>	 you're worried I might not properly overengineer it?
[15:33:26] <volans>	 how to structure the spicerack side of the api, how it's exposed to the cookbooks and such
[15:33:43] <volans>	 I'm not worried, I'm saying that there is a  process :)
[15:35:27] <volans>	 I can add you to the next office hours with john and me too if you prefer to chat live about it
[17:30:15] <hnowlan>	 I had a homer timeout on a decommission operation (forgot about it in a terminal and didn't type "yes" to the prompt for a while), got a "ncclient.operations.errors.TimeoutExpiredError: ncclient timed out while waiting for an rpc reply.
[17:30:52] <volans>	 hnowlan: ok, which host was it?
[17:31:01] <hnowlan>	 volans: restbase2011 
[17:31:15] <hnowlan>	 there's a lock left by the operation afaict
[17:31:41] <volans>	 we can just re-run homer for the switch to wich restbase2011 is attached
[17:32:11] <volans>	 and that's asw-c-codfw
[17:32:27] <volans>	 I can run it for you if you want
[17:33:12] <hnowlan>	 volans: that'd be great, thank you!
[17:33:43] <XioNoX>	 volans: let me know if you need help
[17:33:49] <volans>	 hnowlan: FYI I'm just runnign from a cumin host: homer 'asw-c-codfw*' commit "Decommission restbase2011"
[17:34:07] <volans>	 let's see if it works :D
[17:34:11] <hnowlan>	 volans: nice, thanks! 
[17:34:16] <hnowlan>	 very impressed by how the script copes with repeated runs after a failure though, lots of nice handling of edge cases <3
[17:34:50] <volans>	 XioNoX: actually in this case it fails saying that the terminal is locked
[17:35:03] <volans>	 configuration database locked by
[17:36:22] <XioNoX>	 asw-c-codfw> request system logout pid 57293 
[17:36:38] <XioNoX>	 volans: you're good to go
[17:36:45] <volans>	 ack, re-run
[17:38:31] <volans>	 hnowlan: you should be good to go, all done, and the homer step is the last one 
[17:40:41] <hnowlan>	 volans: great, thanks! no need to re-run the cookbook so? 
[17:41:16] <volans>	 at this point no need I'd say, in general yes just re-running would do the trick
[17:41:26] <volans>	 it should be fully idempotent (and if not feel free to open a task!)
[17:41:40] <volans>	 if you want you can also re-run it :D
[17:42:00] <volans>	 but it should be a total noop at this point
[17:42:13] <hnowlan>	 great, thanks! 
[17:45:44] <volans>	 anytime :)
[18:33:39] <taavi>	 could someone please merge this deployment-prep only hieradata change please? https://gerrit.wikimedia.org/r/c/operations/puppet/+/759559/
[18:35:14] <jbond>	 taavi: done
[18:35:29] <taavi>	 thanks!