[09:24:18] hello folks!
[09:24:43] there was an unstaged change on deploy2002's deployment-charts
[09:25:00] I saved the diff to /home/elukey/deployment_charts_git_diff_14032025.diff
[09:25:03] and then reverted
[09:25:10] so deployments can go through etc.
[09:34:35] ack
[10:27:48] <_joe_> elukey: was it related to resources in the mw pools?
[10:28:37] nope, benthos/shellbox changes like
[10:28:37] +bases:
[10:28:38] + - ../global.yaml
[10:28:38] +
[10:28:57] at least it doesn't seem mw-related
[10:29:04] <_joe_> oh ok
[10:34:25] thanks elukey, that's probably a leftover from kamila_
[10:46:55] _joe_, claime, vgutierrez - I reviewed https://gerrit.wikimedia.org/r/c/operations/puppet/+/1123622 and it looks ok, but lemme know your opinion when you have a moment (the original patch caused the www.wikimedia.org/ redirect cache pollution mess)
[10:47:41] oh, sorry, I thought I'd cleaned it up
[10:56:21] kamila_: nah don't worry, I usually write in here to notify that the diff is $somewhere, so it doesn't get lost
[10:56:54] thanks elukey <3
[11:02:27] In the next 30 minutes I'll be running a live-test of the maintenance/periodic job stop/start behaviour in eqiad (where there should be no jobs or maintenance scripts running in the first place). This should be a noop, but cc jynus, volans just in case
[11:02:49] hnowlan: ack, thx
[11:02:54] <_joe_> elukey: if you, our Apache Ninja, think a patch is ok, who are we mere mortals to say it's not!
[11:03:23] yeah sure, after hearing this one I think my day is over :D
[11:03:41] rotfl
[11:04:33] <_joe_> I mean he's our apache ninja but also too nice
[11:12:00] ok
[11:39:42] running the live test now (only affecting the inactive mw-script/mw-cron in eqiad)
[12:02:21] * volans grabs popcorn
[12:06:55] all done, only some explosions (T388874)
[12:06:55] T388874: Update Kubernetes library version in spicerack - https://phabricator.wikimedia.org/T388874
[12:07:05] Minor explosions
[12:07:52] hnowlan: we have the debian version IIRC
[12:08:02] Pleasing taste, some monsterisms https://www.youtube.com/watch?v=u2Jq_xT6DAg
[12:08:10] python3-kubernetes 12.0.1-1
[12:08:31] `"kubernetes==12.0.*", # frozen to the version available on debian bullseye` even says it in the setup.py :)
[16:16:41] sukhe, volans: less noise here
[16:17:04] reading backlog
[16:17:11] so... as sukhe mentioned on -operations, we have a tiny bug in the sre.loadbalancer.admin cookbook where the repool will always fail due to a mismatch in the puppet message
[16:17:48] what's the suggested approach here? forcing puppet to be re-enabled, or tinkering with the puppet messages to make them match?
[16:18:47] let me see
[16:19:06] ok so your problem is
[16:19:07] reason = f"{self._args.action} {hosts}: {self._args.reason}"
[16:20:28] meh, I guess the assumption was that puppet would be disabled/enabled within the same run
[16:20:37] while you disable with depool and re-enable with pool
[16:21:22] that is a bit worrying as a workflow in general, because if a host needs to be depooled for days that means it has to have puppet disabled?
[16:21:47] or you'll have to make a puppet patch in that case to prevent puppet from restarting it, and then you could re-enable puppet?
[16:23:17] but yes, if the workflow is this one I guess you could override _reason()
[16:26:31] vgutierrez: you could also add a pre-flight check with `puppet.check_(dis)enabled()` if you want to warn the user if puppet is not in the right state before starting
[16:27:01] to reduce the surprise factor of not having puppet re-enabled, for example
[16:32:11] Alternative approach: create a new reason with spicerack.admin_reason(f"de{reason}") :D
[16:32:19] where you use reason
[16:34:05] that seems to be the cleanest I think?
[16:34:19] probably yes
[16:34:38] depool will always disable and pool will always enable
[16:35:44] sorry, you want spicerack.admin_reason(f"de{reason.reason}")
[16:35:57] de?
[16:36:02] for depool
[16:36:12] the parent code does
[16:36:12] reason = f"{self._args.action} {hosts}: {self._args.reason}"
[16:36:23] so you get depool .....
[16:36:30] but in pool you get pool ...
[16:36:41] the diff is "de" :D
[16:43:33] so what's the cleanest? overriding _reason()?
[16:43:57] I think in _pool_action() just do:
[16:44:14] depool_reason = spicerack.admin_reason(f"de{reason.reason}")
[16:44:19] puppet.enable(depool_reason)
[16:44:40] but if you want to override _reason feel free, you can find a generic verb for (de)pooling to use in both cases
[16:44:57] so we are requiring that the same user on the same cumin host performs both actions
[16:44:58] like "toggle pool status"
[16:45:44] right, if you want it fully the same no matter what then let's do a different thing
[16:47:10] 1) define a self._puppet_reason in __init__() with the message you want
[16:47:30] 2) both in _depool_action and _pool_action use that reason and pass verbatim_reason=True to puppet's disable/enable
[16:47:51] it will use "reason.reason" and not the fully formatted one with user@host
[16:48:40] just ignoring the reason variable that you get when the parent class calls your actions
[16:48:44] that makes sense, thx <3
[16:49:32] just make it clear in the message that it was the cookbook, so at least one can go look at SAL
[16:49:42] given there will be no user
[16:53:35] ofc always forcing the re-enable could also be an option, if you feel it's the right thing to do :)
[16:53:46] ok it's getting late, I'm about to log off
[16:53:56] <3
[16:54:06] that won't work... I need to provide a Reason instance :D
[16:54:10] not a string
[16:54:36] yeah I think reason = spicerack.admin_reason("reason")
[16:55:18] yes I meant self._puppet_reason = spicerack.admin_reason("something")
[16:55:31] and then
[16:55:43] puppet.disable(self._puppet_reason, verbatim_reason=True)
[16:55:47] same for enable
[16:59:08] https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1127938
[16:59:09] thx folks <3
[17:00:30] perfect
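
For readers following along, here is a minimal sketch of the approach volans outlines above: a fixed self._puppet_reason created in __init__() and passed with verbatim_reason=True to both puppet disable and enable, plus the optional check_enabled() pre-flight check. The class name, the omission of the real parent class, and the exact _depool_action/_pool_action signatures are illustrative placeholders, not the actual cookbook code; the real change is in https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1127938.

```python
# Minimal sketch of the fix discussed above, not the actual cookbook code:
# the class name, the missing parent class, and the _depool_action /
# _pool_action signatures are illustrative placeholders.
class LoadBalancerAdminRunner:
    def __init__(self, args, spicerack):
        self._args = args
        # One fixed Reason instance shared by depool and pool, so that the
        # message used by puppet.enable() always matches the one used by
        # puppet.disable(), no matter who runs the repool, from where, or when.
        # The message names the cookbook so one can go look at the SAL, since
        # verbatim_reason=True drops the usual user@host suffix.
        self._puppet_reason = spicerack.admin_reason(
            "sre.loadbalancer.admin: puppet disabled while hosts are depooled, see SAL"
        )

    def _depool_action(self, reason, puppet):
        # Optional pre-flight check suggested above: bail out (or warn) if
        # puppet is not in the expected state before starting.
        puppet.check_enabled()
        # Ignore the per-run `reason` built by the parent class
        # (f"{action} {hosts}: {reason}") and disable puppet verbatim.
        puppet.disable(self._puppet_reason, verbatim_reason=True)
        # ... depool the hosts ...

    def _pool_action(self, reason, puppet):
        # ... repool the hosts ...
        puppet.enable(self._puppet_reason, verbatim_reason=True)
```

The trade-off, as noted in the conversation, is that pool and depool no longer have to be run by the same user from the same cumin host, at the cost of losing the user@host information in the puppet disable message; hence the reason string should point at the cookbook and the SAL.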