[08:11:30] I can't find a way to give cumin a list of cloud vps hostnames, at any rate this is what I was thinking to start with re: machine-id
[08:11:35] cumin -b1 -s60 'O{project:tools name:^tools-k8s-worker}' 'rm /etc/machine-id && systemd-machine-id-setup && systemctl restart systemd-networkd'
[08:13:37] I tried on tools-k8s-worker-nfs-1 and restarting systemd-networkd does not bounce the ens interface, rightfully so
[08:16:44] note that the above is not strictly idempotent but almost, in the sense that systemd-machine-id-setup reads the kvm uuid if present, it doesn't generate a new one each time for vms
[08:18:52] morning!
[08:19:09] I think that for cumin you might have to forge one of those direct queries
[08:19:36] greetings dcaro !
[08:19:54] good point I'll try direct
[08:20:35] iirc it was something like `D{,}` it might allow ranges though
[08:20:58] hah yeah totally that's it, thank you
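(For reference, a cumin direct-backend invocation along these lines takes an explicit comma-separated host list instead of an OpenStack query; the worker FQDNs below are illustrative placeholders, not the exact list that was run:)

    # D{...} is cumin's direct backend: hosts are listed explicitly rather than queried
    cumin -b1 -s60 \
      'D{tools-k8s-worker-nfs-1.tools.eqiad1.wikimedia.cloud,tools-k8s-worker-nfs-2.tools.eqiad1.wikimedia.cloud}' \
      'rm /etc/machine-id && systemd-machine-id-setup && systemctl restart systemd-networkd'
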
[08:23:52] so cumin with the actual list of hostnames, only for k8s workers for now to test the theory
[08:24:00] any objections?
[08:26:29] go for it
[08:26:51] 👍
[08:28:41] cheers!
[08:42:31] mmhh ok I guess the interface does get bounced
[08:46:48] whoops
[08:49:15] ok not sure yet why stashbot and wmopbot are not coming back
[08:51:36] there it goes
[08:52:23] yeah I bounced it 🥲
[08:53:33] now for wmopbot
[08:54:13] I'll follow what's on https://wikitech.wikimedia.org/wiki/Tool:Wmopbot
[08:55:54] actually a delete pod wmopbot-769d8b7989-6s6w2 seems to have done the trick without apply
[08:58:00] pod.yaml creates a deployment it seems, deleting the pod should be enough yes
[08:58:21] oh, wait, wrong pod.yaml
[08:58:25] it creates a pod
[08:59:17] mmhh ok then from the logs I get this
[08:59:19] !!! UNABLE to load uWSGI plugin: /usr/lib/uwsgi/plugins/python3_plugin.so: cannot open shared object file: No such file or directory !!!
[08:59:33] jumping in a meeting
[09:00:04] ah there we go
[09:01:03] hmm, not sure how that works xd
[09:03:38] there's a webservice (that's the pod you deleted), but pod.yaml creates a different pod (just named `wmopbot`), that currently does not exist, but it joined somehow
[09:03:46] maybe the ircbot one?
[09:03:54] restarted ~4m ago by itself
[09:04:34] yep, there's a deployment, ircbot, that creates a pod that is the same (same command) as the pod.yaml
[09:04:48] I think that wiki might be outdated?
[09:25:33] oh yeah totally outdated looks like
[09:29:56] well seeing the bots drop from irc as I was running cumin was enough adrenaline for this morning for sure
[09:33:48] hahahaha, don't worry, stuff happens, that means you are doing things
[09:34:39] haha that's very true
[09:56:02] does `toolforge jobs load` intentionally no longer delete existing jobs that are not present in that yaml file?
[11:32:31] andrewbogott: draft cloud-announce post https://etherpad.wikimedia.org/p/trixie-announcement
[11:33:17] I also created https://phabricator.wikimedia.org/project/view/8107/ and filed tasks for things managed by us
[11:52:08] taavi: T364204 for the jobs load
[11:52:08] T364204: toolforge jobs load flushes out all jobs - https://phabricator.wikimedia.org/T364204
[11:52:39] (fixed as a bug >1year ago)
[11:52:52] that does not seem like the same thing?
[11:53:15] previously, when you removed a job entirely from the yaml file and then ran load, it would delete the no-longer-defined job. now that does not happen
[11:56:07] the note here is then a bit misleading https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/blob/main/functional-tests/tools/jobs-api/dump-and-load.bats?ref_type=heads#L64
[11:56:41] if I'm reading that task correctly, the problem was that doing `toolforge jobs load` would delete and re-create jobs that did not change
[11:57:29] so I'm not sure where that interpretation in that test came from
[12:00:45] probably just the title, is kinda misleading, still that feature has not been there for >1year, looking at logs on the cli side
[12:12:08] ah, no, the removal of flushing all the jobs was also introduced with that task https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/33
[12:12:58] I agree that it is not solving the issue described in the task
[12:14:52] given that's been more than a year with the current behavior, I'm a bit wary that some users might rely on the new behavior :/, maybe we can restore with a flag?
[13:09:07] taavi: that announcement looks great. We could disable creation of new Bullseye VMs in the same breath; what do you think?
[13:10:07] andrewbogott: sure, can you do that? (I created T401805 to track that earlier)
[13:10:08] T401805: Disable creation of new Bullseye instances - https://phabricator.wikimedia.org/T401805
[13:14:29] taavi: ok, done -- it's now available in testlabs only (and can be enabled for other projects by special request)
[13:57:09] taavi: created T401830
[13:57:09] T401830: [loki] persist build logs for each tool on their loki namespace - https://phabricator.wikimedia.org/T401830
[13:57:17] thx
[13:58:01] taavi: I edited the announcement email a bit, bullseye-wise
[13:58:43] i can hit 'undo' a bunch if you want it back the way it was :)
[13:58:45] andrewbogott: oh no, you're just a bit too late with that
[13:58:55] oh, ok! nevermind then
[13:59:10] sorry!
[13:59:34] no worries -- if anyone is desperate for Bullseye I'm sure they'll reach out
[17:45:56] * dcaro off
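(A quick way to confirm which controller keeps recreating a pod like the wmopbot one discussed above is the usual kubectl owner lookup, run as the tool account in its own namespace; the pod name below is the one quoted in the log and the rest is a sketch, not a record of what was actually run:)

    # list what is currently running in the tool's namespace
    kubectl get pods,deployments

    # the "Controlled By:" field names the ReplicaSet/Deployment that will recreate the pod after a delete
    kubectl describe pod wmopbot-769d8b7989-6s6w2 | grep 'Controlled By'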