[07:23:35] just reran the poetry/pre-commit upgrade mr creation pipeline (had failed due to trying to push to archived repos), some new mrs popped up
[07:23:45] fyi
[07:27:31] ack
[07:28:41] is there a reason some helm values are in the `.yaml.gotmpl` format rather than just `.yaml` (builds-api, builds-builder)
[07:30:40] sometimes you need variables that are generated, then you need to use a golang template, others you can go by using just yaml. The benefit of using yaml only is that it will validate it with a linter, gotemplates will not
[10:09:19] quick review https://gitlab.wikimedia.org/repos/cloud/cicd/gitlab-ci/-/merge_requests/41
[10:18:54] dcaro: lgtm
[10:19:09] thanks!
[13:53:59] https://codesearch.wmcloud.org/_health/ wondering if this could be at all related to the move to kyverno, noticed that today code search is broken, see the status page linked
[13:59:25] apergos: Amir1 was going to investigate
[13:59:33] But don't use everywhere for now
[13:59:38] uh huh
[13:59:42] Other options all work
[14:00:09] apergos: why uh huh?
[14:01:02] it just means don't use code search for now, right? so I'm acking that.
[14:01:37] apergos: you can use anything but the 'everything' option
[14:01:50] * RhinosF1 is not sure he's ever seen uh huh used to ack
[14:01:59] well now you have
[14:02:38] I could have said 'right' or 'ack' (which I mostly don't use)
[14:03:12] elasticsearch was upgraded not long ago too, if it uses it it might be related also
[14:03:15] (fyi)
[14:03:40] well this was within the last day i.e. yesterday working, today busted, so whatever that might indicate
[14:05:44] the elasticsearch upgrade was a couple days ago at least, so probably a red herring
[14:20:32] apergos: should be back
[14:21:18] indeed, great
[14:40:14] blancadesal: the image-builder vm got out of space, did a cleanup, still rebuilding... xd
[14:40:46] dcaro: no rush, thank you for trying to fix it :))
[14:41:41] anyone doing anything to cloudcontrol1007? lots of alerts just popped up
[14:42:33] wasn't me
[14:42:33] – shaggy
[14:42:53] 🤦‍♂️
[14:43:04] (the song just started playing in my head)
[14:43:20] you're welcome
[14:45:11] I see a lot of nbd-related errors happening yesterday afternoon
[14:45:13] https://www.irccloud.com/pastebin/VDdSAzZR/
[14:46:00] but it was not until today that it started failing the journal service
[14:46:02] https://www.irccloud.com/pastebin/g9K2Ubuy/
[14:46:10] (and of course, there's no logs xd)
[14:46:19] that's dmesg (kernel logs)
[14:47:46] I'm tempted to restart the host
[14:50:03] I'm verifying the journal files just in case
[14:50:15] root@cloudcontrol1007:~# for file in /var/log/journal/e78b2323d5b2433ab9309b1675674fc1/*journal; do journalctl --verify --file $file; done
[14:51:26] dcaro: I'd say +1 to the reboot
[14:51:38] I'll do if nothing is found
[14:51:42] (goes rather quick
[14:51:43] )
[14:53:05] found nothing :/
[15:01:14] host is back up, last boot log just shows
[15:01:16] Jul 04 14:33:00 cloudcontrol1007 wmf-auto-restart[3427130]: INFO: 2024-07-04 14:33:00,983 : Detected necessary restart for service systemd-journald (2567580)
[15:01:24] and Jul 04 14:33:00 cloudcontrol1007 systemd-journald[2567580]: Journal stopped
[15:01:27] that's it xd
[15:02:06] 👍
[15:37:00] wmcs-create-image does nbd things and isn't great about cleanup in case of failure. I ran it half a dozen times yesterday so that's likely related
[15:37:11] * andrewbogott goes back to being OoO
[15:41:32] 👍
[16:02:06] * arturo offline
[17:14:48] * dcaro off
[17:14:51] cya on monday