[13:35:14] heads up I'm gonna reimage clouddb1019 to bookworm T365424
[13:35:14] T365424: Upgrade clouddb* hosts to Bookworm - https://phabricator.wikimedia.org/T365424
[13:35:33] btullis is pairing with me just to have an extra pair of eyes
[13:35:50] the host has been depooled since Friday, I will do a clean mariadb shutdown before the reimage
[15:00:34] reimage of clouddb1019 was successful, I restarted replication
[15:00:53] puppet is failing to start wmf-pt-kill.service though
[15:02:00] wmf-pt-kill@s6.service: Failed to determine user credentials: No such process
[15:02:20] cc btullis
[15:03:05] the file /usr/bin/wmf-pt-kill does not exist
[15:03:36] maybe the package is not available for bookworm
[15:04:09] Oh, I have not seen that before. Is puppet clean?
[15:09:29] nope because it checks that wmf-pt-kill is running
[15:09:37] I'm copying the package to bookworm in apt1002
[15:10:15] "reprepro copy bookworm-wikimedia bullseye-wikimedia wmt-pt-kill"
[15:11:00] ok now it's there, re-running puppet
[15:11:51] "Unable to locate package wmf-pt-kill"
[15:13:10] ok "apt update" fixed it
[15:13:14] Dhinus: your irc message says wmT-pt-kill
[15:13:32] RhinosF1: yep, that of course failed, and I had to fix the typo :)
[15:13:52] I hoped no one would notice here but you did :P
[15:14:18] puppet is now clean on clouddb1019
[15:14:36] and replication is in-sync
[15:14:47] I will repool clouddb1019
[15:20:42] clouddb1019 repooled, let's see if the s4 replag spikes continue to appear after the reimage (T367778)
[15:20:42] T367778: [wikireplicas] frequent replag spikes in clouddb hosts - https://phabricator.wikimedia.org/T367778
[15:24:18] I see we have a different replag issue on a sanitarium (db1154). anybody knows what's the cause?
[15:26:46] looks like it's running a big 'alter table'
[15:27:07] it will probably complete at some point, currently lag is 2 days 5 hours
[15:42:52] Great work, thanks dhinus.
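
The wmf-pt-kill@s6.service error in the log ("Failed to determine user credentials: No such process") is typically what systemd reports when the user named in the unit's User= setting does not exist, which fits the package never having been installed on the freshly reimaged host. A minimal sketch of the checks implied by the log (host and unit names taken from the log; the exact commands used in the session aren't shown):

    # on clouddb1019: inspect the failing templated unit
    systemctl status wmf-pt-kill@s6.service
    # the binary the unit is supposed to run is missing
    ls -l /usr/bin/wmf-pt-kill
    # check whether any candidate version is visible in the configured repos
    apt policy wmf-pt-kill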
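
For reference, the package fix boils down to the two steps below, with the typo corrected. This is a sketch, not the exact session; suite names are as quoted in the log.

    # on apt1002: copy the existing bullseye build into the bookworm suite
    reprepro copy bookworm-wikimedia bullseye-wikimedia wmf-pt-kill

    # on clouddb1019: refresh the package index so apt can see the copied
    # package, then re-run the puppet agent
    apt update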
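
A sketch of how the replication state mentioned in the log ("replication is in-sync" on clouddb1019, and the multi-day lag on db1154) can be checked on a MariaDB replica; the actual monitoring used here isn't shown in the log.

    # confirm the replication threads are running and the replica has caught up
    sudo mysql -e 'SHOW SLAVE STATUS\G' | grep -E 'Slave_(IO|SQL)_Running|Seconds_Behind_Master'

    # on a lagging host like db1154, look for a long-running statement
    # (in this case the big ALTER TABLE) holding up the SQL thread
    sudo mysql -e 'SHOW PROCESSLIST' | grep -i 'alter table'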