[02:49:25] !log extdist setting up extdist-06 with bullseye/cinder/packaged composer (T293055)
[02:49:29] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Extdist/SAL
[02:49:30] T293055: Switch extdist to Bullseye and composer Debian package - https://phabricator.wikimedia.org/T293055
[10:49:27] !log admin deleting dbbackups-dashboard project T296992
[10:49:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[10:49:30] T296992: Remove dbbackups-dashboard project and shutdown its instances - https://phabricator.wikimedia.org/T296992
[11:08:12] hello, i created wmcz-stats-test01 yesterday (for...testing). I'm able to SSH with `ssh wmcz-stats-test01.wmcz-stats.eqiad1.wikimedia.cloud`, but apparently the installing logic failed to run puppet for the first time, which means I'm not able to sudo at least.
[11:09:20] urbanecm: let me take a look
[11:09:31] hmm, it does not accept my root key either
[11:10:14] neither mine
[11:10:22] i can ssh, but that's all :D
[11:10:41] urbanecm: so if you don't have valuable data inside, I suggest you drop the VM and create a new one
[11:10:50] if this happens again, we can take a deeper look
[11:11:47] this is what I get when running puppet in the serial console: Error: Could not request certificate: The certificate retrieved from the master does not match the agent's private key. Did you forget to run as root?
[11:14:00] majavah: arturo: i just created a new fresh testing instance in deployment-prep (`deployment-prep-urbanecm-test.deployment-prep.eqiad1.wikimedia.cloud`), and the same thing happened
[11:14:09] mmm
[11:15:19] ^ I can sudo in that one
[11:16:11] interesting. sudo works there. I'm recreating the wmcz-stats one too, to see how it goes.
[11:16:53] there's errors in puppet though 'Could not send report: SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain): [self signed certificate in certificate chain for /CN=Puppet CA: deployment-puppetmaste'
[11:16:57] r03.deployment-prep.eqiad.wmflabs]
[11:17:10] deployment-prep has a local puppetmaster that causes that
[11:17:41] so that's expected?
[11:18:38] yes, on deployment-prep you need to remove /var/lib/puppet/ssl for it to accept the local puppetmasters ca
[11:18:55] ack
[11:19:07] okay. created `wmcz-stats-test02.wmcz-stats.eqiad1.wikimedia.cloud` in wmcz-stats, where i originally wanted to spawn something, and same thing happened
[11:19:19] I'm unable to sudo, and puppet didn't run
[11:19:53] I can sudo xd
[11:20:09] puppet is failing with the same "certificate does not match"
[11:21:02] it's pulling sudoers from the admin project because puppet didn't run and change yet
[11:22:08] the puppetmaster resolves to cloud-puppetmaster-03.cloudinfra.eqiad1.wikimedia.cloud
[11:25:58] urbanecm: have you used that VM name before?
[11:26:04] (maybe is a stuck cert in the puppetmaster)
[11:26:07] not sure. Maybe.
[11:26:57] the master says 'Not Before: Jun 14 17:17:56 2021 GMT' for that specific cert, but I don't see any other with a similar name, can you try with test03?
[11:27:12] (or test01 should work too)
[11:27:23] the VM itself has a not before like that
[11:27:34] interesting
[11:27:47] that might be when the image was built?
[11:28:03] maybe
[11:28:31] the puppetmaster shows june 15 as the file modification date, the vm shows it was modified today
[11:31:46] I'm not sure I follow. Should I try with test03?
[11:32:04] yes please, though it might not work might give us an extra hint
[11:35:40] Will do
[11:36:51] as a data point, I just created test-create-20211203-1.testlabs.eqiad1.wikimedia.cloud and it works
[11:37:48] trying wmcz-stats-test03
[11:40:11] the before looks different, Dec 2
[11:41:49] wmcz-stats-test03.wmcz-stats.eqiad1.wikimedia.cloud -- finally
[11:41:56] does that mean i need to keep track of used VM names?
[11:42:40] we should have a nova hook to clean old names up
[11:43:13] nope, there's some issue there, might be a cert that was not cleaned up properly, can you open a task and leave the non-working VM for a while for debugging?
[11:43:16] maybe it's broken somehow?
[11:43:33] dcaro: sure, will do soon
[11:43:33] yep, or failed for that host whenever was done before
[11:44:25] might have been a temporary failure leaving leftover certs, maybe would be good to add a periodical cleanup script or something to deal with those
[11:44:35] (after ensuring that the error is not there anymore)
[13:58:34] Hi WMCS SREs, it's that time of the year, can you take a look at unused puppet modules listed here: https://phabricator.wikimedia.org/T272559 ?
[13:59:05] They are from puppetdb so hopefully less false positives
[13:59:38] (and double checked against cloud VMs https://openstack-browser.toolforge.org)
[15:10:37] Amir1: to the "Not checked yet" list?
[15:10:48] yup
[15:11:02] thanks
[15:13:06] I'll add https://phabricator.wikimedia.org/T296533 to the "to check" list
[15:13:22] created T297014
[15:13:23] T297014: [puppet] review unused modules - https://phabricator.wikimedia.org/T297014
[15:13:44] might get to it next week if nobody gets to it sooner
[15:15:11] Thanks
[15:15:12] majavah: those are the hiera files that don't match the role/module name?
[15:15:48] yes, hiera files for roles that do not exist in modules/role/manifests
[15:16:02] (we have been using hiera keys that did not match the role path pattern :/, so we might have some of those)
[15:16:06] but can be checked
[15:16:18] +1 for adding them there
[18:40:55] * bd808 ooo lunch
[18:56:16] !log admin maintain-views and maintain-meta-p on clouddb1013-1020
[18:56:18] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
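Note on the 15:15 exchange: the check described there (hiera files for roles that do not exist in modules/role/manifests) could be sketched roughly as below. This is only an illustration, not the script used for T297014 or T296533. The `hieradata/role/<scope>/...` layout, the `.yaml` and `.pp` suffixes, and all function names are assumptions made for the example; only the `modules/role/manifests` path comes from the log.

```python
from pathlib import PurePosixPath


def role_from_manifest(path, root="modules/role/manifests"):
    """Map modules/role/manifests/foo/bar.pp to the relative role name 'foo::bar'."""
    rel = PurePosixPath(path).relative_to(root)
    return "::".join(rel.with_suffix("").parts)


def role_from_hiera(path, root="hieradata/role"):
    """Map hieradata/role/<scope>/foo/bar.yaml to 'foo::bar'.

    Assumption: the first directory under the root is a scope such as
    'common' or a site name, and is not part of the role name.
    """
    rel = PurePosixPath(path).relative_to(root)
    return "::".join(rel.with_suffix("").parts[1:])


def orphaned_hiera_files(hiera_paths, manifest_paths):
    """Return the hiera files whose role has no matching manifest, sorted."""
    known_roles = {role_from_manifest(p) for p in manifest_paths}
    return sorted(p for p in hiera_paths if role_from_hiera(p) not in known_roles)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    hiera = [
        "hieradata/role/common/foo/bar.yaml",
        "hieradata/role/common/old/gone.yaml",
    ]
    manifests = ["modules/role/manifests/foo/bar.pp"]
    for path in orphaned_hiera_files(hiera, manifests):
        print(path)
```

As noted at 15:16:02, hiera keys that never followed the role path pattern would show up here as false positives, so the output is a list of candidates to review by hand rather than a list of files that are safe to delete.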