[05:54:05] tools-static.wmflabs.org appears unresponsive. anyone available to give it a nudge?
[06:01:24] I agree
[06:05:05] proxy5 says > Error
[06:05:06] This web service cannot be reached. Please contact a maintainer of this project.
[06:09:24] yeah seems like it's been down for around 8 hours now, according to both my own checks and a firing alert in #wikimedia-cloud-feed at 22:51:50 UTC
[07:35:16] i'm not supposed to be here today but I'll have a look
[07:35:59] lovely, nginx processes in D
[07:37:42] chlod: it's back, sorry about that
[07:37:52] fixed, it has been indeed
[07:37:54] thanks, taavi!
[07:38:22] !log tools reboot tools-static-15 to unstuck NFS things
[07:38:25] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:13:07] o/, question: I'm unable to log in to newly created instances, it's not accepting my ssh key. Existing instances work fine. I have the correct key in idm.w.o . The instance in question is rn-hcptchprxy-puppet-01.appservers.eqiad1.wikimedia.cloud
[14:13:33] I've noticed some failures in the cloud-init log, but I'm not sure it's related.
[14:13:45] Is this a known issue? Or am I doing something wrong? How do I debug this?
[14:26:25] Raine: /me not really with a lot of time, but just had a look, I was able to console to it, it seems that VM does not even have puppet
[14:27:01] how would that happen?
[14:27:30] so maybe the cloud-init errors are related, it might have failed to set it up
[14:27:33] (I created it via pontoon)
[14:27:35] yeah
[14:27:41] but I tried twice
[14:30:06] is it possible the image is bad? or is there a network issue? I'm not sure where puppet is supposed to come from
[14:30:50] hmm, I think andrewbogot.t might have been working on updating the images, but have not yet finished
[14:32:08] looking, the image you used is debian 12 bookworm
[14:32:16] (just saying, so I don't forget it)
[14:32:30] thank you <3
[14:34:19] (where are the cloud image sources anyway? sorry, I'm clueless about cloud '^^)
[14:34:44] that one (debian 12) was updated today, so probably that's it xd
[14:35:15] andrewbogott: you around? ^
[14:35:39] lucky me :D
[14:35:50] dcaro: yes, I've been fighting that since yesterday morning. Fix is https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/commit/0cc23ba5119bc762806061f9ebdc3fde
[14:35:56] Raine: the images are manually generated by us, from the upstream debian cloud ones -> https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/VM_images
[14:36:08] oh, okay
[14:36:10] But Raine it'll take an hour or so of rebuilds after that before you can build again. Sorry for the breakage!
[14:36:37] thanks andrewbogott ! I'll fade away on my PTO again then :)
[14:36:45] thank you both <3
[14:36:48] * andrewbogott waves
[14:37:20] sounds like I should take a 1h break then :D
[14:40:30] andrewbogott: just so I understand correctly, are you running a build that will finish later today? or do I need to do something and/or wait longer?
[14:41:05] I will replace the broken base image with a fixed one, and then you will need to recreate your VMs.
[14:41:20] The new base image will have the same name as the old one so your workflow will be the same.
[14:42:10] Raine: were you building for bookworm or bullseye?
[14:42:20] bookworm
[14:42:49] ok, I'll start with that then
[14:42:56] thank you <3 much appreciated
[15:18:43] Raine: you should be able to rebuild bookworm VMs now. Please delete the old ones, and let me know how it goes with the new ones.
[15:18:51] it'll be a bit longer for bullseye
[15:27:13] thank you andrewbogott, on it <3
[15:33:25] for any Toolforge admins that are around: T395693
[15:33:25] T395693: purge-dup-args-29141613-j5rrs and purge-script-errors-ns0-29142064-gz5lz stuck in Terminating state - https://phabricator.wikimedia.org/T395693
[15:36:14] andrewbogott: works now \o/
[15:36:59] nice, thanks for your patience
[15:37:24] JJMC89: I'm draining a couple of worker nodes right now, that might kick it.
[15:41:14] JJMC89: better now?
[15:41:36] no, still stuck
[15:41:49] hm
[15:49:42] andrewbogott: all good now
[15:50:02] ok, great! Sorry for the mid attention, doing other things :)
[16:09:57] /a/ac
[22:55:32] [[Help:Exposing IPv6 services]] notes that HTTP(S) services should not be exposed directly. But can I do so if I want to expose something that requires TLS authentication? I am going to merge two etcd clusters, of which one is running on Cloud VPS and another is running on our own servers. Or should we set up a VPN?
[23:17:38] xtex: The main thing we would like to avoid is folks using the public IPv6 to set up HTTP access for humans. It might be worthwhile for you to open a Phabricator task to discuss your use case with the WMCS team.