[13:03:29] stw: I've restored the cursed volume to a new volume named 'app-www-1'. Can you try attaching that and confirm that attachment works and it contains the data you need?
[13:04:03] sure, will take a look, thanks
[13:05:09] I would like to keep the old/broken volume around for a bit for further investigation. It might be impossible to delete anyway :(
[13:07:37] huh, interesting. Horizon thinks it's /dev/sdc, lsblk on the instance thinks it's /dev/sdb
[13:07:57] Looks like it's mounted successfully though, and enough of the data I need is there :)
[13:08:16] I wonder if Horizon secretly thinks there's already a mount on sdb, that would fit with the weird stuck state...
[13:08:44] OK, so you're able to get your service back up? That would lower my blood pressure if so
[13:08:59] yeah, it's back already, ty
[13:09:10] Great! I opened T397517 for tracking but you don't need to do anything there.
[13:09:10] T397517: Un-attachable volume in account-creation-assistance, 'app-www' - https://phabricator.wikimedia.org/T397517
[13:09:14] I'm gonna work on it separately though to try and remove that instance's dependence on a volume
[13:10:10] sounds good. You're not doing anything weird currently, though -- your setup is 100% supported and I don't know why it broke :(
[13:10:40] We've done some odd stuff with snapshots in the past based on a misunderstanding of how they worked
[13:11:26] (I thought snapshots were completely separate from a disk like they are in AWS, but they're actually chained/overlaid on the base image)
[13:12:30] yeah, it's pretty unintuitive: once you snapshot, what was formerly a simple volume starts to be an incremental diff from the snap. It's efficient storage-wise but I don't love that that dependency hangs around forever.
[13:12:43] Still not sure I 100% understand what it does under the hood
[13:13:06] I assume app-www-1 is a brand new volume not linked to any snapshots?
[13:16:04] that should be true unless something truly strange is happening
[13:17:37] I guess, a second truly strange thing :(
[13:18:48] I'm semi-tempted to provision a new bullseye instance and move stuff over anyway for the short term until I can finish the PHP upgrade and enable a move to bookworm/trixie.
[20:34:09] !log anticomposite@tools-bastion-13 tools.stewardbots stewardbots/StewardBot/manage.sh restart # Pinged out
[20:34:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[21:32:42] for the beta cluster, I notice that some wikis use SUL SSO after signup from simple.wikipedia.beta.wmcloud.org, but I forget, does en.wikipedia.beta.wmcloud.org require something special for SUL SSO to work? i'm wondering about signing in there and also granting rights with createAndPromote.php (upon T397547 ; BTW if anyone has rights to grant that deployment-prep `member` access I'd of course appreciate help there - not emergency)
[21:32:43] T397547: deployment-prep member access for dr0ptp4kt - https://phabricator.wikimedia.org/T397547
[21:33:45] (i'm sure i'm using the wrong search terms to find the info about this, as i coulda swore i saw the info somewhere!)
[21:42:37] (and, i see bryan granted the `member` access, missed the notification during dnd on machine - thanks!)
[22:39:03] andrewbogott: I've mostly moved everything over to a new instance (accounts-appserver7) so I'm less fussed about appserver6's uptime if you want to do some messing around with it. I do want to keep it around for another week or two though (just in case there's something I've missed)
[22:39:56] I did have problems updating the web proxies via horizon, but I managed to push the changes I needed through via terraform anyway.
[23:09:20] One of these days I should replace the logo of codesearch with a beer-battered fish ("codsearch").
[23:10:32] https://duckduckgo.com/?q=cod+search+fishing&t=ffab&iar=images