[06:07:02] Hi, I'm having trouble SSHing into Toolforge. My username is marchellangelina. After entering my passphrase, the connection just hangs forever and never opens a shell. I suspect my home directory might have hit disk quota during an earlier session where I accidentally ran npm install on the login node. Can someone help me check or reset my quota?
[06:07:03] Thank you.
[06:22:14] ponyooo_: I see login sessions on the bastion for that user successfully opened, then one session timed out and another one got a disconnect from the client around 5:59 UTC. Have you checked for network issues on your side?
[06:30:03] could add some -v to your ssh
[06:48:43] Thank you for checking! I've tried both my regular network and mobile hotspot but the issue persists on both. I apologize for the trouble. I accidentally ran npm install and nvm install on the login node during an earlier session, which I now know I shouldn't have done. Could that have left broken startup files or corrupted NVM config in .bashrc
[06:48:44] that's causing the hang?
[06:48:44] I also tried connecting with -v and the last output I see is "ENABLE_VIRTUAL_TERMINAL_PROCESSING is supported. Console supports the ansi parsing." After that it just hangs. Authentication succeeds fine, so I think the hang is happening during shell startup, possibly due to broken NVM config in .bashrc from the earlier nvm install attempt.
[06:51:16] ponyooo_: checking
[06:51:31] force a commandline as ssh parameter like a bash invocation set to ignore dotfiles?
[06:52:32] I see an sshd-session stuck at ruby -ryaml -e puts YAML.load(STDIN.read)["configuration_version"]
[06:52:36] but then also have to tell ssh to allocate a tty
[06:52:38] but that's part of /etc/update-motd.d/97-last-puppet-run and that works fine
[06:54:04] ponyooo_: I've copied .bashrc to .bashrc.safe and removed the last 3 lines related to NPM from it, can you retry?
[06:56:13] One more thing, I also manually added the NVM path to my .bashrc during that session, which might be the specific line causing the hang.
[07:01:44] that's what I've removed
[07:08:53] Thank you so much for your help, volans and jeremyb! I tried to log in again, and it is still hanging even after trying ssh -t marchellangelina@login.toolforge.org /bin/bash --norc --noprofile. The .bashrc fix didn't seem to resolve it either. Could the stuck ruby process you spotted earlier still be blocking the session?
[07:23:43] ponyooo_: those processes are not there anymore, I see an npm install still running for tools.kiwicollabs-staging, can I kill it?
[07:25:56] Yes, please feel free to kill that nvm/npm install process. Since I am planning to start fresh, that process is not needed anymore. Thank you for your help in cleaning it up!
[07:26:55] try again to login
[07:34:40] I was able to log in successfully! Thank you so much volans and jeremyb, I really appreciate the patience and help!
[07:35:29] no prob, glad we solved
[13:14:12] I'm working on changes to cirrus (OpenSearch) in deployment-prep (ref T425585), what is the best way to inform stakeholders? I was thinking of hitting the cloud-announce mailing list
[13:14:12] T425585: Write lightweight OCI-image-based Puppet plans for beta cluster - https://phabricator.wikimedia.org/T425585
[13:21:27] that might be overkill as is for all cloud users that don't have access to deployment-prep
[13:21:54] there is a #beta-cluster on slack
[13:22:00] not sure how much used
[13:23:37] there is also https://lists.wikimedia.org/hyperkitty/list/betacluster-alerts@lists.wikimedia.org/ not sure how many members though
[13:23:42] inflatador: ^^6
[13:24:27] volans ACK, thanks for the suggestions
[13:24:50] I don't know if there is any better/canonical venue
[13:29:32] !log tools.dimastbkbot stopped the tool again, as it's causing issues on toolsdb T428139
[13:29:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.dimastbkbot/SAL
[13:29:36] T428139: [toolsdb] Transaction History Length growing too much - https://phabricator.wikimedia.org/T428139
[13:29:41] inflatador: ping Growth maybe if you plan to bring the cluster down for some long period? for context they filed T427196 which I believe suggests they're using it for testing
[13:29:42] T427196: [beta-cluster] Fetching task suggestions failed: cirrussearch-backend-error - https://phabricator.wikimedia.org/T427196
[13:36:43] dcausse ACK, thanks. I am replacing the cluster with a standalone VM running our docker image, ref above ticket. Anyone else besides Growth I should talk to before changing this?
[13:38:23] inflatador: nop, I'm not really aware of anyone else using it regularly
[14:00:42] !log admin clearing cloudback-original snapshots of cinder volumes created in 2026 to allow the backups to not fail for T428995
[14:00:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[14:00:50] T428995: wmcs-backup fails to create differential when the data changed too much and fails the whole run - https://phabricator.wikimedia.org/T428995
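
Note on the SSH debugging steps discussed between 06:30 and 07:08: the suggestions were to add -v, to pass a bash invocation that ignores dotfiles as the remote command, and to have ssh allocate a tty. A minimal sketch of how those three pieces might be combined into one command line is below. The helper name build_debug_ssh_command and its parameters are hypothetical and only assemble the arguments; nothing is executed, and, as the log shows, skipping dotfiles did not clear the hang in this case.

```python
import shlex


def build_debug_ssh_command(user, host, verbosity=1, skip_dotfiles=True):
    """Assemble an ssh argv for diagnosing a login that hangs after auth.

    Hypothetical helper for illustration; it only builds the argument list.
    """
    argv = ["ssh"]
    if verbosity > 0:
        # ssh accepts -v up to three times for increasing debug output.
        argv.append("-" + "v" * min(verbosity, 3))
    if skip_dotfiles:
        # A remote command is given, so ask ssh to allocate a tty with -t.
        argv.append("-t")
    argv.append(f"{user}@{host}")
    if skip_dotfiles:
        # Start bash without reading .bashrc or profile files.
        argv += ["/bin/bash", "--norc", "--noprofile"]
    return argv


if __name__ == "__main__":
    cmd = build_debug_ssh_command("marchellangelina", "login.toolforge.org")
    print(shlex.join(cmd))
```

Run directly, this prints the same invocation quoted at 07:08:53 with -v added in front of -t.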