[01:52:12] I've tried using a vpn, my cell connection, connecting from a vps that I rent. And every time I keep getting the same debug3 error and it eventually times out when trying to ssh to login.toolforge.org. when passing the IPQoS=none opt it just times out without that error
[01:56:53] bd808: should I file a ticket about this? I can run the various commands at that link you sent
[02:00:00] blu3r4d0n: yeah, go ahead and make a ticket. Hopefully one of our SREs can help figure out what’s going wrong for you.
[02:00:18] thanks
[02:00:56] it says something about making it a security ticket so the public details aren't visible, should I do that if I don't want all the log info out in the open
[02:04:00] I can't tag as toolforge on a security ticket fwiw
[02:16:46] You can add the Toolforge tag after you make the task. The security ticket part is just about protecting your ip address information. We consider ip addresses to be private information.
[02:19:28] understood
[02:31:31] blu3r4d0n just for fun, try connecting to dev.toolforge.org instead of login.toolforge.org
[02:31:56] I would be astounded if you got any different result, but it's an easy experiment to try.
[02:33:02] roy649: same error as login.toolforge.org
[02:33:24] well, OK, at least that's a sane result.
[02:33:26] in the process of writing the ticket
[02:33:42] can't you just edit out your IP address and make it a public ticket?
[02:33:46] Also tried specifying -F /dev/null for sanity
[02:34:03] I can do that
[02:34:55] If you don't mind, let me know the ticket number when you create it. I don't have much confidence that I can help, but I'll be happy to take a look and see if I can spot anything.
[02:35:03] SSH problems can be a beast to debug.
[02:35:06] will do
[02:39:32] maybe this will help: https://serverfault.com/questions/1101269/cant-connect-ssh-via-wireless-interface-but-t-works-using-eth0
[02:39:47] it's a report of what sounds like the same problem
[02:40:03] yeah, tried that with a little box I have plugged directly into my router and it didn't :/
[02:40:43] can you ssh to other places?
[02:40:49] yep
[02:41:19] then I'd try ssh to someplace you can get to, then from there, ssh to toolforge and see what happens
[02:41:40] that was one of the first things I tried
[02:41:48] ah
[02:42:01] and what happened?
[02:42:08] same error
[02:42:18] Tried doing it on my cell network too
[02:42:20] fascinating
[02:43:27] what happens if you do:
[02:43:29] ssh -A -t bullseye.spi-tools.eqiad1.wikimedia.cloud
[02:43:36] that's a VPS instance I maintain
[02:43:58] you don't have credentials to login, but you should at least be able to get to the point where authentication fails
[02:44:56] after hitting enter on a blank password it fails as expected
[02:45:19] but you're getting to the point where you get a login prompt?
[02:45:24] yep
[02:45:28] mind blown
[02:45:41] going to post the ticket, give me a minute to read over it
[02:52:48] roy649 https://phabricator.wikimedia.org/T357493
[03:17:56] I don't think -F /dev/null does what you think it does
[03:18:01] try "-F none"
[03:18:49] I don't think it changes the output fwiw
[03:18:59] This is also weird. If I do:
[03:19:01] ssh -F none -i /dev/null login.toolforge.org
[03:19:13] I end up getting logged in!
[03:19:33] It's obviously finding my credentials even though I said "-i /dev/null"
[03:19:42] whoah
[03:19:44] that worked
[03:19:46] what
[03:19:58] okay my brain is breaking a bit
[03:20:14] This is your brain. This is your brain on SSH.
[03:20:24] I did have the key specified for the domain in my .ssh/config
[03:20:47] Try moving your entire .ssh directory to some other place
[03:21:09] and see what happens
[03:21:32] and specify the key file?
[03:21:46] for now, try it with no key file and see what happens
[03:21:52] baby steps
[03:22:16] what did I miss when I ran out to the grocery store :-P
[03:22:41] multiple brains getting fried, nothing out of the ordinary when debugging SSH problems.
[03:23:03] roy649 so right now it's sitting on the debug3: set_sock_tos: set socket 3 IP_TOS 0x48 which is what happens before it times out typically
[03:23:24] when you ran what command?
[03:23:40] ssh philipnelson99@login.toolforge.com
[03:23:56] with your .ssh dir moved out of the way?
[03:24:01] yup
[03:24:14] OK, let me see if I can repro that here.
[03:24:20] okay
[03:24:31] really appreciate the help
[03:24:55] Today's been a shitty day. Compared to some other crap, this is fun
[03:26:01] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Quickstart says login.toolforge.org
[03:26:04] where did you get .com from? (re @wmtelegram_bot: ssh philipnelson99@login.toolforge.com)
[03:26:21] sorry I meant org
[03:26:21] ugh, I didn't even notice that
[03:26:25] that was just brain
[03:26:32] wait
[03:26:32] copy-paste is your friend
[03:26:58] okay
[03:27:04] so that works when you use the right tld
[03:27:14] with .ssh moved
[03:30:39] try this:
[03:30:45] env | grep SSH
[03:31:34] I'm guessing you've got SSH_AUTH_SOCK?
[03:31:43] yep
[03:31:59] ok, I've got one too, and that's what's been making strange things happen.
[03:32:05] ah
[03:32:12] You're running an ssh-agent, which caches keys
[03:32:28] so even if you make your key file inaccessible, it's finding your old key
[03:32:35] from the agent's cache.
[03:32:36] ohhh
[03:32:37] I see
[03:33:13] welp I guess that means I need to debug my .ssh/config
[03:33:38] so try:
[03:33:39] SSH_AUTH_SOCK="" ssh -F none login.toolforge.org
[03:33:47] you should get:
[03:34:05] USER@login.toolforge.org: Permission denied (publickey,hostbased).
[03:34:06] ...@login.toolforge.org: Permission denied (publickey,hostbased).
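A minimal sketch of the two lessons so far, assuming a POSIX shell: check the hostname against a short allowlist before connecting (which would have caught the .com typo), then print the clean-room command suggested above, with no config file and an empty SSH_AUTH_SOCK. The function names is_known_bastion and clean_ssh_command are hypothetical, and the allowlist holds only hostnames mentioned in this log.

```shell
#!/bin/sh
# Hypothetical guard: succeed only for hostnames on a short allowlist.
# The entries are hostnames that appear in this log; extend as needed.
is_known_bastion() {
    case "$1" in
        login.toolforge.org|bastion.wmcloud.org) return 0 ;;
        *) return 1 ;;
    esac
}

# Print (do not run) a clean-room ssh command for an allowlisted host:
# -F none skips ~/.ssh/config, and an empty SSH_AUTH_SOCK hides the agent.
clean_ssh_command() {
    if ! is_known_bastion "$1"; then
        echo "refusing: '$1' is not a known bastion (typo?)" >&2
        return 1
    fi
    printf 'SSH_AUTH_SOCK="" ssh -F none %s\n' "$1"
}
```

Calling clean_ssh_command login.toolforge.com prints a refusal to stderr and returns non-zero; with login.toolforge.org it prints the command to copy and run.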
[03:34:10] yup
[03:34:20] OK, at least we've worked our way back to something that makes sense.
[03:34:29] yes, thought I was going crazy
[03:34:48] you can also do something like `ssh-add -l` IIRC. for more data.
[03:35:14] oh that is useful
[03:35:15] thanks
[03:36:15] so, at this point, what I'd do is put your .ssh dir back where it's supposed to be and one step at a time try working through your config and key files, putting SSH_AUTH_SOCK="" in front of each connection attempt.
[03:36:29] hopefully that'll get you making progress.
[03:36:37] yep, that's what I'm about to do
[03:36:40] I really appreciate the help
[03:37:08] glad I could be useful
[03:37:48] will close ticket
[03:38:34] I would also try testing against bastion.wmcloud.org
[03:38:34] you may both want to remove .com from your local `known_hosts` so if you ever did connect there again in the future you would get a warning about the unknown key.
[03:38:41] gotcha
[03:38:52] That's another bastion host, but it's running a different version of debian than login.toolforge.org
[03:39:15] I really think I might have just been using .com this whole time
[03:39:20] and I feel really stupid now
[03:39:27] BTDT.
[03:39:49] start from scratch, double check everything. :-)
[03:40:00] jeremy gets the gold star for spotting that.
[03:40:22] is it a barnstar? :-)
[03:40:34] yes
[03:40:38] I walked in at the right time
[03:40:49] The problem with debugging SSH problems is there's so many moving parts, you really need to make sure you're starting from a known point and then work your way forward one baby step at a time.
[03:41:15] indeed, I'm marking that as resolved if no one objects
[03:42:47] or invalid? idk if it makes much difference
[03:43:44] yeah that makes more sense
[03:44:04] well now I know I'm not crazy just not observant :)
[03:46:13] use bookmarks, don't type URLs manually. use keys not passwords.
(and disable interactive logins)
[03:47:33] See https://en.wikipedia.org/wiki/Confirmation_bias
[03:48:30] Oh I typically do all of those things, I could've sworn I copy pasted the command even
[03:52:33] well but wikitech is a wiki so I'm also not saying just trust the webpage. but you can dig more and e.g. check page history. which is more complicated for the documentation pages but the history of the dedicated fingerprint pages should be cleaner. e.g. [[wikitech:Help:SSH Fingerprints/login.toolforge.org]] (re @wmtelegram_bot: Oh I typically do all of those things, I
[03:52:34] could've sworn I copy pasted the command even)
[03:53:23] huh I couldn't remember if there was a bot in here. https://wikitech.wikimedia.org/wiki/Help:SSH_Fingerprints/login.toolforge.org
[03:55:39] Oh, I didn't know about the fingerprint listing.
[03:55:46] That's actually the right way to do it.
[03:56:09] I landed on that page earlier today actually
[03:56:18] What most people (including me) do is the first time I connect to a new machine, I just blindly tell SSH to cache whatever the hell host key it got.
[03:56:35] yeah
[03:56:45] TOFU
[03:57:03] https://en.wikipedia.org/wiki/Trust_on_first_use
[03:58:14] good acronym
[04:01:12] I bet if I could manage to trick you into loading some malicious CSS, I could make those fingerprint pages display anything I wanted.
[04:01:22] usually I do actually verify fingerprints. I think maybe there was a rare occasion where I didn't have anything to check against but I was able to at least check fingerprint served live by sshd from a few different Internet connections. not perfect but a little better than nothing.
[04:01:53] how would you do that? (re @wmtelegram_bot: I bet if I could manage to trick you into loading some malicious CSS, I could make those fingerprint pages display anyt...)
[04:02:14] could avoid CSS with e.g.
?action=raw
[04:02:33] or lynx
[04:02:42] wget
[04:02:45] curl
[04:03:26] I was saying how would you trick me
[04:04:04] I'm pretty sure with CSS you can change the displayed content
[04:04:34] ok but how would you get me to fetch your styles?
[04:04:54] https://www.w3schools.com/cssref/pr_gen_content.php
[04:05:26] that's the "trick you into loading" part.
[04:05:38] Some sort of social engineering
[04:07:08] I cringe when I think of how much crap I have in my common.js and/or common.css pages on enwiki that I don't really understand.
[04:09:16] so lynx or ?action=raw or even load it in a new private session in your regular browser (new cookie jar so you're not logged in so your personal scripts and styles aren't loaded)
[04:09:55] anyway it is worth thinking about other ways to serve them to users that aren't going to be as careful as that.
[04:11:12] https://wikipedia.org/cloud/ssh
[04:11:13] (I just made that up)
[04:12:10] Most security stuff these days is theoretically secure, but in practice, not at all
[04:12:24] Because 99.9999% of people don't understand how it works.
[04:12:42] and... we could sign the keys with monkeysphere. but idk how much people would actually use that. (or are they already signed?)
[04:13:11] Case in point: not long ago, I was in the supermarket and put my credit card into the chip-reader thingie.
[04:13:24] In theory, it's a very secure system.
[04:13:49] https://github.com/dkg/monkeysphere
[04:13:49] as I understand dkg was looking for a new maintainer.
[04:14:13] Except that I was busy packing my bags and the cashier got tired of waiting for me to confirm the transaction so she just reached over and hit the big green button for me.
[04:14:34] I tried to explain to her that she's not supposed to do that.
[04:14:58] That what she had done was basically reach into my wallet and take some money out
[04:15:12] She had no clue what I was getting at.
[04:16:17] I'm not sure I see the connection, most systems these days don't even make you hit that button.
[04:16:44] anyway I agree that security in practice is often much worse than theory
[04:17:23] I've had the same conversation with the kids at my local deli where I buy lunch.
[04:17:45] They've got one of those nice ipad-based tap-to-pay systems.
[04:18:16] The problem is, the only person who can see the screen is the kid and they just ask you for your debit card and tap it for you.
[04:18:43] I once tried to explain to the kid that I'm not going to give them my debit card so they can tap it on a screen that I can't even see.
[04:19:08] For all I know, they rang up my lunch as $5000
[04:19:30] So, again, a cryptographically secure system defeated by human factors.
[04:21:54] I'll need to pay more attention to see if this is true, but maybe if you wait for the transaction amount to come up and then put your card in, it doesn't need a button press, but if you put your card in early, it does.
[04:22:28] At some point, it needs to say, "I want to debit your bank account for $x, do you approve?"
[04:23:23] At most supermarkets I've seen, it prompts you to dip/swipe/tap/whatever your card before the order is done being rung up, so it doesn't have a final amount yet.
[04:26:41] What really drives me nuts is why they have to ask you "debit or credit?"
[04:27:14] They've got all that fancy crypto tech in the card, and the machine can't tell if it's a credit card or a debit card?
[04:27:36] and why does it matter?
[04:29:48] well, enough ranting for tonight, I'm out
[06:50:42] [this is based on US experience] if it actually is a credit card then I think it should be able to figure it out on its own and DTRT. but some systems aren't that smart. if it's a debit card then AIUI it can be run on a debit network (Pulse, NYCE, etc. usually this means you enter a PIN) or a credit network (Visa or MasterCard or Discover.
usually a signature or ZIP code or some
[06:50:43] places don't require signature under a certain dollar threshold.)
[06:50:48] some gas stations have different prices for cash and credit. many of those stations have a notice posted saying debit charged at cash price or debit at credit price. but that's only if you run it as debit. if you enter a zip code or hit the credit button then you'll pay credit price.
[08:16:40] !log admin reboot clouddumps1001 for kernel updates
[08:16:44] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:31:52] !log admin failover all dumps traffic to clouddumps1001
[09:31:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[09:40:16] Hello everyone! Yesterday I asked for help with migrating my bot to Kubernetes and it was suggested that I open a task about it.
[09:40:20] I reopened https://phabricator.wikimedia.org/T320048 and (hopefully) pinged the people I was talking to here. One of my (pair of) tasks fails. IIRC I've been able to make it work "manually" but the configuration in my YAML file for it must be incorrect somehow. It's a task that uses venv and a bootstrap script so maybe it has to do with that. Can someone check for me where
[09:40:22] exactly I'm failing and guide me on what to change?
[10:05:14] Klein, I'll try to give it a look today :), in the meantime, if you have a reproducer I can try that will not break anything please add it to the task
[10:49:31] Thanks a lot! I think I might have solved it though. There was something missing in the command line in the YAML file for the pair of tasks that failed. I flushed and reloaded the tasks and I'm waiting now.
[10:50:11] 🤞
[12:09:32] !log tools.pywikibot rebuilt image to pick up the latest procfile fixes (T355214)
[12:09:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.pywikibot/SAL
[12:09:35] T355214: [apt-buildpack] Not sourcing /layers/fagiani_apt/apt/.profile.d/000_apt.sh - https://phabricator.wikimedia.org/T355214
[13:24:51] I think I just got paged? (just got an SMS, but the app did not alert me or anything)
[13:27:38] oh, I think it got mixed up because I had two phones at some point (my phone got borked, so I have a new one)
[14:09:21] !log admin creating some missing $PROJECT.wmcloud.org. DNS zones
[14:09:25] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[14:11:25] does anyone know why wikibugs seemingly stopped posting gerrit events to IRC? (at least in #wikimedia-operations, #wikimedia-dev)
[14:11:36] ah, it just quit, “remote host closed the connection” :/
[14:11:38] !log samtar@tools-sgebastion-10 tools.wikibugs Restarted all jobs
[14:11:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[14:12:48] looks like that helped
[16:09:09] Hey folks, I'm looking for best practices on integrating Toolforge Build Service (https://wikitech.wikimedia.org/wiki/Help:Toolforge/Build_Service) with GitLab CI/CD runners. Any recommendations?
[16:15:36] erut: We are literally working on figuring that out at the moment. I don't think we have docs / or recommendations other than "stay tuned"
[16:16:22] Right :D any repos that might be instructive?
[16:16:53] actually, it may be the other way around.
We may be interested in understanding what you want to do
[16:18:35] some links:
[16:18:36] https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Toolforge_push_to_deploy
[16:18:40] https://phabricator.wikimedia.org/T194332
[16:18:46] https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Toolforge_Buildpack_Implementation
[16:19:22] Set up a CI/CD pipeline on GitLab for https://gitlab.wikimedia.org/repos/future-audiences/citation-needed-api that builds and deploys on code change or after manual confirmation to Toolforge. Env variables/secrets would be involved, and it'd be really nice to see build or deployment failures on GitLab
[16:20:51] erut: that sounds basically like our "push to deploy" desire
[16:21:12] yep, agreed, matches your first link pretty well
[16:30:09] !log tools reboot tools-sgeexec-10-23 due to high load
[16:30:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:35:26] !log tools kill jobs user 'wikishizhao' is running directly on the grid per https://wikitech.wikimedia.org/wiki/Help:Toolforge/Rules #3
[16:35:29] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[18:22:22] arturo: hello from Berkeley (!)
[18:23:01] I read your enthusiastic blog post on Planet Debian just now.. I have been silent here (for $REASONS) but I thought to say hi
[18:24:01] I did visit the Wikimedia site in SF and also an annual in January one year at the SF Presidio.. I was interested in the "new" cloud setup for WMF at the time, viewed from the chair of OSGeo Linux (ten years+)
[18:24:50] after making a Labs account, I did not keep up with all the changes.. part of this msg is a confession that way.. BUT your visible enthusiasm is contagious, so here it is
[18:25:57] I am certain that no amount of IRC chat can catch up to all the cloudy-ways now implemented there at WMF.
I did some time at Berkeley Institute of Data Science and met a few of the brilliant new minds there, who contribute to some of the automation and such at WMF cloud
[18:27:13] however, to close - Thank You for whatever you are up to.. and second, OSGeoLive Linux is standards, network ready and Notebook friendly environment with a long standing audience.. OSGeo is a natural ally of WMF and also the Openstreetmap project, who uses a lot of the toolchain published with OSGeo-affiliated teams
[18:28:09] --Brian Hamlin, #osgeolive
[18:40:58] !log tools.abstract-wiki-ds mark as disabled per T319479
[18:41:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.abstract-wiki-ds/SAL
[18:41:02] T319479: Migrate abstract-wiki-ds from Toolforge GridEngine to Toolforge Kubernetes - https://phabricator.wikimedia.org/T319479
[21:29:37] dbb: thanks. Happy to see you around here.
[21:30:33] ++
[23:06:51] !log anticomposite@tools-sgebastion-10 tools.stewardbots SULWatcher/manage.sh restart # SULWatchers disconnected
[23:06:54] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL