[00:02:24] is it possible to run a python app that uses threading on toolforge? I tried and I saw a message in the logs that said threading is disabled for performance reasons?
[00:20:40] a python program or a python web application?
[00:20:47] python web app
[00:21:15] specifically https://datasette.io/
[00:21:31] it uses a lot of async stuff and I think it needs threading to do its thing
[00:21:53] I may be wrong about threads but it's my best guess as to why it didn't want to work on toolforge
[00:26:21] I'm assuming this is using grid engine. Any reason not to run it on k8s?
[08:49:06] It seems we have a problem on login.wmflabs.org
[08:49:07] "$ python3 get-pip.py.1 --user DEPRECATION: Python 3.5 reached the end of its life on September 13th, 2020. Please upgrade your Python as Python 3.5 is no longer maintained. pip 21.0 will drop support for Python 3.5 in January 2021. pip 21.0 will remove support for this functionality."
[08:49:55] (I'm trying to install https://github.com/dpriskorn/ItemSubjector/)
[08:50:21] There should be images for 3.7 & 3.9 available
[08:50:34] Use webservice shell to create a venv
[08:50:57] Nothing you run should be directly on the bastion anyway
[08:51:01] Oh, now you are talking Greek 😅 I gotta RTFM it seems.
[08:52:36] If it runs in PAWS, why not just use PAWS
[08:53:08] login.wmflabs.org is not a thing
[08:53:45] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Python#Virtual_environments
[08:53:48] I want to run a job on the grid basically
[08:55:13] Using jsub?
[08:55:16] the grid does not have anything newer than 3.5 available yet, we've been working to replace it with more modern alternatives but the replacements are not quite ready for widespread use :/
[08:56:14] yes, jsub was the idea, but I read it is being replaced by k8s
[08:56:22] I'm very new to toolforge and k8s
[08:57:05] there is a beta feature on Toolforge for using jobs on kubernetes
[08:57:28] more information here: T285944
[08:57:28] T285944: Toolforge: beta phase for the new jobs framework - https://phabricator.wikimedia.org/T285944
[08:57:36] but it is in *beta* phase
[08:58:00] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Kubernetes#Kubernetes_cronjobs
[08:59:03] thanks ❤️
[08:59:28] it seems I have to "become" first before doing anything else
[08:59:46] Yes, you should become a tool
[09:00:02] that worked 😃
[09:00:14] :)
[09:01:00] so after becoming I still have python 3.5
[09:02:13] when running "toolforge-jobs containers" I get command not found
[09:03:22] you're logged in to the wrong host, toolforge-jobs is only on the beta buster bastions, which is why the instructions on the task say "$ ssh dev-buster.toolforge.org"
[09:03:44] oh, ok, so I'll log in there instead :)
[09:05:28] done, now it works 🥳
[09:06:45] it is python 3.7 on the dev-buster! Nice, thanks for all the help, I think I'll manage from here 😃
[09:09:59] sadly it's not that simple, the legacy grid (what `jsub` and friends use) is still on Debian stretch / Python 3.5, and you should not run any heavy processes on the login host directly
[09:11:25] the jobs framework (described in the ticket above) lets you use newer runtimes, but it's not very user friendly yet
[09:22:57] yes, I understood that. I'm trying out the k8s beta framework now.
[09:23:57] running the script I linked above I get a python error though, so I guess even 3.7 is too old. I'll have to dig some more. Is there any possibility for me to update python?
[09:24:56] which error?
[09:25:04] what command do you use?
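For reference, the webservice-shell-plus-virtualenv route suggested above (around 08:50) looks roughly like the sketch below. The tool name and venv path are placeholders, the python3.7/python3.9 image names are the ones mentioned in the channel, and the last line assumes the repository has been cloned into the tool's home and ships a requirements.txt; none of this is verified against the actual tool.

    $ become mytool                                     # placeholder tool name
    $ webservice --backend=kubernetes python3.7 shell   # opens a shell in a container with a newer Python than the bastion's 3.5
    $ python3 -m venv ~/pyvenv                          # the venv lives in the tool's home, so it survives the shell
    $ source ~/pyvenv/bin/activate
    $ pip install --upgrade pip wheel
    $ pip install -r ~/ItemSubjector/requirements.txt   # assumed clone location and requirements file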
[09:25:55] does IRC support photos?
[09:26:28] the bridge bot will make things work
[09:26:46] ok nice
[09:27:31] "telegram says: posting media not allowed in this group" 😅
[09:27:48] it was a syntax error, which is weird
[09:28:27] I'm testing a workaround I committed here https://github.com/dpriskorn/ItemSubjector/commit/a3d01bf2269a715837d92cf85a464f6546cb678f
[09:29:01] $ python itemsubjector.py -l Q1148337 --prepare-jobs
[09:29:13] that is the command line I'm using.
[09:29:25] is there any ongoing issue with the Kubernetes cluster? When I create deployments, they do not get populated (no pod is created for them)
[09:29:28] it seems python 3.7 has a problem with typing :/
[09:31:00] pintoch: Have you checked `kubectl get events` yet?
[09:32:39] majavah: not yet, and it does seem to mention an error indeed, thanks!
[09:35:18] This is the exact error I get https://stackoverflow.com/questions/51789120/type-hints-syntax-error-on-python-3-5
[09:38:48] so the workaround worked, but I got another syntax error now, see https://gist.github.com/dpriskorn/11aee64cc5b74ee5a40579d7c405ba0f
[09:39:07] the python on this machine seems broken :/
[09:41:55] `python` is python 2.7.16 on that host
[09:42:12] it looks like you’re not running your commands through the jobs framework
[09:42:27] (thanks again majavah, that did the trick :) )
[09:42:58] pintoch: out of curiosity, what was the issue?
[09:44:00] my deployment yaml file was using a syntax to specify the "home" volume that is apparently not accepted anymore (I have been using it for years without issue, I just copied it from the Wikitech docs I suppose)
[09:44:41] I adapted it to follow what is now in the docs and it solved the problem
[09:48:57] thanks Lucas! (re @wmtelegram_bot: `python` is python 2.7.16 on that host)
[09:49:35] I'm installing the latest python via pyenv now, so I hope that will fix any possible remaining issues with 3.7
[09:54:28] now it works it seems, thanks all for the help
[10:08:23] I just submitted the job but it failed after 2 sec with code "2". Is there any way to inspect what went wrong?
[10:10:10] DennisPriskorn: the job's standard output and error are written to the tool's home directory
[10:13:12] anyhow, as I said before, the grid will have python 3.5 and not 3.7 available
[10:13:16] I ran: $ toolforge-jobs run myjob --image tf-python39 --command "python3 itemsubjector.py --run-prepared-jobs"
[10:13:17] and got: python3: can't open file '/data/project/itemsubjector/itemsubjector.py': [Errno 2] No such file or directory
[10:15:20] it's not smart enough to know what directory you are in when you start the job, the command is run in the tool's home directory and there is no file itemsubjector.py in there
[10:17:25] ok, thanks 👌
[10:49:45] I now succeeded in submitting a job and it works 🎉
[13:13:54] I tried submitting another job but it gets killed right away.
[13:13:55] according to kubectl get events the error is BackoffLimitExceeded
[13:14:21] "Job has reached the specified backoff limit"
[13:17:31] that essentially means "the job is failing to run"
[13:18:25] the output (including errors, usually) is written to the tool's home directory, job_name.out and job_name.err
[13:18:55] aha, I get no output from the job, so it's hard to debug.
[13:18:59] I got output before
[13:19:29] which job?
[13:20:09] job/job7 Created pod: job7-hcmq7
[13:20:23] does that help?
[13:20:37] kind of, except it hasn't printed any output
[13:20:43] what command did you use to start it?
[13:22:30] I found the cause I think. A shell script missing chmod +x
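Putting the two failures above together: the job command runs from the tool's home directory (as explained at 10:15), so the script needs its full path, and any helper script the job calls directly needs the execute bit (the 13:22 cause). A rough sketch follows; the clone location under /data/project/itemsubjector/ and the helper script name are guesses, not the tool's actual layout, while the job name, image and .out/.err naming are taken from the messages above.

    $ toolforge-jobs run myjob --image tf-python39 \
          --command "python3 /data/project/itemsubjector/ItemSubjector/itemsubjector.py --run-prepared-jobs"
    $ cat myjob.out myjob.err            # stdout/stderr land in the tool's home, named after the job
    $ chmod +x some-helper.sh            # hypothetical script name; anything the job executes directly needs the execute bit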
[15:39:59] * Skynet hugs legoktm
[15:55:22] Hello cloud folks! I'm trying to ssh into an instance via the new bastion and getting "channel 0: open failed: administratively prohibited: open failed"
[15:55:56] I think I'm getting through the bastion fine, so it's probably an issue with the instance (cn-stage-3.centralnotice-staging)
[15:56:34] Maybe a security policy needs an update? I've got it allowing access to port 22 from 172.16.0.0/21
[15:57:52] ejegg: new bastion?
[15:58:00] we haven't had new bastions for a while
[15:58:43] arturo: oh heh, i haven't logged in to that instance since February of last year...
[15:59:15] I was previously going in via bastion1.eqiad.wmflabs
[15:59:25] and now am trying via bastion.wmcloud.org
[16:01:03] anyway, i get the same 'administratively prohibited' via the wmflabs bastion
[16:10:16] ejegg: that's likely a wrong hostname
[16:10:53] majavah: for the instance, or for the bastion?
[16:11:19] ah, i was sshing into cn-stage-3.centralnotice-staging.eqiad.wikimedia.cloud
[16:12:00] for the instance, yes
[16:12:10] try eqiad1.wikimedia.cloud
[16:13:47] hmm, still failing: ssh cn-stage-3.centralnotice-staging.eqiad1.wikimedia.cloud
[16:13:51] channel 0: open failed: administratively prohibited: open failed
[16:16:04] ejegg: try cn-staging-3.centralnotice-staging.eqiad1.wikimedia.cloud
[16:16:09] I don't see an instance with that name https://openstack-browser.toolforge.org/project/centralnotice-staging
[16:16:11] (from https://openstack-browser.toolforge.org/project/centralnotice-staging)
[16:17:54] oh weird, i was sure we had those cn-stage-X names in the DNS but now that I look at the zone for the project they're missing
[16:18:12] hmm, now it seems to be just hanging, will try with -vv
[16:18:27] gets the right IP address at least
[16:21:21] oh derp, i have to update my ssh config for the -staging hostname
[16:22:04] and I'm in!
[16:22:08] Thanks majavah and arturo
[16:22:15] 🎉
[17:14:34] !log toolsbeta testing new maintain-kubeusers release T279106
[17:14:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[17:14:39] T279106: Establish replacement for PodPresets in Toolforge Kubernetes - https://phabricator.wikimedia.org/T279106
[17:20:57] !log tools deploying new maintain-kubeusers for lack of podpresets T279106
[17:21:01] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:21:01] T279106: Establish replacement for PodPresets in Toolforge Kubernetes - https://phabricator.wikimedia.org/T279106
[18:41:06] hi again cloud savants, can you tell me if mediawiki-vagrant ought to be available via puppet on bullseye WMCS images?
[18:41:15] I'm seeing this error in puppet.log:
[18:41:19] (/Stage[main]/Lxc/File[/etc/lxc/default.conf]) Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/lxc/bullseye/etc-lxc-default.conf
[18:41:41] then the rest fails with 'Skipping because of failed dependencies'
[18:46:01] ejegg: you might be the first to try it. Do you have time to open a phab task?
[18:49:13] sure andrewbogott
[18:49:17] will do
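The missing puppet:///modules/lxc/bullseye/... file needs a fix in the puppet repository itself, but re-running the agent by hand on the instance is the quickest way to check whether the role then applies cleanly. A sketch, with the hostname as a placeholder (the bullseye test instance is not named above) and the puppet.log path assumed to be under /var/log:

    $ ssh myinstance.myproject.eqiad1.wikimedia.cloud   # via the bastion; see the config sketch at the end of this log
    $ sudo puppet agent --test                          # run puppet once in the foreground and watch for the lxc error
    $ sudo grep -i error /var/log/puppet.log | tail     # assumed location of the puppet.log mentioned above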
[19:01:47] https://phabricator.wikimedia.org/T291660
[19:03:07] no rush on that particular one - I'm just familiarizing myself with the instance admin tools etc in anticipation of setting up the fr-tech-dev project
[19:03:59] the fr-tech-dev project's current use case will not need mediawiki-vagrant
[19:04:30] but i thought it might be nice to get centralnotice-staging onto a more recent OS
[19:09:10] it's definitely something we want to support
[19:24:30] mediawiki-vagrant is functionally a dead project today
[19:25:24] fixing deployment of the role on bullseye should not be too hard though. that's really about provisioning Vagrant itself
[21:00:44] /
[22:11:27] ejegg|afk: should work now if you want to try again
[23:33:23] I'm back from a few weeks of inactivity, and am trying to remember how to log in to the VPS for https://phabricator.wikimedia.org/T283791 . In my notes I only have "ssh myusername@ircwebchat.wmcloud.org", and that times out. What are the steps, please?
[23:34:45] https://wikitech.wikimedia.org/wiki/Help:Accessing_Cloud_VPS_instances
[23:35:39] check your local ssh config, do you already have config like this: https://wikitech.wikimedia.org/wiki/Help:Accessing_Cloud_VPS_instances#ProxyJump_(recommended)
[23:37:40] grys: here you can see the internal name of the VM (instance): https://openstack-browser.toolforge.org/project/ircwebchat
[23:37:47] so it's ircwebchat.ircwebchat.eqiad1.wikimedia.cloud
[23:38:08] insert that into: $ ssh -J <user>@bastion.wmcloud.org <user>@<instance>.<project>.eqiad1.wikimedia.cloud
[23:38:41] name of instance and name of project are identical in this case
[23:38:52] that's why ircwebchat.ircwebchat
[23:40:03] also it tells us there is instance-ircwebchat.ircwebchat.wmflabs.org. in DNS as well. maybe that was set up to make this easier
[23:40:29] ok, thanks; it works
[23:40:34] :)
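The ProxyJump setup referenced at 23:35 can be captured once in ~/.ssh/config so that a plain `ssh <instance>.<project>.eqiad1.wikimedia.cloud` works without spelling out the jump host each time. A minimal sketch along the lines of what the wikitech page recommends; the username is a placeholder, and the host pattern deliberately excludes the bastion itself:

    # ~/.ssh/config (snippet)
    Host *.wikimedia.cloud
        User      myshellusername
        ProxyJump bastion.wmcloud.org

    $ ssh ircwebchat.ircwebchat.eqiad1.wikimedia.cloud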