[10:27:38] https://whois-referral.toolforge.org has been 504ing for a few hours again, and ST47 doesn't appear to be around, so if someone wouldn't mind restarting the pod (iirc Rook you did it last time?) that'd be grand - I think whois-referral is linked to from MoreMenu or something hence why a few people tend to notice..
[10:29:14] it seems to be failing with "Thu Jul 14 10:28:43 2022 - *** uWSGI listen queue of socket ":8000" (fd: 4) full !!! (101/100) ***", not sure why but I've seen that with tool-fourohfour recently too
[10:29:52] !log tools.whois-referral restart webservice, stuck with uwsgi queue full errors
[10:30:06] where's stashbot?
[10:30:18] o.o
[10:30:29] looks like it got lost in a netsplit
[10:30:34] also taavi, seems to be a different error from last time then https://en.wikipedia.org/wiki/User_talk:ST47#Recurring_504s_on_whois-referral ?
[10:31:10] TheresNoTime: "UNABLE to load uWSGI plugin" in k8s logs is expected and just cosmetic, the real errors are in uwsgi.log
[10:31:21] !log tools.stashbot restart, got lost in a netsplit
[10:31:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stashbot/SAL
[10:31:24] ah :>
[10:31:25] !log tools.whois-referral restart webservice, stuck with uwsgi queue full errors
[10:31:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.whois-referral/SAL
[10:32:16] that error is T256482, basically we tell uwsgi to load both python 2 and python 3 plugins but the container only has the py3 plugin in it
[10:32:16] T256482: kubectl logs fails on /usr/lib/uwsgi/plugins/python_plugin.so - https://phabricator.wikimedia.org/T256482
[10:33:37] Makes sense, and that restart has resolved the issue, thank you :)
[10:34:41] https://sal.toolforge.org/tools.whois-referral hm, seems to have happened before, a few months ago?
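Since the real errors are said to live in uwsgi.log, here is a minimal sketch of spotting the "listen queue full" message in log lines. It assumes lines follow exactly the format quoted at 10:29:14; the function names are hypothetical and nothing here is part of uWSGI or the Toolforge tooling.

```python
import re

# Matches the "listen queue full" message as quoted in the log above.
# Assumption: other uWSGI versions may word this line differently.
LISTEN_QUEUE_FULL = re.compile(
    r'uWSGI listen queue of socket "(?P<socket>[^"]*)" '
    r'\(fd: (?P<fd>\d+)\) full !!! \((?P<queued>\d+)/(?P<limit>\d+)\)'
)


def parse_listen_queue_full(line):
    """Return the socket, fd, queued count and limit for a matching line, else None."""
    match = LISTEN_QUEUE_FULL.search(line)
    if match is None:
        return None
    return {
        "socket": match.group("socket"),
        "fd": int(match.group("fd")),
        "queued": int(match.group("queued")),
        "limit": int(match.group("limit")),
    }


def count_queue_full(lines):
    """Count how many of the given lines report a full listen queue."""
    return sum(1 for line in lines if parse_listen_queue_full(line) is not None)


# The line quoted in the channel, used here as sample input.
SAMPLE = (
    'Thu Jul 14 10:28:43 2022 - *** uWSGI listen queue of socket ":8000" '
    "(fd: 4) full !!! (101/100) ***"
)
```

Feeding the lines of a log file to `count_queue_full` gives a rough idea of how often the queue filled up. It does not say why the workers stopped draining it.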
[10:35:58] yeah, and I've seen it with the 404 handler tool too recently so would be really nice to figure out what's causing them
[10:36:24] based on the error message, maybe the uwsgi workers are all getting stuck on something?
[10:39:42] TheresNoTime: root = bd808 I believe in that case as I reported
[10:40:30] I left them a message but it was archived with no reply
[10:40:41] https://en.wikipedia.org/wiki/User_talk:ST47/Archive26#whois-referral
[10:41:15] I think they're fairly inactive at the moment (I hope they're doing okay!) :)
[10:42:12] That's the issue with single-person-run tools
[13:48:41] !log tools rebooting tools-sgeexec-10-2
[13:48:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[13:49:14] taavi: is that related to the failures with the epilog stuff? I've been seeing intermittent issues with that, but found no errors in the logs yet
[13:49:53] related to the grid queue being in error state
[13:49:58] yep
[13:50:27] if you don't mind me asking, why the reboot?
[13:51:31] in my experience that's the best way to ensure any misbehaving jobs and similar get stopped on the host
[13:52:16] have you been seeing epilog errors on non-webgrid nodes?
[13:54:03] hmm, now I'm not sure, as it was mixed up with space issues, let me check if I can scroll back enough xd
[13:54:57] all the epilog is from webgen
[13:55:18] the others are tmp dir errors
[14:21:36] !log paws c6ef2b22b1ab3671d3f463fb2a0fa67aec265caa update nbclassic T312251
[14:21:39] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL
[14:21:39] T312251: update nbclassic - https://phabricator.wikimedia.org/T312251
[19:22:16] I'm working on a toolforge tool (eranbot) and I'm having trouble figuring out how to install python requirements.
[19:23:04] there's a venv that I have activated, which seems to provide `pip`. However, whenever i do a `pip install`, I get an error like "ModuleNotFoundError: No module named 'pip'"
[19:23:45] ragesoss: try `python3 -m pip ...`
[19:24:32] although the issue might be execution context. if the venv was built on a different OS version things won't work as hoped
[19:25:11] * bd808 peeks into tools.eranbot
[19:25:43] okay. ugh. i don't want to break all the rest of what's already apparently working (cron scripts for bots) but there's so much disparate stuff in here that I'm very lost about what a safe way to get the plagiabot webservice back up might be.
[19:25:54] (it's been down since the Stretch gridengine went offline)
[19:26:49] ragesoss: is $HOME/www/python/venv the one you are working on?
[19:26:56] yes
[19:27:11] oh dear god... that's a python 2.7 venv
[19:27:24] yeah...
[19:28:11] ragesoss: you can try using `webservice python2 shell` as the interactive shell that you work on it from
[19:28:41] we still have a crusty old docker container there
[20:48:41] ragesoss: the python2 container uses Debian Jessie, so the virtualenv you created in Stretch will not work, you need to recreate it
[20:49:13] I had similar issues when I migrated my python2 tools to kubernetes
[20:49:39] okay. i didn't create this venv and i think it's being used for other tasks than the one i'm trying to restore, so i think i'll be better off trying to set this up on a separate tool account
[21:03:20] +1 to splitting up legacy all-in-one tools into smaller chunks when working to restore them
[23:42:31] bd808: I'm trying to figure out how to run a ruby (Sinatra) app via `webservice`. Do you know of any examples that work?
[23:43:30] I have a minimal server that I can start on port 8000, but webservice doesn't seem to make it accessible and I'm not sure how to debug it.
[23:46:14] ragesoss: https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#Other_/_generic_web_servers
[23:46:21] >Your script will be passed an HTTP port to bind to in an environment variable named PORT. This is the port that the Nginx proxy will forward requests for https://YOUR_TOOL.toolforge.org/ to. When using the Kubernetes backend, PORT will always be 8000. When using the Grid Engine backend, PORT will change each time the webservice start or webservice restart command is run.
[23:46:23] ragesoss: oooh... let me see if I can dig up the task. Brooke and I played with this quite a while ago and got to a basic working setup
[23:46:49] Reedy: yes, that's where I was (hence, port 8000)
[23:47:15] it's only 8000 if k8s
[23:47:17] bd808: I was hoping that tool with your name on it might be something like that!
[23:47:20] ragesoss: not sure how helpful, but https://phabricator.wikimedia.org/T141388#6258714
[23:48:00] `bundle exec rails server -p 8000 -e production` was kind of the magic bit I think...
[23:49:09] and that was 2 years ago... time flies :/
[23:49:52] what's this `service.template` business?
[23:50:08] https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#Webservice_templates
[23:50:28] a shortcut so you don't have to use so many cli params
[23:51:00] the "canonical: true" bit there is old garbage
[23:52:22] and what's with `webservice ruby25 start -- $HOME/start.sh` ... in particular, the `--` in the middle?
[23:53:57] >More precisely, a double dash (--) is used in most Bash built-in commands and many other commands to signify the end of command options, after which only positional arguments are accepted.
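The end-of-options convention quoted above can be seen with Python's standard `argparse`, which follows the same rule. This is only a sketch: the `demo` parser, its `--verbose` flag and the `parse` helper are made up for illustration and say nothing about how the real `webservice` command parses its arguments.

```python
import argparse


def build_parser():
    # Hypothetical parser for illustration only; not the `webservice` CLI.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--verbose", action="store_true")
    # Everything that is not an option ends up in this positional list.
    parser.add_argument("rest", nargs="*")
    return parser


def parse(argv):
    """Parse argv and return (verbose flag, list of positional arguments)."""
    namespace = build_parser().parse_args(argv)
    return namespace.verbose, namespace.rest


# Without `--`, "--verbose" is treated as an option.
print(parse(["--verbose", "a"]))
# With `--`, everything after it is positional, including "--verbose".
print(parse(["--", "--verbose", "a"]))
```

In the second call the flag stays off and `--verbose` is passed through as plain data, which is the behaviour the quote describes.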
[23:54:01] The `--` is a cli thing to say "everything after this is extra data and not an option"
[23:54:38] Reedy is quoting from https://unix.stackexchange.com/a/11382/10171 I think :)
[23:56:29] in this case the `--` is not really helping the webservice arg parser as there are no following `--something` bits
[23:57:05] but generally `webservice ruby25 start ` is what that does. See also https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#Other_/_generic_web_servers
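For the generic web server case linked above, here is a minimal sketch of a script that binds to the port passed in the `PORT` environment variable, as the quoted help text describes. It is written in Python rather than Ruby and uses only the standard library. The fallback to 8000 when `PORT` is unset, the bind address `0.0.0.0`, and all function and class names are assumptions made for this example.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: fall back to 8000 (the value quoted for the Kubernetes backend)
# when PORT is not set, e.g. when testing locally.
DEFAULT_PORT = 8000


def resolve_port(environ):
    """Return the port to bind to, read from the PORT environment variable."""
    return int(environ.get("PORT", DEFAULT_PORT))


class HelloHandler(BaseHTTPRequestHandler):
    """Answer every GET request with a short plain-text body."""

    def do_GET(self):
        body = b"hello from a generic webservice\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(environ):
    # Assumption: listen on all interfaces so a proxy in front can reach it.
    return HTTPServer(("0.0.0.0", resolve_port(environ)), HelloHandler)


def main():
    """Start serving; call this from the script that webservice runs."""
    make_server(os.environ).serve_forever()
```

Reading the port from the environment instead of hard-coding 8000 keeps the same script usable on a backend where `PORT` changes between restarts. Calling `main()` at the end of the script starts the server.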