[02:04:49] !log anticomposite@tools-sgebastion-10 tools.stewardbots SULWatcher/manage.sh restart # SULWatchers disconnected
[02:04:52] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[09:08:34] !log taavi@tools-bastion-12 tools.wikibugs toolforge jobs restart irc
[09:08:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[09:38:04] !log tools pushed docker-registry.tools.wmflabs.org/cloud-cicd-py311bookworm-tox:latest and docker-registry.tools.wmflabs.org/cloud-cicd-debian-builder-bookworm:2024-03-24.1 (T360405)
[09:38:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[09:38:10] T360405: [cicd,infra] Add python 3.11/bookworm support - https://phabricator.wikimedia.org/T360405
[11:19:45] !log tools point dev.toolforge.org to tools-bastion-12 T314665
[11:19:50] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[11:19:50] T314665: Toolforge: Introduce grid-less bookworm based bastion hosts - https://phabricator.wikimedia.org/T314665
[12:13:22] Hi, is it too soon to ask about missing software in tools-bastion-12?
[12:16:20] taavi: ^
[12:16:53] please follow the instructions in my cloud-announce email and file a task
[12:49:53] !log taavi@tools-bastion-12 tools.wikibugs toolforge jobs restart irc
[12:49:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[14:36:42] !log h2o@tools-sgebastion-10 tools.stewardbots ./stewardbots/StewardBot/manage.sh restart # RC reader not reading RC
[14:36:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[15:39:36] taavi: I hadn't noticed that you were restarting the wikibug irc component every day. Thanks and ugh I guess. I'm trying to figure out fixes.
[15:41:27] !log anticomposite@tools-sgebastion-10 tools.stewardbots SULWatcher/manage.sh restart # SULWatchers disconnected
[15:41:30] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[15:46:16] bd808: at least at this point it's only the irc component that needs restarts..
[15:47:58] long and boring thread of me watching the test instance fail in mostly the same way every night at https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/15#note_73550
[15:48:48] !log bd808@tools-sgebastion-10 tools.wikibugs-testing Update and restart irc task to pick up 975a44a in MR!15
[15:48:51] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs-testing/SAL
[17:23:49] Based on a quick code search − I found Editgroups (https://github.com/Wikidata/editgroups) and Video2Commons (https://github.com/toolforge/video2commons) (re @sohom_datta: bd808: Do you know of any tool on that is using a celery backend ?)
[17:32:01] o/ checking if there's a known issue with PAWS right now? as one example, downloads of this file which does exist just return a "failed" status: https://public-paws.wmcloud.org/User:Pablo%20(WMF)/outreachy/round28/features_scores_climatechange_2022.csv.zip Maybe broader than PAWS though because i've seen other cloud vps services recently that are giving a blank screen and then only work after refreshing?
[17:37:08] hmm, '(Error decoding the received TLS packet.)' (from wgetting that file) smells like a proxy issue. let me have a look
[17:37:53] isaacj: try now?
[18:03:54] working now - thanks!
[18:42:16] Is paws-public still the nginx config + python app from way back then?
[18:42:48] a wild yuvi appears
[18:43:12] Hello reedy :)
[18:53:35] yuvipanda: mostly the same. Loads up differently
[19:22:59] taavi or anyone else: sorry to bother you again, but would it be correct and granted to ask in Phab for the PHP CLI?
I ask because for the Kubernetes nodes I was told that it would be very difficult
[19:23:18] (talking about tools-bastion-12 packages)
[19:31:05] (Well, it was not the PHP CLI that time, but it was a similar situation with packages which I used daily)
[19:35:44] And if PHP CLI is not in the plan and neither will be for the new login bastion, I really want to know, because that would be a major problem
[19:47:38] jem: I'm not sure I understand your question. You feel that you need php installed on the bastion?
[19:48:16] I don't understand the "I ask because for the Kubernetes nodes I was told that it would be very difficult" part I guess
[19:50:00] We have various PHP versions available inside containers. You can have an interactive shell with PHP via `webservice php8.2 shell`
[19:52:46] Any PHP version we put on a bastion is going to be pinned to the PHP available on the Debian version the bastion is running. This is likely to cause more confusion than we would like as people wish for newer PHP versions.
[19:53:00] Hi and thanks, bd808
[19:53:48] Yes, I have php code that runs on the shell, specially for wrapping the sql queries to the replicas and as a backup for my Wikipedia bots
[19:55:00] In the migration from Grid I used tools like pdflatex, convert... but I was told they won't be installed if there isn't a common and clear need
[19:55:13] they will not be installed.
[19:55:37] we do not want people to use the bastions to run things like image transforms
[19:55:42] I understand
[19:56:04] And that's why I migrated that to external hosts
[19:56:38] The point is that I don't know if we are in a similar situation or not
[19:56:49] You can make yourself a custom buildpack based image that you run on the Kubernetes cluster to do pretty much anything
[19:56:57] Yes, I was told taht
[19:56:59] that*
[19:57:11] But that would require to migrate my code to git
[19:57:19] (As I was told)
[19:57:38] And that's a major step for me currently
[19:58:05] eh, maybe.
You can mount a tool's $HOME into a container. It's not what David would mostly like folks to do, but it is possible.
[19:59:12] Ok, anyway I don't want to spend anyone's time
[19:59:14] Using version control is very much what we recommend for all tools, but today it is not a completely hard requirement
[19:59:28] I understand
[19:59:58] And probably I will reconsider it in the future
[20:01:00] ... a future with more free time that seems more difficult each day... but well...
[20:01:06] I think the main reason to use version control is so that your tools can outlive your desire to take care of them. Tools are much easier to adopt or fork is there is code in version control.
[20:01:27] Yes
[20:02:09] In fact I tried partially years ago and it wasn't a good experience
[20:02:54] I had to spend time and then people doesn't compromise for more than a short time
[20:02:57] I understand. There is a learning curve. Git especially can be confusing as there are so many different ways to use it
[20:03:08] Yes
[20:04:51] We really do not want to push people off of the Toolforge platform. I would love to better understand what feels too hard or too strange about the changes we are making. I can't promise we can find workarounds for everything, but we can look.
[20:05:15] Thanks, bd808
[20:06:01] I understand it is my fault to be somehow... "old fashioned"
[20:06:42] I can try to learn new things if it will be a "good investment", but it's hard
[20:06:57] as a 50-something year old human being, I think I can understand
[20:07:18] And I am very close to that number :)
[20:09:09] Anyway, back to the PHP question...
[20:09:43] I think I don't fully understand your previous suggestions
[20:10:25] "webservice php8.2 shell" will give me a php shell... from inside any bastion?
[20:11:32] sort of. That command will start a container on Kubernetes that has php 8.2 installed and connect your terminal session to a bash session running inside of that container.
[20:11:56] Then you can type `php ...` and it will run that command inside the container
[20:12:39] by default the $HOME of your tool will be mounted inside the container so you can read and write files there
[20:13:01] Hmmmmm
[20:13:09] I think I get error messages
[20:13:22] jem@tools-sgebastion-11:~$ webservice php8.2 shell
[20:13:22] Traceback (most recent call last):
[20:13:22] File "/usr/local/bin/webservice", line 33, in
[20:13:27] sys.exit(load_entry_point('toolforge-webservice==0.103.4', 'console_scripts', 'webservice')()) ...
[20:13:44] And four lines more
[20:14:01] hmmm... have you already `become` some tool?
[20:14:09] Ah, sorry
[20:14:52] Well, it took like 10 seconds but it's there
[20:15:30] it takes a few seconds for Kubernetes to find a place to run your container and then start it up
[20:16:41] * bd808 sees the ugly stack trace from running `webservice` as a normal user
[20:16:53] we should make that less cryptic for sure
[20:18:56] Ok, but this container is created and destroyed on the fly, if I understand correctly
[20:19:07] yes, that is correct
[20:19:31] So it can't be used to make remote calls for executing the code... which is what I do
[20:19:46] remote calls from where?
[20:20:11] From the Wikimedia Spain server, which is where I launch my bot and queries
[20:21:07] Until now I run ssh commands to login.toolforge and dev.toolforge with no problems
[20:22:05] I think I got lost again. Can you explain your old workflow so I can try to understand what is not possible for you now?
[20:23:54] Ok, bd808, thanks again, I'll try to do my best to explain :)
[20:25:03] My bot, Jembot, is written in PHP and makes several tasks mainly in eswiki and other wikis
[20:25:47] Three of them make use of the database replicas
[20:27:28] this makes sense so far :)
[20:28:00] The code and the crontab which launchs the tasks is on the Wikimedia Spain server, for several reasons: reduce impact to other users (there have been occasional problems with stuck processes, etc.), easy of use, specially after the Kubernetes migration...
[20:28:34] And easy access to the server administrator (hi, Platonides)
[20:29:23] Anyway I make an automatic copy to Toolforge so in case of downtime I can run manual emergency tasks from inside Toolforge
[20:30:13] Also it's a question of "Don't put all the eggs in the same basket" (or a similar phrase in English) :)
[20:31:05] My web tools are in Toolforge and work well (apart from the recent migration problem... but it's over now)
[20:31:53] ok, so at this point you would have php bot code on Toolforge and you would like to run it, but that code also needs ... something ... that you cannot do?
[20:32:36] What I cannot do is launch the php code that reads the data from the replicas
[20:32:55] It's used in three bot modules in total
[20:34:06] As it's launched from crontab, and not interactively, I can't migrate the ssh lines
[20:35:17] Am I understanding correctly that you normally run an ssh session started by cron on an external server to run php code on the Toolforge bastions?
[20:35:40] More or less, that's it
[20:35:57] and that normal process did not start a grid job?
[20:36:22] Sorry, I don't understand
[20:37:16] when the grid engine was still here, did your external cron command ssh into the bastion and then `jsub ...` to run the php code? Or did it run the php code directly on the bastion?
[20:37:32] Directly
[20:37:55] I really never learned about the jsub and etc.
[20:38:08] (For the reasons we have discussed above) :)
[20:38:15] ok. well... you were doing a thing that the Toolforge Terms of Use has always forbidden.
[20:38:31] Oh
[20:38:38] I'm very sorry then
[20:38:51] I don't think that means we can't find you something that will work, but it will not be exactly the same
[20:39:03] I don’t understand “the php code that reads the data from the replicas” yet… what did that php code do with the data? put it somewhere in the file system? send it somewhere else?
[20:39:39] I'll try to explain
[20:41:00] The ssh calls a small php script which calls the database read function
[20:41:44] That function uses simple mysqli lines to make the query and return the results
[20:42:53] And those results were printed and captured from the original call, because the ssh is called with shell_exec
[20:43:14] back on the wikimedia spain server?
[20:43:26] There are two modes, a simple one with just one column in plain text, and a complex one with JSON for multiple columns
[20:43:42] Yes
[20:44:37] And all of this runtime complexity and fragility is because you never wanted to learn to use grid engine or now Kubernetes I guess?
[20:44:39] I guess I could make the query without PHP, with the mysql command line...
[20:45:26] bd808: Well, in part "never wanted" and in part "never had the time"
[20:45:48] you had time to invent an elaborate workaround ;)
[20:46:13] All things doesn't seem elaborate at first :)
[20:46:23] But yes, you are right and I'm sorry
[20:47:01] I am not trying to shame you. I'm sorry if I am making you feel bad.
[20:47:16] No problem, bd808
[20:48:09] The only point is that I didn't know I was breaking the Terms of Use... I suspected I was near the line, but inside the good side
[20:48:34] I do think you could spend a few hours learning a bit about how we hope people will use Toolforge and solve your general problem, but I understand that is time you could also use to do other things.
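Editorial sketch of the two output modes jem describes above: one column printed as plain text, several columns printed as JSON. jem's real script is PHP using mysqli and does not appear in this log, so the function name `format_rows`, the column names, and the sample rows are all hypothetical; this only illustrates the shape of the output that the calling side would capture.

```python
import json


def format_rows(rows, columns):
    """Serialize query result rows for printing to stdout.

    Hypothetical helper, not jem's code:
    - one column   -> plain text, one value per line
    - many columns -> a JSON array of objects keyed by column name
    """
    if len(columns) == 1:
        # Simple mode: the caller can split the captured text on newlines.
        return "\n".join(str(row[0]) for row in rows)
    # Complex mode: the caller decodes the captured text as JSON.
    return json.dumps([dict(zip(columns, row)) for row in rows])
```

Calling `print(format_rows(rows, columns))` at the end of a script would reproduce the print-and-capture pattern from the conversation; the same output could be written to a file in the tool's $HOME if the script ran as a job instead of over ssh.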
[20:48:47] Anyway, I'll try to do things correctly from now on
[20:49:27] bd808: If it's a good "time investment"...
[20:50:44] Running PHP bots is a thing that lots and lots of folks do on Toolforge. The Kubernetes system takes care of "fairness" problems by putting your code in a cpu and ram quota controlled environment.
[20:51:29] And that's why I use the Wikimedia Spain server whenever it's possible
[20:51:31] If you had problems before it was at least in part because you were not using the systems we have to spread out that load
[20:51:52] Yes
[20:53:10] If you can write a shell script that runs your bot we should be able to help you learn to run that shell script as a "job" on the Kubernetes cluster. This really should remove the need to use the external eswiki servers and jsut make things run on Toolforge.
[20:54:00] Ok then
[20:54:06] We even have some folks who are native Spanish speakers in case there have been technical English barriers in the past.
[20:54:24] That would be of help, of course
[20:54:55] I think it's not a real problem except when I have to think how to write in English :)
[20:55:50] I guess for those three modules which require the database, that would be good
[20:56:23] For the rest I will think later
[20:57:00] The script already exists, so it should be easy
[20:57:52] Anyway, my journey is ending, and I should be going home in a few minutes
[20:58:51] Should I ask again tomorrow? Or maybe when there are Spanish speakers available?
[20:59:21] I would like to continue this discussion in a way that we don't have to be talking in real time. Should we move to email, a wiki page, or Phabricator?
[21:00:03] As you prefer, bd808
[21:00:48] The WMCS folks who speak Spanish are in Europe, so they are usually available in EU working hours. This week at least some of them are traveling to KubeCon and won't be around much.
[21:01:04] Ok
[21:01:31] Aanyway, I can wait...
as long as bastion-11 isn't migrated
[21:01:43] Or login.toolforge
[21:01:58] (As I assume it won't have php either)
[21:02:25] jem: how about we try to use a wiki page in your User space on wikitech? If you can write up what you do on the eswiki sever you have access to then I can read it and try to describe an equivalent process on Toolforge.
[21:03:58] Ok, bd808
[21:04:34] A wiki page will be easier for us both to share with others to find more help as needed. And maybe when we have things figured out it can be turned into a tutorial for others. :)
[21:04:54] Ok :)
[21:05:35] But please give me 2-3 days, today was a hard day and a lot of tasks are waiting
[21:06:22] I'll ping you here, of course
[21:06:23] jem: no worries. You are the volunteer here, so I will have patience. I do want to help keep your bot running, but that should not cause you stress.
[21:06:44] Yes and thank you again for that
[21:10:08] I'm leaving now, I'll write again soon
[21:10:19] (Anyway I keep connected, as always)
[21:10:59] Afdstats was running on the k8s php7.4 where it was serving webpages with lighttpd. The main work was being done with a python script run via mod-cgi, but that stopped working in the last couple of days (presumably the python executable was somehow tied to the grid engine shutdown). Is there an easy way to get it back up and running without having to rewrite the python script?
[21:14:50] if the webpages were static, using some cgi-capable python web server (and serving everything from a python container) might be an alternative to rewriting the python script?
[21:14:58] (but if the webpages actually used PHP, it gets trickier)
[21:16:06] we haven't changed anything about the php7.4 container recently as far as I know.
[21:16:11] The non-static files didn't use PHP, they used python via CGI
[21:16:35] if I read this correctly, uWSGI can serve both static files and cgi scripts (but I’ve never used it this way) https://uwsgi-docs.readthedocs.io/en/latest/CGI.html#example-9-using-the-uwsgi-http-router-and-the-check-static-option
[21:17:02] ...... I'm fairly sure this was me forgetting to add the backwards-compat-python3 in https://gerrit.wikimedia.org/r/plugins/gitiles/operations/docker-images/toollabs-images/+/5cbb112f4b2f6c674214ea8981f96fc627f61ffb%5E!/
[21:18:28] oh, yeah that could have done it. There was a related change the lighttpd config from that too right taavi?
[21:18:32] * bd808 forgot about this
[21:20:51] yeah, https://gerrit.wikimedia.org/r/plugins/gitiles/operations/docker-images/toollabs-images/+/refs/heads/master/shared/lighttpd/webservice-runner doesn't have the prior python cgi stuff either.
[21:21:14] bd808: https://gerrit.wikimedia.org/r/c/operations/docker-images/toollabs-images/+/1012753
[21:21:32] python3 was previously pulled in to the image via webservice-runner. so rewriting it in bash removes it
[21:22:02] (fun fact: the first ever user-facing issue on Toolforge caused by me was the exact same thing but from a python2->3 update)
[21:22:43] So at this point I should just wait for /usr/bin/python to be restored?
[21:24:54] @Ahecht: probably yes, unless you want to jump into working on a build service replacement for yourself. The fix that taavi is working on will stop working at some point in the future as we add newer php versions without the backwards compat hack.
[21:26:46] The build pack stuff is above my head right now, so I'll just leave that as a problem for the future.
[21:27:31] Ahecht Is there a repo for afdstats ?
[21:28:06] (I can try to see if buildpacks will work if I get some free time)
[21:28:21] https://gitlab.wikimedia.org/toolforge-repos/afdstats
[21:28:53] !log tools kick off full container image rebuild for https://gerrit.wikimedia.org/r/1012753 (python3 backwards compat in lighttpd images) and https://gerrit.wikimedia.org/r/1010690 (add procps to base images)
[21:28:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[21:32:01] @Ahecht: that looks like a project that should be fairly straight forward to port from python cgi to python wsgi. I understand that is a potential learning curve, but it might make your future self happier to have something that is a closer fit to "standard" Toolforge patterns.
[21:32:33] Maybe you can nerd snipe someone like @sohom_datta into doing the initial port for you. ;)
[21:37:11] Yeah , it's on my to-do list for when I have more free time (although any help would be appreciated)
[21:45:42] I'll try and get to it sometimes this week hopefully, it seems easy (imo) :)
[22:05:15] @Ahecht: your tool should now work again if you do a `webservice restart`.
[22:09:24] !log taavi@tools-bastion-12 tools.afdstats toolforge webservice restart
[22:09:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.afdstats/SAL
[22:11:54] Yup, thanks for the quick response!
[23:54:59] !log bd808@tools-sgebastion-10 tools.wikibugs-testing Update and restart irc task to pick up 96800c53 in MR!15
[23:55:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs-testing/SAL
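Editorial sketch of what the "python cgi to python wsgi" port suggested at 21:32:01 could look like in general terms. This is not the afdstats code, which does not appear in this log; the query parameter `name` and the HTML body are hypothetical placeholders. The only fixed part is the standard WSGI callable signature `application(environ, start_response)`.

```python
from html import escape
from urllib.parse import parse_qs


def application(environ, start_response):
    """Minimal WSGI entry point, standing in for a CGI script.

    A CGI script reads QUERY_STRING from the process environment and
    prints headers and body to stdout; a WSGI app receives the same
    values in `environ` and returns the body as an iterable of bytes.
    """
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical parameter chosen only for illustration.
    name = params.get("name", [""])[0]
    body = f"<p>Results for {escape(name)}</p>".encode("utf-8")
    # Headers that a CGI script would print go through start_response.
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

Any WSGI server can import and serve `application`; for local testing, the standard library's `wsgiref.simple_server.make_server` is enough.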