[00:22:28] bd808: I'm going to bed now, the most urgent task is emergency-solved, and I hope the tool can be fixed by tomorrow morning (with or without pdflatex) for other tasks, I'll read what you write here
[00:22:49] And thanks again for the effort
[00:23:40] thanks jem.
[00:24:10] And thank you, balloons :) (and interesting nick!)
[03:06:20] https://croptool.toolforge.org getting 502 errors
[03:08:20] stemoc: Likely related to T309821
[03:08:21] T309821: Buster webservice grid went BOOM! - https://phabricator.wikimedia.org/T309821
[03:08:32] I'm about to try a hacky fix
[03:08:44] lolz BOOM
[03:10:37] !log tools publish tools-webservice 0.85 with hack for T309821
[03:10:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[03:21:45] !log tools Cleared queue error states after deploying new toolforge-webservice package (T309821)
[03:21:48] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[03:21:48] T309821: Buster webservice grid went BOOM! - https://phabricator.wikimedia.org/T309821
[03:23:31] I think that mostly worked!
[04:43:04] hour later, still dead :P
[04:52:18] !log tools revert bd808's changes to profile::toolforge::active_proxy_host
[04:52:20] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[04:54:06] ^ the "which one is an active redis server" logic uses the short hostname, so using the fqdn made both hosts think they are read only for webservice updates
[05:05:21] !log tools removing duplicate (there should be only one per tool) web service jobs from the grid T309821
[05:05:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[05:05:24] T309821: Buster webservice grid went BOOM! - https://phabricator.wikimedia.org/T309821
[06:08:33] Hi, everyone! We're running a fairly large marathon on uzwiki. We've been relying on Fountain to keep track of our progress. This morning Fountain stopped working. Can anyone help us fix it? The person who runs the platform thinks it has something to do with cloud services. https://fountain.toolforge.org/editathons/wiki-stipendiya-marafoni
[07:04:13] Hi, is anyone working in T309821 at the moment? My webtool is still down...
[07:04:13] T309821: Buster webservice grid went BOOM! - https://phabricator.wikimedia.org/T309821
[08:03:41] yep never had toolserver* ever be down this long..
[08:05:44] taavi maybe?
[09:22:38] jem: still going to take an hour or so before I can look properly :/ I have a pretty good idea what's broken, just going to be a massive pain to fix it
[09:43:38] Ok, taavi, thanks
[09:44:09] I can wait "calmly" for about 3 hours more or so
[10:36:29] !log tools draining each sgeweblight node one by one, and removing the jobs stuck in 'deleting' too
[10:36:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:56:56] jem: hmm. I started fountain on an empty node, and now I see it spawned a mono process using multiple CPU cores.. is it doing some heavy background processing or what?
[10:59:23] stemoc: croptool is back up I think.
[10:59:51] \o/ praise baby brion
[11:02:37] lolwut
[11:02:51] i cropped one image and another got saved ....
[11:04:33] https://i.imgur.com/MXldI5G.png i cropped this
[11:04:47] https://i.imgur.com/XqVBru3.png this was saved lol
[11:18:47] apparently someone else was cropping the same image at the same time but for the other woman in the picture and it got swapped .. weird..never seen that before
[11:19:15] well not swapped, just my crop ignored
[11:23:07] taavi: Thanks, it's working :), and pdflatex works (that was my original problem), although there seems to be a problem with the images but I hope I can fix it
[11:24:25] !log tools.fourohfour restart all pods, fully unresponsive
[11:24:28] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.fourohfour/SAL
[11:30:11] Nevermind about that problem, there was an undeleted test in my code, everything ok now
[12:37:26] !log tools.integraality Deploy 9e85ede (T309861)
[12:37:29] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.integraality/SAL
[12:46:50] !log tools start webservicemonitor on tools-sgecron-01 T309821
[12:46:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[12:46:53] T309821: Buster webservice grid went BOOM! - https://phabricator.wikimedia.org/T309821
[13:17:52] !log toolsbeta publish tools-webservice 0.86 (T309821)
[13:17:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[13:17:55] T309821: Buster webservice grid went BOOM! - https://phabricator.wikimedia.org/T309821
[13:20:33] !log tools publish tools-webservice 0.86 (T309821)
[13:20:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[13:25:14] !log tools Upgrading fleet to tools-webservice 0.86 (T309821)
[13:25:17] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[13:25:17] T309821: Buster webservice grid went BOOM! - https://phabricator.wikimedia.org/T309821
[13:43:30] !log tools.glamtools Added -once flag to crontab entries for "next" and "just_added" tasks
[13:43:32] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.glamtools/SAL
[13:58:38] !log tools.glamtools https://bitbucket.org/magnusmanske/glamtools/issues/91/fyi-once-flag-added-to-crontab-entries-for
[13:58:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.glamtools/SAL
[14:02:50] !log tools.wikicontrib Moved webservice from buster grid engine to kubernetes
[14:02:51] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikicontrib/SAL
[15:49:40] !log tools temp add 1.0G swap to sgeweblight hosts t309821
[15:49:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:50:14] !log tools fix fix g3.cores4.ram8.disk20.swap24.ephem20 flavor to include swap. Convert to fix g3.cores4.ram8.disk20.swap8.ephem20 flavor t309821
[15:50:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:50:41] wow, lowercase t doesn't work :-(
[15:50:47] !log tools temp add 1.0G swap to sgeweblight hosts T309821
[15:50:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:50:49] T309821: Buster webservice grid went BOOM! - https://phabricator.wikimedia.org/T309821
[15:50:53] !log tools fix fix g3.cores4.ram8.disk20.swap24.ephem20 flavor to include swap. Convert to fix g3.cores4.ram8.disk20.swap8.ephem20 flavor T309821
[15:50:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[16:27:02] !log tools.fountain Moved back to stretch grid engine in attempt to get things working while an active editathon is happening
[16:27:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.fountain/SAL
[16:32:13] @ataevnodir: I got the fountain tool running again by moving it back to the --release=stretch nodes on the grid engine (probably nonsense words to you if you are not a tool maintainer in Toolforge). Le Loy will need to take a look at some point and move it back to the --release=buster nodes, but it should stay up for now.
[16:59:09] !log tools building a bunch of new lighttpd nodes (beginning with tools-sgeweblight-10-12) using a flavor with more swap space
[16:59:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[17:24:53] !log tools.sge-status Temporarily moving back to stretch
[17:24:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.sge-status/SAL
[18:24:09] My webservicr jembot is stuck again :(
[18:24:19] webservice*
[18:25:05] I try to stop and restart but keeps saying that is already up, and I get 502
[18:25:23] Any help?
[18:29:05] !help My webservice jembot is stuck again
[18:29:05] If you don't get a response in 15-30 minutes, please create a phabricator task -- https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=wmcs-kanban
[18:29:28] jem, sadly T309821 is still on-going
[18:29:28] T309821: Buster webservice grid went BOOM! - https://phabricator.wikimedia.org/T309821
[18:29:37] jem: Best to just hold tight for a bit, we're still working on things
[18:32:35] Thanks, balloons and andrewbogott
[18:33:41] Please ping me if there are news
[18:39:09] I'm guessing this is why https://sigma.toolforge.org/ is down?
[18:59:14] roy649, yes
[18:59:59] !log tools depooled old nodes, bringing entirely new grid of nodes online T309821
[19:00:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[19:00:02] T309821: Buster webservice grid went BOOM! - https://phabricator.wikimedia.org/T309821
[19:08:08] New nodes are coming online now
[19:08:31] But I would caution the underlying issue remains
[19:08:45] roy649, is part of your tool on k8s?
[19:14:40] Jembot is back for now :)
[19:16:54] 👍
[19:17:12] Hi, I'm trying to debug why a tool on toolforge can no longer access pywikibot. It was working fine until this morning - did something change?
[19:17:47] MikePeel: hello, which tool
[19:17:56] qic
[19:18:55] !log tools.toolschecker-ge-ws Moved to --release=buster
[19:18:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.toolschecker-ge-ws/SAL
[19:19:09] see https://tools-static.wmflabs.org/qic/Fri_03_Jun_2022_04%3A58%3A09_AM_UTC.txt
[19:19:34] MikePeel: looks like your tool is affected by https://lists.wikimedia.org/hyperkitty/list/cloud-announce@lists.wikimedia.org/thread/CUWV6ML7NBLST2XE57BWYM6MV2FVQYOR/. likely you need to re-create the venv on a buster bastion
[19:20:05] instructions for that are here: https://wikitech.wikimedia.org/wiki/News/Toolforge_Stretch_deprecation#Rebuild_virtualenv_for_python_users
[19:21:18] urg, that's a big change
[19:21:42] is there a way to find what's in the venv and needs to be reinstalled?
[19:21:49] (I just got access to this tool a few days ago!)
[19:22:38] you probably can activate the venv on a stretch bastion and then use `pip freeze`
[19:23:06] !log tools.notwikilambda updated pygments-server to f2c760810b (minor bugfix in pygmentize wrapper script)
[19:23:08] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.notwikilambda/SAL
[19:23:36] Traceback (most recent call last):
[19:23:37]   File "/mnt/nfs/labstore-secondary-tools-project/qic/qic_venv/bin/pip", line 7, in <module>
[19:23:37]     from pip._internal.cli.main import main
[19:23:38] ModuleNotFoundError: No module named 'pip'
[19:23:39] hmm
[19:24:29] sounds like that's not a stretch bastion
[19:25:39] try `login-stretch.tools.wmflabs.org`
[19:26:15] replacing 'login.toolforge.org' ?
[19:27:08] yes, that venv was created on debian 9 ('stretch'), so it only works on hosts running debian stretch
[19:27:23] ok, that worked, thanks
[19:27:38] so it broke when we changed the default from stretch to buster (aka debian 10)
[19:27:41] !log tools.notwikilambda stopped webservice and update deployment, something’s broken
[19:27:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.notwikilambda/SAL
[19:28:04] to fix it, you need to re-create that virtualenv on a host that runs debian 10, such as login.toolforge.org
[19:28:26] OK, so I have the list of python modules
[19:28:55] so I back out, log in again, and then follow 'Example 1: Upgrading a Stretch grid engine based tool to the Buster grid' - right?
[19:29:10] sounds good
[19:29:44] @lucaswerkmeister if webservice is on grid there's known issues
[19:29:58] I’m aware, this is fully kubernetes
[19:30:04] Ah!
[19:30:19] (apparently something in vendor/ was so broken that `composer update` failed; doing a fresh install now and it seems to be working better)
[19:33:42] thanks. Backing up first!
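The rebuild plan discussed above (capture the package list with `pip freeze` on a stretch host, then re-create the virtualenv on a Debian 10 host) can be sketched as two sets of commands. This is a minimal sketch, not the documented Toolforge procedure: the function names, the example paths, and the choice of `python3 -m venv` to create the new environment are assumptions for illustration. The functions only build the command lists; nothing is executed.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def freeze_command(old_venv: str) -> list[str]:
    """Command for the Debian Stretch host: list the installed packages."""
    return [str(PurePosixPath(old_venv) / "bin" / "pip"), "freeze"]


def rebuild_commands(new_venv: str, requirements: str) -> list[list[str]]:
    """Commands for the Debian Buster host, in the order they should run."""
    pip = str(PurePosixPath(new_venv) / "bin" / "pip")
    return [
        ["python3", "-m", "venv", new_venv],    # create a fresh Python 3 venv
        [pip, "install", "--upgrade", "pip"],   # upgrade pip inside the new venv
        [pip, "install", "-r", requirements],   # reinstall the frozen package list
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(freeze_command("old_venv")), "> requirements-freeze.txt")
    for argv in rebuild_commands("venv", "requirements-freeze.txt"):
        print(" ".join(argv))
```

Keeping the freeze step separate from the rebuild steps mirrors the constraint in the conversation: the old venv only works on the host release it was created on, so the package list has to be saved to a file before switching hosts.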
[19:36:42] !log tools.notwikilambda recreated update deployment, update.php crashed within webservice shell and the update deployment should have more memory
[19:36:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.notwikilambda/SAL
[19:41:39] nope, crashes in k8s too
[19:41:56] jem: not sure if you noticed, but jembot seems to be running again since ~19:14 UTC (~30 minutes ago)
[19:42:26] !log tools.notwikilambda deleted update deployment again, update.php also crashed in k8s and I don’t want the code to update while the schema is frozen
[19:42:28] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.notwikilambda/SAL
[19:42:42] !log tools.notwikilambda started webservice again
[19:42:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.notwikilambda/SAL
[19:47:09] ok, backup done, new venv setup
[19:47:26] except that seems to have returned python to 2.7 rather than 3...
[19:47:33] & no pip3?
[19:48:08] how did you create it?
[19:48:20] $ virtualenv venv
[19:48:21] $ source venv/bin/activate
[19:48:21] $ pip install --upgrade pip # upgrade pip itself to avoid problems with older versions
[19:48:23] per instructions
[19:48:25] MikePeel: use `python3` on bastions and `python3 -m venv $DIRECTORY` to make a new venv
[19:49:00] virtualenv is mostly dead python2 tech
[19:49:08] * taavi fixes the docs
[19:49:14] thanks. :)
[19:50:48] roy649: sigma tool is down because it was migrated to Debian Buster without anyone rebuilding its venv. I'm going to try and fix that now.
[19:51:11] bd808: Yes, I restarted, it worked and I reported here just at 19:14 :)
[19:51:23] Thanks anyway
[19:51:33] heh. :) I tried
[19:51:35] !log tools Scaling webservice nodes to 20, using new 8G swap flavor T309821
[19:51:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[19:51:38] T309821: Buster webservice grid went BOOM! - https://phabricator.wikimedia.org/T309821
[19:59:21] !log tools.sigma Rebuilt $HOME/www/python/venv on Debian Buster with packages from $HOME/requirements-freeze.txt which was made with `pip3 freeze` on Debian Stretch
[19:59:23] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.sigma/SAL
[20:00:42] !log tools.sigma Restarted webservice to use new Debian Buster compatible venv
[20:00:43] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.sigma/SAL
[20:08:03] roy649: https://sigma.toolforge.org/ is alive again
[20:09:03] THANKS
[20:30:58] !log tools.cobain Shutdown webservice which has been in infinite crash loop for ... months? $HOME/.lighttpd.conf does not work with modern lighttpd versions.
[20:31:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.cobain/SAL
[20:55:12] !log tools.asurabot Shutdown webservice which has been in infinite crash loop for ... months? $HOME/.lighttpd.conf does not work with modern lighttpd versions.
[20:55:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.asurabot/SAL
[21:01:22] !log tools.clickstream-api Shutdown webservice which has been in infinite crash loop for a long time. Python2 tool with no source or venv changes since 2015
[21:01:26] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.clickstream-api/SAL
[21:09:49] OK, all looks good. Huge thanks to taavi and bd808! Update posted on-wiki at https://commons.wikimedia.org/wiki/User_talk:Dschwen#Anything_I_can_do_to_help_with_the_bot%3F .
[21:10:32] nice work MikePeel :)
[21:11:01] Muchas gracias bd808!
[22:04:08] !log tools.ramp Stopping webservice due to infinite crash loop caused by bad $HOME/.lighttpd.conf which was last edited in 2014. Assuming this tool has been broken since migration to the stretch grid in 2019.
[22:04:10] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.ramp/SAL
[22:08:58] !log tools.saami Shutdown webservice which was in infinite crash loop because of missing $HOME/deno-webservice.bash startup script. Did this ever work?
[22:09:03] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.saami/SAL
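
The 04:54:06 note in this log says the active-host check compares against the short hostname, so a fully qualified name in the setting matched on neither host and both treated themselves as read only. A minimal sketch of that failure mode follows, assuming a plain string comparison; the function names and hostnames are hypothetical, and this is not the actual Puppet or webservice code.

```python
def short_name(host: str) -> str:
    """Return the part of a hostname before the first dot."""
    return host.split(".", 1)[0]


def is_active_naive(configured_active: str, local_short_hostname: str) -> bool:
    """Exact string comparison: never matches when the setting holds an FQDN."""
    return configured_active == local_short_hostname


def is_active_normalized(configured_active: str, local_hostname: str) -> bool:
    """Compare short names on both sides so FQDN and short forms both match."""
    return short_name(configured_active) == short_name(local_hostname)


if __name__ == "__main__":
    configured = "proxy-a.example.org"  # hypothetical FQDN in the setting
    for local in ("proxy-a", "proxy-b"):  # hypothetical short hostnames
        print(local, is_active_naive(configured, local),
              is_active_normalized(configured, local))
```

With the naive comparison neither host considers itself active, which matches the symptom described in the log. Normalizing both sides is one possible guard; reverting the setting to the short hostname, as was done at 04:52:18, is the other.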