[00:13:13] !log cloudinfra enc-2.cloudinfra.eqiad1.wikimedia.cloud: `service puppet-enc-git-worker restart` (T329589) [00:13:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cloudinfra/SAL [00:13:16] T329589: gerrit copy of cloud/instance-puppet stopped replicating - https://phabricator.wikimedia.org/T329589 [00:18:02] !log cloudinfra enc-2.cloudinfra.eqiad1.wikimedia.cloud: `shutdown -r now` (T329589) [00:18:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cloudinfra/SAL [03:55:43] croptool down :) [03:57:15] danmichaelo: ^ [03:59:54] !log tools.croptool Ran because it wasn't running post-outage [03:59:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.croptool/SAL [04:00:07] stemoc: {{fixed}} [04:00:23] \o/ [04:37:14] PAWS is still dead. Is it proper to file an Unbreak now report on Phabricator? [04:37:42] Also, BaGLAMa2 is borked it seems - I’m not sure if it requires a special restart procedure - https://glamtools.toolforge.org/baglama2/index.html# [04:38:32] T329581 [04:39:27] Fuzheado: can you expand on what you mean by borked? [04:39:56] idk why the bot didn't expand the title. that ticket is PAWS [05:00:19] T329581 [05:00:20] T329581: PAWS down - https://phabricator.wikimedia.org/T329581 [05:00:37] It may be ignoring the bridgebot [05:01:59] blah T329581 [05:02:26] blah? [05:02:29] or it will only respond to just the task id with nothing before [05:02:56] or it doesn't respond to duplicate of recent expansion [05:03:28] bridgebot put the username before the task id, and so the bot didn't respond [05:06:16] clearly not blah. and you didn't rule out duplicate. 
https://t.me/wmcloudirc/49356 [05:06:33] that was a test [05:07:13] I understood that and I gave you more things to test [05:07:28] and with that good night :) [05:07:38] 1Legoktm: there should be a long list of categories that it is tracking, but it is all blank [05:09:18] 2023-02-14 04:46:48: (mod_fastcgi.c.421) FastCGI-stderr: PHP Warning: mysqli::set_charset(): Couldn't fetch mysqli in /data/project/glamtools/baglama2/baglama.php on line 157 [05:09:45] lemme try restarting, that error only showed up today [05:12:19] Fuzheado: is it better now? [05:20:57] legoktm still missing cats. cf. https://web.archive.org/web/20230209152655/https://glamtools.toolforge.org/baglama2/ [05:21:09] /https://glamtools.toolforge.org/baglama2/ [05:21:28] https://glamtools.toolforge.org/baglama2/ [05:22:08] :/ ok, I don't think I can help, will need Magnus to intervene [05:34:23] 1Thanks anyway. Yeah, I’m afraid it’s still no bueno [05:35:12] Petscan needed a manual restart after the Cloud outage. It’s very possible BaGLAMa2 needs some manual restart too [06:00:08] I did a `webservice restart`, but it didn't seem to be enough [06:01:16] I was thinking maybe DB missing a la mars or corrupt somehow [08:09:28] !log cloudinfra arm keyholder on enc-2 T329589 [08:09:31] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Cloudinfra/SAL [08:09:32] T329589: gerrit copy of cloud/instance-puppet stopped replicating - https://phabricator.wikimedia.org/T329589 [08:15:42] !log paws empty profile::wmcs::paws::control_nodes hiera key to bring PAWS back up (T329581), it contained the hostnames of the old kubeadm backed cluster which should be cleaned up properly in T327674 [08:15:46] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL [08:15:47] T327674: Remove puppet code related to paws kubeadmin cluster - https://phabricator.wikimedia.org/T327674 [08:15:47] T329581: PAWS down - https://phabricator.wikimedia.org/T329581 [08:16:43] PAWS is back up I 
think [08:18:40] !log project-proxy remove proxies referring to maps-tiles1 to get the maps proxy back up [08:18:41] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Project-proxy/SAL [08:27:27] !log tools.quickstatements run the './start_bot.sh' script to try to start background batch processing [08:27:28] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quickstatements/SAL [12:02:53] !log toolsbeta included tools-manifests 0.25 in toolsbeta-buster aptly repo (T329611, T329467, T244809) [12:02:59] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [12:02:59] T329611: Toolforge grid: start webservices after outage - https://phabricator.wikimedia.org/T329611 [12:02:59] T329467: remove webservicemonitor (down due to DNS errors) - https://phabricator.wikimedia.org/T329467 [12:02:59] T244809: Remove or fix stats collecting from tools-manifest (webservice-monitor) - https://phabricator.wikimedia.org/T244809 [12:09:03] do we have any webservice in the grid that is DOWN as a result of yesterday outage that should be UP instead? I'm deploying an update to the mechanism that should make keep it auto UP [12:09:57] !log tools included tools-manifests 0.25 in tools-buster aptly repo, deploying it now! 
(T329611, T329467, T244809) [12:10:02] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [12:10:02] T329611: Toolforge grid: start webservices after outage - https://phabricator.wikimedia.org/T329611 [12:10:03] T329467: remove webservicemonitor (down due to DNS errors) - https://phabricator.wikimedia.org/T329467 [12:10:03] T244809: Remove or fix stats collecting from tools-manifest (webservice-monitor) - https://phabricator.wikimedia.org/T244809 [12:12:10] !log tools the fixed webservicemonitor is starting a bunch of grid webservices (T329611) [12:12:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [13:06:51] !log tools.my-first-pywikibot-tool shut down webservice [13:06:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.my-first-pywikibot-tool/SAL [13:16:55] !log tools.hexacore removed service.manifest, tool is otherwise totally empty so it was failing to start. marked for deletion as it's been inactive since 2017 [13:16:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.hexacore/SAL [13:17:02] !log admin restarting all eqiad1 openstack services because that seems to sometimes help things *shrug* [13:17:04] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL [13:27:45] !log paws Bump oauthlib 97f241bacff7af60cebfffa0d8eb945c7545577d [13:27:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Paws/SAL [13:29:41] !log tools.facebook-messenger-chatbot disable webservice failing to start due to dependency issues, left the maintainer a talk page message [13:29:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.facebook-messenger-chatbot/SAL [13:38:25] !log tools.movestats disable webservice failing to start due to dependency issues, left the maintainer a talk page message [13:38:27] Logged the message at 
https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.movestats/SAL [13:43:12] !log tools.crosswatch disable webservice failing to start due to dependency issues, disabled as maintainer has been inactive and unreachable since 2015 [13:43:13] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.crosswatch/SAL [13:52:33] !log tools.germancon-mobile Disabled cron jobs updating the schedule of a conference in 2019 that were running every minute. Not a great use of shared resources. [13:52:35] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.germancon-mobile/SAL [14:34:37] !log tools.pmidtool disable webservice failing to start, left the maintainer a talk page message [14:34:38] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.pmidtool/SAL [15:07:46] !log tools import cert-manager components to local docker registry T329453 [15:07:49] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL [15:07:49] T329453: Deploy cert-manager to Toolforge - https://phabricator.wikimedia.org/T329453 [15:26:54] !log glams create proxy glams and musei as requested by Dario Crespi WMIT [15:26:56] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Glams/SAL [17:33:42] Happy #ilovefs day everyone! Thank you to all of y'all who make FOSS tools and share them with the world. <3 [19:26:08] are there any known NFS issues due to yesterday’s outage? 
[19:26:19] this is what the lexeme-forms tool’s git repo looks like : https://tools-static.wmflabs.org/bridgebot/7322e77c/file_45450.jpg [19:26:36] but according to the SAL at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL, HEAD should be at “update Danish noun template” [19:27:07] no, and the Toolforge NFS servers were unaffected by the outage [19:27:22] okay, I peeked at the ~/.bash_history and it looks like I just forgot to git rebase [19:27:58] so, my bad, sorry ^^ [19:30:07] !log tools.lexeme-forms deployed bfd63ebac1 (l10n updates: hno); also, turns out I didn’t git rebase in the last deployment, so this *actually* deploys the Danish nouns update and pl l10n update [19:30:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL [19:31:56] but updates of SAL could have been affected? [19:40:54] at first I thought fs was filesystem (re @wmtelegram_bot: Happy #ilovefs day everyone! Thank you to all of y'all who make FOSS tools and share them with the world. <3) [19:52:17] “Device or resource busy: '.nfs0000000006e42f5500000002'” joy [19:52:26] I assume this means I need to stop the webservice before I can update the venv [19:54:25] Sounds like it, or at least that sounds like you did an `rm` on a file that was still open and the NFS server gave it a .nfs* orphan inode name. [19:54:50] yeah, this happens when pip tries to remove an old package version [19:55:46] zero downtime updates are not very likely to work until we get buildpacks and the custom containers they will create. 
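A minimal sketch related to the `.nfs*` explanation at 19:54:25: before updating a venv that lives on NFS, one could check whether any of those placeholder files are still around, which suggests a running process still holds a deleted file open. The helper name `find_nfs_placeholders` and the example path are hypothetical and not part of any Toolforge tooling; the code only walks a directory tree and lists files whose names start with `.nfs`.

```python
import os
from pathlib import Path


def find_nfs_placeholders(root):
    """Return sorted paths of files under root whose names start with '.nfs'.

    NFS clients rename a file that is deleted while still open to a
    '.nfsXXXX' placeholder; the placeholder goes away once the last
    process holding the file closes it.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(".nfs"):
                found.append(Path(dirpath) / name)
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical venv location; adjust to the tool's real layout.
    for path in find_nfs_placeholders(Path.home() / "www/python/venv"):
        print(path)
```

An empty result does not prove that nothing has files open; it only means no deleted-but-open files are currently visible in that tree.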
[19:56:34] I mean, I’ve been doing them pretty successfully for some time now [19:56:49] anyway, moving the venv out of the way and making a new one instead seems to have worked better [19:57:22] (this would probably blow up now if the running webservice tried to load any new packages, but I have no reason to assume it will ^^) [19:58:54] !log tools.lexeme-forms deployed 9debac9385 (update dependencies, especially Werkzeug 2.2.3 with two security fixes; venv rebuilt from scratch to avoid NFS issues) [19:58:57] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL [20:00:06] bd808: I patched the webservice deployment to add a startup probe https://gitlab.wikimedia.org/toolforge-repos/lexeme-forms/-/blob/main/patch-add-startup-probe.yml [20:00:07] so if I restart with `kubectl rollout restart deployment lexeme-forms` instead of `webservice restart`, k8s will only terminate the old pod once the new one is ready [20:00:37] :oooo [20:00:54] of course it’s not the same as having the dependencies isolated in a buildpacked container rather than shared over NFS, but it’s been working pretty well for me [20:01:12] (I’m still looking forward to the things to come though :)) [20:03:24] Are all the scripts for pwb missing on PAWS now? I only get for example "ERROR: replace.py not found! Misspelling?" for all the scripts i have tried [20:06:32] They might be. Do you know if there was a path that pywikibot was searching? 8 has updated a number of things [20:08:22] AFAIK there is/was an issue with the wrapper pwb.py script [20:08:46] not sure if it affected PAWS, but certainly did affect me on Toolforge [20:08:54] let me search the Task [20:09:04] I'm not sure that the scripts are part of the pywikibot 8.0 pypi package... [20:09:46] Rook: I guess there was a path that pywikibot was, but I dont know what it was. Tried searching for category.py or replace.py with find on PAWS, but i could not find it. 
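The "move the venv out of the way and make a new one" approach from 19:56:49 could look roughly like the sketch below. This is an assumption about the general idea, not the commands that were actually run; `rebuild_venv` and the backup naming scheme are hypothetical. It uses only the standard-library `venv` module, and dependencies would still need to be installed into the new environment afterwards.

```python
import time
import venv
from pathlib import Path


def rebuild_venv(venv_dir, with_pip=True):
    """Move an existing venv aside and create a fresh one at the same path.

    Returns the path of the backup directory, or None if there was no
    existing venv to move.
    """
    venv_dir = Path(venv_dir)
    backup = None
    if venv_dir.exists():
        # Renaming deletes nothing, so files the running webservice already
        # has open stay intact. Modules it has not imported yet will no
        # longer be found at the old path.
        backup = venv_dir.with_name(f"{venv_dir.name}.old-{int(time.time())}")
        venv_dir.rename(backup)
    venv.create(venv_dir, with_pip=with_pip)
    return backup
```

Because nothing is removed while the old process is running, pip never has to delete a file that is still open, which is what triggered the "Device or resource busy" error. The backup directory can be removed once the webservice has been restarted.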
[20:10:04] that would explain it [20:10:08] tholme: are you using it with pwb.py ? [20:10:15] tholme: I couldn't either, I'm loading up an older version locally to see if it had it there [20:10:38] https://phabricator.wikimedia.org/T324287 <-- maybe? [20:11:09] if it's in paws it no longer has pwb.py, seemed to vanish with 8, though I switched to pip install to get it working, the old (I think the code was 7 years old now) way wasn't working [20:12:50] only pwb is working on PAWS now (not pwb.py, it gives error bash: pwb.py: command not found) [20:12:50] The old `git clone --recursive` would have the scripts collection. I'm pretty sure that the pip package does not include the scripts. [20:13:31] Looks like it use to be in /srv/paws/pwb/scripts I wonder where it calls it from... [20:18:17] in /srv/paws/lib/python3.10/site-packages/pywikibot now, and indeed that looks to only contain some basic scripts, login version few others. Let's try git [20:25:36] `pwb` alone works [20:26:38] !log tools.wd-image-positions deployed d498c2ecbd (update dependencies, especially Werkzeug 2.2.3 with two security fixes) [20:26:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wd-image-positions/SAL [20:27:52] btw, for other python tool maintainers: I’m happy with `pip-tools` and would recommend it :) [20:28:14] in a nutshell, it gives you a “separate dependencies and lockfile” workflow like you know from e.g. 
npm or composer [20:28:45] (rather than the usual python experience, where `pip install -r requirements.txt` installs who knows which versions of the listed packages) [20:29:20] apparently there’s a lot of discussions about python packaging happening at the moment but those are my two cents ^^ [20:32:34] Looking inside the pypi pywikibot8 pack, there are missing files I'd expect to find in /scripts [20:33:04] (hilariously at $work we're looking to replace pip-tools with other stuff, most likely poetry, but for the Toolforge use case it's probably pretty good) [20:35:04] I switched to a pip install because I was getting a requests error when trying a manual install with git https://phabricator.wikimedia.org/T326512 [20:35:04] What do we think we would break by overwriting the contents of /srv/paws/lib/python3.10/site-packages/pywikibot/scripts with the contents of the scripts directory out of git? [20:36:16] legoktm: oh no :D [20:36:51] Rook: what about fixing the pip file and reloading afterwards? [20:37:03] is poetry the one that also wants to manage python itself? in that case it would probably not work well for toolforge, yeah [20:37:25] herzog: I'm not sure what it means to fix a pip file? [20:37:27] reloading = pip install etc [20:38:04] Rook: sorry, I mean, let PWB folks know the package is missing files, etc [20:38:08] and then pip install again? [20:38:57] Oh, I was under the assumption that the extra scripts were intentionally left out of the pip version. Perhaps this is not the case? [20:39:38] !log tools.quickcategories deployed 011455de8f (update dependencies, especially Werkzeug 2.2.3 with two security fixes) [20:39:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quickcategories/SAL [20:39:46] As they're telling me, yes, they're intentionally out [20:39:49] hmpf [20:40:59] Unfortunately, the pywikibot PyPI package does not include most scripts. You need to install from git. 
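To confirm which scripts a given install actually ships, a small check like the following could be used. It is only a sketch: `missing_scripts` is a hypothetical helper, the directory is the one quoted at 20:35:04, and the script names are ones mentioned in this discussion.

```python
from pathlib import Path


def missing_scripts(scripts_dir, wanted):
    """Return the names in `wanted` that have no file in scripts_dir."""
    scripts_dir = Path(scripts_dir)
    return [name for name in wanted if not (scripts_dir / name).is_file()]


if __name__ == "__main__":
    # Directory quoted in the discussion above; adjust for other installs.
    pip_scripts = "/srv/paws/lib/python3.10/site-packages/pywikibot/scripts"
    print(missing_scripts(pip_scripts, ["replace.py", "category.py", "login.py"]))
```

Running the same check against the `scripts` directory of a git checkout would show whether the difference really is pip versus git.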
[20:41:00] Yeah, you need to install from git to use the scripts [20:44:09] Any thoughts on what is causing the requests issue when installing from git? [20:46:30] Could just add a function that solves everything :p https://www.irccloud.com/pastebin/8kbgeKmD/ [20:48:53] @lucaswerkmeister: no (you're probably thinking of conda), poetry is more like npm/composer/cargo in which you specify dependencies and it locks them and (optionally) sets up a virtualenv for you [20:49:08] ok thanks, I lost track then :) [20:50:33] related: T229172 [20:50:34] T229172: Add pipenv or poetry support for webservice-python-bootstrap script - https://phabricator.wikimedia.org/T229172 [20:50:44] Rook: not sure. is there a demo/test place that I can poke around where it is installed from git? [20:52:19] !log toolsbeta deploy cert-manager to toolsbeta T329453 [20:52:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL [20:52:22] T329453: Deploy cert-manager to Toolforge - https://phabricator.wikimedia.org/T329453 [20:52:23] I regularly use git and PyPI installs on Toolforge k8s without seeing an issue with requests. [20:54:02] I should be able to set one up in paws-dev, though the container would have to be built for it which takes awhile. Probably faster to run `RUN git clone --branch 8.0.0 --recursive https://gerrit.wikimedia.org/r/pywikibot/core.git /srv/paws/pwb` from paws and use `pwb.py` [20:54:56] JJMC89: which versions are you running with git? [20:55:49] I use the stable branch from git, which is 8.0.0 with some i18n updates that would not impact this [21:03:19] Odd, running with python seems to work https://www.irccloud.com/pastebin/dRdRPdU7/ [21:05:49] is that python different from the one at /usr/bin/python3 (from the pwb.py shebang)? [don't know how things are setup in PAWS] [21:07:02] No, one is a link to the other [21:09:43] Which trickles down through a series of links to the same /usr/bin/python3 called in the #! 
[21:09:45] !log tools.quickcategories deployed 3748a8600e (fix empty titles in runner; hopefully resolves the CrashLoopBackOff which was at 21487 restarts 😬) [21:09:47] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.quickcategories/SAL [21:10:48] looks like that worked, phew [21:11:05] sorry ’bout all those crashes [21:13:52] don't know why that would happen then [21:14:12] Yes, I'm finding it mysterious myself [21:14:18] am I able to access the dev instance somewhere? [21:16:22] Oh I haven't built one, I just have been playing with the main paws after doing a git install of pwb [21:16:37] Probably a relative path problem https://www.irccloud.com/pastebin/jRmCPemL/ [21:17:01] I ran the git clone after connecting up to hub.paws.wmcloud.org [21:18:46] I believe this is the change that is making it fail in paws https://www.irccloud.com/pastebin/YCnoMubS/ [21:19:37] Rook: is paws using a venv? [21:20:22] RhinosF1: it is [21:21:16] Rook: then referring back to system python will probably break it [21:21:30] Because yes, the package is not there [21:21:40] RhinosF1: I concur [21:22:20] Why is pywikibot not using a shell script thing though, that would handle all that as long as the venv’s bin is on $PATH [21:22:35] !log tools.speedpatrolling deployed 8b7b5c0ca6 (update dependencies, especially Werkzeug 2.2.3 with two security fixes) [21:22:36] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.speedpatrolling/SAL [21:22:56] RhinosF1: sorry I don't follow, what's a shell script thing in this case? [21:23:37] so y'all are correct, /usr/bin/python3 and /usr/bin/env python3 are two different environments [21:24:01] Rook: you can set entry points in the package metadata so python generates a proper file that will work but that would be something like ‘pwb’ instead of ‘pwb.py’ [21:24:14] JJMC89: I don't suppose the reasoning behind switching to /usr/bin/python3 was minor and it could switch back? 
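The difference between the two shebang styles can be checked from inside any Python session. The sketch below is illustrative only; the function names are hypothetical, and it assumes a standard `venv`-style environment where `sys.prefix` differs from `sys.base_prefix`.

```python
import shutil
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a venv."""
    return sys.prefix != sys.base_prefix


def shebang_targets():
    """Return the interpreter each shebang style would resolve to.

    '#!/usr/bin/python3' always names that fixed path, while
    '#!/usr/bin/env python3' picks the first python3 found on $PATH,
    which is the venv's interpreter when the venv's bin directory
    comes first.
    """
    return {
        "#!/usr/bin/python3": "/usr/bin/python3",
        "#!/usr/bin/env python3": shutil.which("python3"),
    }


if __name__ == "__main__":
    print("running in a venv:", in_virtualenv())
    for shebang, target in shebang_targets().items():
        print(f"{shebang} -> {target}")
```

When the two targets differ, a script started through the fixed path will not see packages installed only in the venv, which matches the missing `requests` symptom discussed above.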
[21:26:52] we do have a pwb entry point - pwb.py was kept for backwards compatibility; though the entry point doesn't work for most scripts T324287 [21:26:53] T324287: pwb console script doesn't find scripts in the scripts folder but pwb.py does - https://phabricator.wikimedia.org/T324287 [21:28:28] Rook: not sure why it got changed - the env version is more flexible [21:29:30] JJMC89: Score! A potentially simple path forward may exist! [21:29:56] Rook: it looks like the change was an oversight [21:29:58] !log tools.pagepile-visual-filter deployed f9dab89739 (update dependencies, especially Werkzeug 2.2.3 with two security fixes) [21:30:00] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.pagepile-visual-filter/SAL [21:30:07] So maybe someone can do a patch [21:30:51] it was likely to be consistent with our other shebangs, which don't use env [21:31:00] I'll put up a patch to change them all [21:32:55] Ah all is better than one, I'll close my patch [21:35:43] !log tools.ranker deployed a78bc0cfd0 (update dependencies, especially Werkzeug 2.2.3 with two security fixes) [21:35:45] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.ranker/SAL
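For anyone needing a local workaround before an upstream fix lands, the shebang change discussed above amounts to rewriting the first line of a script. The following is a hypothetical sketch, not the patch that was put up for review; `use_env_shebang` is an invented name and it only handles the exact `#!/usr/bin/python3` form.

```python
def use_env_shebang(source):
    """Rewrite a leading '#!/usr/bin/python3' line to '#!/usr/bin/env python3'.

    Trailing whitespace on that first line is dropped. Text that does not
    start with exactly that shebang is returned unchanged.
    """
    first, sep, rest = source.partition("\n")
    if first.rstrip() == "#!/usr/bin/python3":
        return "#!/usr/bin/env python3" + sep + rest
    return source
```

Applied to the contents of `pwb.py` in a git checkout, this would make the script run under whichever `python3` comes first on `$PATH`, so an active venv's interpreter and its installed packages are used.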