[09:02:31] !tools.wikibugs restart redis2irc, grrrrit
[09:50:49] @bd808 @lucaswerkmeister the bridge bot is sending duplicates to telegram again
[09:51:25] !log lucaswerkmeister@tools-sgebastion-10 tools.bridgebot Double IRC messages to other bridges
[09:51:29] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.bridgebot/SAL
[10:01:55] thx @lucaswerkmeister
[10:04:05] !log taavi@tools-sgebastion-11 tools.stewardbots ./stewardbots/StewardBot/manage.sh restart
[10:04:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[10:49:21] !log superpes@tools-sgebastion-10 tools.stewardbots Restarted SULWatcher which had quit from IRC
[10:49:25] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.stewardbots/SAL
[11:21:29] I run some things within WMCS, on a magnum Kube cluster. It randomly stopped working a week ago, not sure what went wrong?
[11:21:33] https://www.irccloud.com/pastebin/y5FvLQWM/
[11:21:47] can't seem to pull logs on the containers
[11:22:48] ah, `Message: The node was low on resource: ephemeral-storage.`
[11:40:47] did the readiness probes just become more aggressive for toolforge web services? I haven't been able to start my tool since this morning (crashloopbackoff), I'm not sure if it became slower to start or if the probes changed
[11:44:32] AFAIK there weren’t any readiness probes before (but now there are, yes)
[11:44:42] see T341919
[11:44:44] yes, the new webservice release deployed yesterday added some probes (cc dcaro). which tool are you having issues on?
[11:45:02] we can check the logs for that crashloopbackoff
[11:45:15] (were livenessProbes also added? readiness shouldn't crash stuff)
[11:45:32] they’re startup and liveness probes
[11:45:36] yep, we added startupProbe and livenessProbe
[11:46:00] ok thank you! :)
[11:46:32] the startup probe should be checking once per second for ~30s, if your tool takes longer it might not have time to come up, we can extend that time if needed
[11:47:41] 20s actually https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/commit/f3bb730630ac7657dccabbf51f849df8059e4185#f253ca4a4a6e084b10292dfd1ee4381ca8234bfb_472_476
[12:42:26] bd808: it seems like the CI config in the wikibugs2 gitlab repo is making all MRs from forks fail due to a missing variable/secret, https://gitlab.wikimedia.org/toolforge-repos/wikibugs2/-/merge_requests/4
[12:42:39] !log tools.wikibugs Updated channels.yaml to: 1d7ee487d6d15c2bd383909a757bb2e2069a6fbe Fix typo in the channel name for wikimedia-data-platform
[12:42:42] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[12:47:14] taavi: :nod: I wondered if that would happen. I’m working on some code layout refactoring and planned to change the config layer next. I should be able to mitigate that problem. It will just involve making the live api subset of tests only run for the maintainers.
[12:48:48] Under zuul/jenkins those tests never ran either, so not the worst regression.
[13:09:34] my tool is spacemedia. It's a very big application so it takes some time to start, I'll see how to change the startup value
[13:15:42] how do we change the value? I don't see anything with webservice -h
[13:23:37] there's not, we can adapt to it, how much time does your tool need?
[13:23:56] as in we can set a bigger value there (it does not hurt other tools)
[13:26:20] I'm patching it locally to see how long it needs
[13:27:01] ack
[13:29:44] with a 60s initial delay, my tool was able to start in 92.622 seconds
[13:30:32] from the application point of view. It might have taken a few seconds more from the container start
[13:31:06] with something like 120 seconds it should be fine for me
[13:32:27] ack, seems like a lot of startup for an app, out of curiosity, what does it do to get running?
[13:39:41] https://gitlab.wikimedia.org/repos/cloud/toolforge/tools-webservice/-/merge_requests/26 <- don-vip that should work right?
[14:01:17] @dcaro: it's an app that allows importing pictures from many different sources (see https://commons.wikimedia.org/wiki/Commons:Spacemedia) and handles duplicates by computing perceptual hashes. It's also the one that flags new duplicates for Commons administrators. I probably could optimize the startup time a bit but never tried to do so before because ~2min was fine for me
[14:02:42] it's ok, yep that rings a bell, it has to load the database in memory on startup or similar right?
[14:30:11] I think it's more linked to the large number of tokens and API clients it needs to initialize at startup than the database
[14:35:58] oh, interesting, so it starts sessions with the third parties on bootstrap?
[14:36:15] !log toolforge deploy webservice 0.103.3
[14:36:16] dcaro: Unknown project "toolforge"
[14:36:16] dcaro: Did you mean to say "tools.toolforge" instead?
[14:36:25] !log tools deploy webservice 0.103.3
[14:36:27] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[14:36:29] (oops)
[14:40:45] don-vip: change deployed :), next time you restart the webservice it should use 120s of startup probe time
[14:55:13] thank you!
[15:23:09] !log bd808@tools-sgebastion-11 tools.sal Hard restart to pick up new default TCP healthcheck.
[15:23:12] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.sal/SAL
[20:48:35] How do I reset my password on toolforge? It will not let me in, saying my username/password is incorrect
[20:51:35] !help
[20:51:35] If you don't get a response in 15-30 minutes, please email the cloud@ mailing list -- https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_communication
[20:53:07] PotsdamLamb_: https://idm.wikimedia.org/wikimedia/password/
[20:53:14] PotsdamLamb_: which password?
[20:53:36] @RhinosF1 I think it just let me in. BRB
[20:54:56] yeah it let me in. Thank you.
[20:56:33] Cool
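
Note on the startup probe thread above (11:46:32 to 14:40:45): the numbers quoted there can be sanity-checked with a little arithmetic. This is only a sketch. It assumes the usual Kubernetes behaviour, where a startup probe gives a container roughly `initialDelaySeconds + periodSeconds * failureThreshold` seconds before it is restarted. The function names and the `failure_threshold` values of 20 and 120 are illustrative guesses derived from "once per second" for 20s and 120s; they are not read from the tools-webservice source.

```python
def startup_budget_seconds(period_seconds: float,
                           failure_threshold: int,
                           initial_delay_seconds: float = 0.0) -> float:
    """Approximate time a container has to pass its startup probe."""
    return initial_delay_seconds + period_seconds * failure_threshold


def fits_budget(startup_seconds: float, budget_seconds: float) -> bool:
    """True if the application is up before the probe budget runs out."""
    return startup_seconds <= budget_seconds


# Startup time reported in the log at 13:29:44 (application point of view).
OBSERVED_STARTUP_SECONDS = 92.622

# Assumed probe settings: one check per second, 20 attempts before the
# change and 120 attempts after it (hypothetical values, see note above).
old_budget = startup_budget_seconds(period_seconds=1, failure_threshold=20)
new_budget = startup_budget_seconds(period_seconds=1, failure_threshold=120)

print("fits old 20s budget: ", fits_budget(OBSERVED_STARTUP_SECONDS, old_budget))
print("fits new 120s budget:", fits_budget(OBSERVED_STARTUP_SECONDS, new_budget))
```

Under these assumptions the 92.622 s startup does not fit the original 20 s window but does fit 120 s, which matches the crashloopbackoff reported in the morning and the fix deployed with webservice 0.103.3.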