[10:06:46] If I click the filename after completing the UploadWizard I get to this : https://tools-static.wmflabs.org/bridgebot/97ee3ef8/file_74740.jpg
[10:11:11] !log tools deleting old nginx front proxy instances T283948
[10:11:15] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:11:15] T283948: Merge Toolforge Nginx front proxy into the existing K8s HAProxy setup - https://phabricator.wikimedia.org/T283948
[11:47:32] Does the file show in https://commons.wikimedia.org/wiki/Special:UploadStash ? (re @IVeertje: If I click the filename after completing the UploadWizard I get to this)
[12:00:56] Seems there was a spike of UploadStashFileNotFoundException errors between 9:45 UTC and 10:10 UTC. that seems to match with your time of posting this. I'm assuming it got caught up in that. (re @djhartman: Does the file show in https://commons.wikimedia.org/wiki/Special:UploadStash now ? (then it was just delayed).)
[12:05:37] ah.. Cookbook sre.hosts.reimage for host ms-be1089.eqiad.wmnet with OS bullseye
[12:05:39] I think that's the cause..
[12:27:06] !log wikidata-dev wikidata-icinga-2024: deleted instance, setup was apparently not working after all (cf. T397915) and in any case it does not seem needed nor useful anymore
[12:27:09] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikidata-dev/SAL
[12:27:10] T397915: wikidata-icinga trying and failing to send mail to root@localhost - https://phabricator.wikimedia.org/T397915
[13:22:52] !log wikidata-dev special-new-lexeme-testing: deleted instance (and proxies berlin-wikidata-uxtest + marburg-wikidata-uxtest), no longer needed
[13:22:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Wikidata-dev/SAL
[13:49:30] seems like builds-api is down, `toolforge build list` is throwing read timeouts and components-api is failing deployments
[13:52:31] Damianz: builds-api looks up, but I can reproduce the read timeouts with "build list"
[13:53:22] just tried now and still throwing read timeout
[13:53:48] https://pastebin.com/yZtU4wrz
[13:57:42] I'm looking at the logs
[13:59:39] Checked another tool and the same, `curl --cert .toolskube/client.crt --key .toolskube/client.key -k https://api.svc.tools.eqiad1.wikimedia.cloud:30003/builds/v1/tool/cluebotng/builds` is just seemingly hanging
[13:59:50] I see very long latency values in the logs for builds-api
[13:59:55] I'll try restarting builds-api pods
[14:00:05] A direct curl got 504 gateway timeout from nginx fwiw
[14:00:45] restarting didn't help
[14:01:07] I deployed builds-api not long ago (the change adding the git sha)
[14:01:52] are functional tests passing?
[14:02:52] yep, they did pass
[14:03:17] I see this in the api-gateway
[14:03:19] api-gateway api-gateway-nginx-7bc7597f56-8fljb nginx 192.168.166.0 - - [23/Oct/2025:14:02:01 +0000] "GET /builds/v1/tool/wm-lol/builds HTTP/1.1" 499 0 "-" "tool-wm-lol@tools-bastion-15:builds-cli toolforge_weld/1.6.11 python-requests/2.32.3"
[14:03:48] let me revert the deploy
[14:03:51] (just in case)
[14:04:13] these are the tests of the deployment I did https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/1009#note_171073
[14:04:23] "list build" functional test is failing now
[14:05:38] "toolforge build list" works now, so the revert was effective
[14:05:50] but it's strange that the functional tests did not fail immediately after deploy
[14:05:51] let's move to -admin
[14:05:56] dcaro: +1
[14:09:09] that change should just add 1 extra lookup to k8s, perhaps that is taking an excessive amount of time to return... can look at it but kinda hard to check when it worked locally and I don't have other access for a more loaded check
[14:09:24] builds are going now, so I'll finish off these deployments then can have lunch finally
[15:02:07] Thanks very much. I'll fine-tune with them and will send an email + phabricator. They really just wanted to have a talk with someone, so they can be sure how to go about it and what's even possible from WMF side. But we'll try via email / phab and see how it goes. Thank you! (re @wmtelegram_bot: @Shani: folks can send me emails if they'd like (bd808@wikimedia.org), but really the links Taavi gave are how WMCS does...)