[13:40:13] !log upgrading cloudcephosd1017 to bookworm/reef
[13:40:13] andrewbogott: Unknown project "upgrading"
[13:40:18] !log admin upgrading cloudcephosd1017 to bookworm/reef
[13:40:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[14:41:11] Hi, was there an outage for web services on the 12th between ~20:00 and ~21:48, or from kubernetes talking to web services? I had 5 different tools with different apps/workloads all fails between those times, unfortunately we can't access the logs since it's more than 1 hour ago so trying to rule out if the monitoring needs better monitoring.
[14:41:11] (Slightly delayed looking at this as I was busy crossing half an ocean last week)
[14:41:38] (I don't see anything mentioned in the IRC logs)
[15:26:20] !log tools shutdown tools-sgebastion-10 T314665
[15:26:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:26:24] T314665: Toolforge: Replace all bastion with grid-less bookworm based bastion hosts - https://phabricator.wikimedia.org/T314665
[15:56:32] !log toolsbeta delete toolsbeta-sgebastion puppet prefix
[15:56:34] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Toolsbeta/SAL
[15:57:02] !log tools delete tools-sgebastion puppet prefix T314665
[15:57:06] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[15:57:06] T314665: Toolforge: Replace all bastion with grid-less bookworm based bastion hosts - https://phabricator.wikimedia.org/T314665
[16:40:25] Hi, I am getting this error on doing `webservice python3.13 shell` in the firstedit tool: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "-": executable file not found in $PATH: unknown
[16:42:06] my newbie friend set up the tool, he messed up something somewhere ... but I'm not sure what (he said he "deleted some root files")
[16:47:30] DamianZ is that UTC times?
[16:47:32] sd0001: there's an `extra_args: --mount=all` in service.template that's breaking things
[16:48:07] that key normally is an array, so this is how `webservice` is interpreting it: https://phabricator.wikimedia.org/P83377
[16:48:25] but extra_args is not what you want, you want `mount: all` in service.template directly
[16:51:34] sd0001_: looks like your connection flapped, so in case you missed it you have an answer in the channel logs (https://wm-bot.wmcloud.org/logs/%23wikimedia-cloud/20250916.txt)
[16:52:02] thanks
[16:52:03] taavi: i saw the malformed mount=all in service.manifest and deleted it about 10 mins ago, but that doesn't seem to have fixed the issue
[16:52:33] try `webservice stop` to clear the running tool manifest data as well?
[16:54:35] ok tried that too, but still the same error
[16:55:35] I still see `extra_args: --mount=all` in the template file
[16:57:12] really? this is what I see: https://phabricator.wikimedia.org/P83379
[16:57:40] note the service.manifest and service.template are two different files
[16:58:32] oh ok sorry, didn't realize there are two service.* files in play here
[16:58:37] the manifest file is maintained automatically (and is fine here), the template file is something you can configure following https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#Webservice_templates and has the problematic stuff
[17:00:26] that fixed the issue. Thanks much!
[20:01:14] dcaro CEST, -2 for UTC. https://cluebotng-monitoring.toolforge.org/d/0403b558-1fa5-4c14-b495-fd8f31fca77f/web-services?orgId=1&from=2025-09-12T17:30:00.000Z&to=2025-09-12T20:30:00.000Z&timezone=utc&var-domain=cluebotng-editsets.toolforge.org&var-domain=cluebotng-review.toolforge.org&var-domain=cluebotng-staging.toolforge.org&var-domain=cluebotng-tr
[20:01:15] ainer.toolforge.org&var-domain=cluebotng.toolforge.org for reference of exact times
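
Note on the 16:47-17:00 exchange: the error `exec: "-": executable file not found in $PATH` is consistent with a string being used where a list of arguments is expected, as the 16:48:07 message suggests ("that key normally is an array"). The sketch below is only an illustration of that general failure mode in Python. It is not the actual `webservice` code, and `build_command` and `normalize_extra_args` are hypothetical helper names made up for this example.

```python
def build_command(base, extra_args):
    """Hypothetical helper: append extra_args to a command list."""
    command = list(base)
    # list.extend iterates its argument. A str is iterated one
    # character at a time, so "--mount=all" contributes "-", "-", "m", ...
    command.extend(extra_args)
    return command


def normalize_extra_args(value):
    """Hypothetical guard: accept a single string or a list of strings."""
    if isinstance(value, str):
        return [value]
    return list(value)


# A scalar string where a list was expected: the first element is "-".
broken = build_command([], "--mount=all")

# The same value wrapped into a list first: the argument stays whole.
fixed = build_command([], normalize_extra_args("--mount=all"))

print(broken[0])
print(fixed[0])
```

Under that assumption, the first element of the resulting command would be the single character `-`, which matches the executable name in the reported error. The fix given in the channel is unaffected by this: drop `extra_args` from service.template and set `mount: all` there directly.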