[00:10:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-9 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[00:31:41] VPS-project-Codesearch: Codesearches are timing out (2025-03-17) - https://phabricator.wikimedia.org/T389027#10644807 (Dylsss) Open→Invalid Weird, it's working fine for me now too. Not sure why I was having trouble with it yesterday, sorry about that.
[00:35:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-9 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[00:35:33] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-9 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[01:05:18] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-9 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[01:24:49] wikitech.wikimedia.org, MediaWiki-extensions-OATHAuth, TestMe, Wikimedia-production-error: OATHAuth's disableOATHAuthForUser.php script triggers a Notification that can't be sent as MW isn't initialised yet, so causes a production error - https://phabricator.wikimedia.org/T306184#10644880 (matmare...
[01:24:56] wikitech.wikimedia.org, MediaWiki-extensions-OATHAuth, TestMe, Wikimedia-production-error: OATHAuth's disableOATHAuthForUser.php script triggers a Notification that can't be sent as MW isn't initialised yet, so causes a production error - https://phabricator.wikimedia.org/T306184#10644884 (mat...
[01:46:20] Cloud-VPS (Project-requests): Request creation of futureaudiences VPS project - https://phabricator.wikimedia.org/T389158 (derenrich) NEW
[04:46:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-9 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[05:11:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-9 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[05:16:18] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-9 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[05:48:42] cloud-services-team: Analyze Toolforge and Toolsbeta for Virtual Resource Usage - https://phabricator.wikimedia.org/T389081#10645145 (Chuckonwumelu)
[05:51:00] cloud-services-team, Toolforge: Analyze Toolforge and Toolsbeta for Virtual Resource Usage - https://phabricator.wikimedia.org/T389081#10645146 (taavi)
[09:57:36] !log dcaro@acme tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-9 (T383238)
[09:57:40] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[09:57:41] T383238: [nfs] 2025-01-08 tools-nfs outage - https://phabricator.wikimedia.org/T383238
[10:00:05] cloud-services-team, Toolforge: Analyze Toolforge and Toolsbeta for Virtual Resource Usage - https://phabricator.wikimedia.org/T389081#10645696 (dcaro) p:Triage→Medium
[10:00:15] Toolforge (Toolforge iteration 18): [components-api,buildsa-api] When building and deploying, if none of the settings changed, the jobs are not restarted - https://phabricator.wikimedia.org/T389044#10645697 (dcaro) p:Triage→High
[10:00:22] Toolforge (Toolforge iteration 18): [builds-api] Store the commit hash that was used for the build - https://phabricator.wikimedia.org/T389043#10645698 (dcaro) p:Triage→Medium
[10:00:52] Toolforge (Toolforge iteration 18), Patch-For-Review: [jobs-api] refactor models - https://phabricator.wikimedia.org/T389118#10645699 (dcaro) p:Triage→High
[10:01:21] Toolforge (Toolforge iteration 18), Patch-For-Review: [jobs-api] refactor models - https://phabricator.wikimedia.org/T389118#10645700 (dcaro)
[10:03:18] !log dcaro@acme tools END (FAIL) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=99) for tools-k8s-worker-nfs-9 (T383238)
[10:03:22] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:03:23] T383238: [nfs] 2025-01-08 tools-nfs outage - https://phabricator.wikimedia.org/T383238
[10:05:35] cloud-services-team, Toolforge: Analyze Toolforge and Toolsbeta for Virtual Resource Usage - https://phabricator.wikimedia.org/T389081#10645708 (aborrero) Something worth exploring as part of this project: openstack puppet hiera & opentofu integration. Currently, we have a number of puppet settings for...
[10:30:19] !log dcaro@acme tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-9
[10:30:24] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:35:55] !log dcaro@acme tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-9
[10:35:58] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/SAL
[10:55:33] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-9 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[11:04:11] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:05:09] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:05:57] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:12:39] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:17:37] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:25:00] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:26:04] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:38:48] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:43:32] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:47:22] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:49:37] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:51:10] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[11:56:27] Cloud-VPS (Quota-requests), Traffic: Increase quota for Traffic cloud project - https://phabricator.wikimedia.org/T389196 (Fabfur) NEW
[12:30:58] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[12:33:48] Cloud-VPS (Quota-requests), Traffic: Increase quota for Traffic cloud project - https://phabricator.wikimedia.org/T389196#10646328 (aborrero) I see the project has: ` 9 / 12 instances. 12 / 12 VCPUs. 23.0 GB / 24.0 GB RAM. 5 / 80 GB volume space. ` (per https://openstack-browser.toolforge.org/project/t...
[12:34:09] Cloud-VPS (Quota-requests), Traffic: Increase quota for Traffic cloud project - https://phabricator.wikimedia.org/T389196#10646329 (aborrero) p:Triage→Medium
[12:34:23] Cloud-VPS (Quota-requests), Traffic: Increase quota for Traffic cloud project - https://phabricator.wikimedia.org/T389196#10646332 (aborrero) a:Raymond_Ndibe
[13:28:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-54 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[13:37:29] cloud-services-team, Horizon, Patch-For-Review: horizon: some users get 401 unauthorized - https://phabricator.wikimedia.org/T388137#10646528 (Andrew) Open→Resolved
[13:50:31] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[13:51:50] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[13:56:50] cloud-services-team, Cloud-VPS: Upgrade cloud-vps openstack to version 'Dalmation' - https://phabricator.wikimedia.org/T381499#10646730 (Andrew) a:Andrew
[13:57:55] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[14:04:36] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[14:08:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-54 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[14:10:04] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[14:22:22] Cloud Services Proposals, cloud-services-team, Cloud-VPS: Decision Request - How openstack projects relate to tofu-infra - https://phabricator.wikimedia.org/T385604#10646873 (Andrew) Option 2 seems right to me!
[14:23:37] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[14:24:36] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[14:28:27] cloud-services-team (FY2024/2025-Q3-Q4), Cloud-VPS, Patch-For-Review: tofu-infra: refactor repo structure - https://phabricator.wikimedia.org/T375283#10646890 (aborrero) with todays refactored code I was able to get codfw1dev deployment into a tofu NOOP, so that's good.
[14:52:40] Cloud-VPS (Quota-requests), Traffic: Increase quota for Traffic cloud project - https://phabricator.wikimedia.org/T389196#10647003 (Fabfur) @aborrero thanks for this, yeah no need to double volume space, sorry!
[14:59:28] FIRING: InstanceDown: Project tools instance tools-prometheus-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:24:28] RESOLVED: InstanceDown: Project tools instance tools-prometheus-6 is down - https://prometheus-alerts.wmcloud.org/?q=alertname%3DInstanceDown
[15:38:08] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 18): Intermittent redis connection timeouts in Toolforge - https://phabricator.wikimedia.org/T318479#10647204 (dcaro) @fnegri hi! What is this currently blocked on?
[15:40:35] cloud-services-team, Toolforge (Toolforge iteration 18), Epic, Patch-For-Review: [o11y,logging,infra] loki into lima-kilo - https://phabricator.wikimedia.org/T386480#10647216 (dcaro) a:rook→None
[15:42:28] Toolforge (Toolforge iteration 18), Documentation, Patch-For-Review: [harbor,docs] Improve Harbor quota handling and docs - https://phabricator.wikimedia.org/T351092#10647234 (dcaro) In progress→Resolved
[15:46:18] (update) dcaro: [jobs-api] create seperate api.py and move flask things there [repos/cloud/toolforge/jobs-api] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/91 (https://phabricator.wikimedia.org/T359804) (owner: raymond-ndibe)
[15:47:43] cloud-services-team, Toolforge, Epic, Patch-For-Review: [o11y,logging,infra] loki into lima-kilo - https://phabricator.wikimedia.org/T386480#10647267 (dcaro)
[15:49:28] cloud-services-team, Toolforge: [jobs-api] rename variable/parameter type to job_type - https://phabricator.wikimedia.org/T387727#10647273 (dcaro)
[15:51:12] cloud-services-team, Toolforge (Toolforge iteration 18), Sustainability (Incident Followup): [docs,envvars-api,jobs-api,builds-api] create docs on how to operate the cluster and core components - https://phabricator.wikimedia.org/T380959#10647294 (dcaro)
[15:53:13] cloud-services-team, Toolforge: [builds-cli,builds-api] `build quota` fails if tool has no builds - https://phabricator.wikimedia.org/T353701#10647316 (dcaro)
[15:56:14] Toolforge (Toolforge iteration 19), Patch-For-Review: [jobs-api] refactor models - https://phabricator.wikimedia.org/T389118#10647336 (dcaro)
[15:56:17] Toolforge (Toolforge iteration 19): [components-api,buildsa-api] When building and deploying, if none of the settings changed, the jobs are not restarted - https://phabricator.wikimedia.org/T389044#10647337 (dcaro)
[15:56:23] Toolforge (Toolforge iteration 19): [builds-api] Store the commit hash that was used for the build - https://phabricator.wikimedia.org/T389043#10647338 (dcaro)
[15:56:27] Toolforge (Toolforge iteration 19): [components-api] use the component name for the image instead of the default tool - https://phabricator.wikimedia.org/T388830#10647339 (dcaro)
[15:56:30] Toolforge (Toolforge iteration 19): [toolforge,jobs] "toolforge jobs logs" fails when job has not started yet - https://phabricator.wikimedia.org/T349775#10647340 (dcaro)
[15:56:32] Toolforge (Toolforge iteration 19): [builds-api] Limit the amount of running builds - https://phabricator.wikimedia.org/T388706#10647341 (dcaro)
[15:56:35] Toolforge (Toolforge iteration 19): [components-api] allow stopping a deployment that's running - https://phabricator.wikimedia.org/T388644#10647342 (dcaro)
[15:56:36] Toolforge (Toolforge iteration 19): [components-api] restrict running deplpoyments to 1 - https://phabricator.wikimedia.org/T388643#10647343 (dcaro)
[15:56:39] Toolforge (Toolforge iteration 19): [components-api] Rename the CRDs groups to be `components-api.toolforge.org` - https://phabricator.wikimedia.org/T386829#10647344 (dcaro)
[15:56:40] Toolforge (Toolforge iteration 19): [usage] Try to get an idea of the amount of tools that were created, but never started anything - https://phabricator.wikimedia.org/T379144#10647346 (dcaro)
[15:56:42] cloud-services-team, Toolforge (Toolforge iteration 19): Upgrade python buildpack to v0.17.0 or newer for Poetry support - https://phabricator.wikimedia.org/T374056#10647347 (dcaro)
[15:56:50] Toolforge (Toolforge iteration 19): [jobs-api] prepend date and pod name to filelog lines - https://phabricator.wikimedia.org/T372025#10647348 (dcaro)
[15:56:54] cloud-services-team, Toolforge (Toolforge iteration 19), Sustainability (Incident Followup): [docs,envvars-api,jobs-api,builds-api] create docs on how to operate the cluster and core components - https://phabricator.wikimedia.org/T380959#10647345 (dcaro)
[15:56:58] Toolforge (Toolforge iteration 19): [components-api] Add support for port/helathcheck for continuous jobs in tool config/depolyment - https://phabricator.wikimedia.org/T362072#10647350 (dcaro)
[15:57:02] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 19), Epic: [KR] WE6.3 Introduce a sustainability scoring system for the Toolforge platform - https://phabricator.wikimedia.org/T368600#10647349 (dcaro)
[15:57:15] cloud-services-team (FY2024/2025-Q3-Q4), Cloud-VPS (Debian Buster Deprecation), Toolforge (Toolforge iteration 19), Epic, Goal: Toolforge: migrate to Debian Bullseye or later - https://phabricator.wikimedia.org/T311897#10647351 (dcaro)
[15:57:19] cloud-services-team, Toolforge (Toolforge iteration 19), Epic: [jobs-api,webservice] Run webservices via the jobs framework - https://phabricator.wikimedia.org/T348755#10647356 (dcaro)
[15:57:23] Toolforge (Toolforge iteration 19), Patch-For-Review: [jobs-api] "toolforge jobs logs -f" should get the logs of all containers in all target pods - https://phabricator.wikimedia.org/T388274#10647354 (dcaro)
[15:57:27] cloud-services-team, Toolforge (Toolforge iteration 19), Patch-For-Review: [builds-builder] Add support for Heroku's "24" builder stack based on Ubuntu 2024.04 noble - https://phabricator.wikimedia.org/T380127#10647358 (dcaro)
[15:57:31] Toolforge (Toolforge iteration 19), Patch-For-Review: [jobs-api,jobs-cli] restarting a continuous jobs causes for some seconds two jobs are running side by side - https://phabricator.wikimedia.org/T375366#10647360 (dcaro)
[15:57:35] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 19), Epic: [components-api] First iteration of the component API - https://phabricator.wikimedia.org/T362051#10647364 (dcaro)
[15:57:39] Toolforge (Toolforge iteration 19), Patch-For-Review: [jobs-api] Split the API, business, and k8s models - https://phabricator.wikimedia.org/T359808#10647362 (dcaro)
[15:57:43] Toolforge (Toolforge iteration 19), Patch-For-Review: Persist important toolforge k8s components logs - https://phabricator.wikimedia.org/T383081#10647368 (dcaro)
[15:57:47] cloud-services-team, Toolforge (Toolforge iteration 19), Patch-For-Review: Toolforge: Replace all bastion with grid-less bookworm based bastion hosts - https://phabricator.wikimedia.org/T314665#10647366 (dcaro)
[15:57:51] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 19), Patch-For-Review: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.29 - https://phabricator.wikimedia.org/T362868#10647370 (dcaro)
[15:57:55] cloud-services-team, Toolforge (Toolforge iteration 19), Goal: [harbor] Move harbor data to object storage service - https://phabricator.wikimedia.org/T350687#10647373 (dcaro)
[15:57:59] Toolforge (Toolforge iteration 19), Upstream: [builds-builder,jobs-api,upstream] Calling nontrivial Procfile commands with arguments results in confusing error (“no such file or directory”) - https://phabricator.wikimedia.org/T356016#10647379 (dcaro)
[15:58:07] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 19): Intermittent redis connection timeouts in Toolforge - https://phabricator.wikimedia.org/T318479#10647375 (dcaro)
[15:58:11] Toolforge (Toolforge iteration 19), Upstream: [builds-builder] golang based images get infinite nested loops for procfile entries - https://phabricator.wikimedia.org/T363417#10647381 (dcaro)
[15:58:15] cloud-services-team (FY2024/2025-Q3-Q4), Toolforge (Toolforge iteration 19), Goal, Patch-For-Review: [infra] Decommission the Grid Engine infrastructure - https://phabricator.wikimedia.org/T314664#10647377 (dcaro)
[15:58:19] Toolforge (Toolforge iteration 19): [toolforge] simplify calling the different toolforge apis from within the containers - https://phabricator.wikimedia.org/T356377#10647383 (dcaro)
[15:58:27] Toolforge (Toolforge iteration 19), Patch-For-Review: [jobs-api] Save business models in a DB - https://phabricator.wikimedia.org/T359650#10647387 (dcaro)
[15:58:31] cloud-services-team, Toolforge (Toolforge iteration 19), Patch-For-Review: [harbor, builds-builder] Audit robot account permissions - https://phabricator.wikimedia.org/T361708#10647385 (dcaro)
[16:03:48] FIRING: PuppetFailure: Puppet has failed on cloudcontrol2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[16:12:28] FIRING: PuppetAgentFailure: Puppet agent failure detected on instance tools-prometheus-6 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[16:20:18] RESOLVED: PuppetFailure: Puppet has failed on cloudcontrol2004-dev:9100 - https://puppetboard.wikimedia.org/nodes?status=failed - https://grafana.wikimedia.org/d/yOxVDGvWk/puppet - https://alerts.wikimedia.org/?q=alertname%3DPuppetFailure
[16:42:28] RESOLVED: PuppetAgentFailure: Puppet agent failure detected on instance tools-prometheus-6 in project tools - https://prometheus-alerts.wmcloud.org/?q=alertname%3DPuppetAgentFailure
[16:48:55] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[16:58:26] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[16:59:05] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[17:09:30] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[17:12:28] Cloud-VPS (Project-requests): Request creation of futureaudiences VPS project - https://phabricator.wikimedia.org/T389158#10647814 (dcaro) Hi! CloudVPS project should be project-based not team-based, maybe the name should be something like `fagameserver`? A couple of side notes (just fyi., you still need...
[17:14:02] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[17:17:51] (update) aborrero: Draft: test new project module [repos/cloud/cloud-vps/tofu-infra] - https://gitlab.wikimedia.org/repos/cloud/cloud-vps/tofu-infra/-/merge_requests/93 (https://phabricator.wikimedia.org/T375283) (owner: fnegri)
[17:55:38] (approved) dcaro: [jobs-cli] use imagename in definedjob [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/90 (https://phabricator.wikimedia.org/T389118) (owner: raymond-ndibe)
[17:55:40] (update) dcaro: [jobs-cli] use imagename in definedjob [repos/cloud/toolforge/jobs-cli] - https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/90 (https://phabricator.wikimedia.org/T389118) (owner: raymond-ndibe)
[18:04:47] (approved) dcaro: [builds-builder] drop certain permissions for local-image-builder [repos/cloud/toolforge/builds-builder] - https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/merge_requests/68 (https://phabricator.wikimedia.org/T361708) (owner: raymond-ndibe)
[18:06:33] Toolforge (Software install/update): Install headless browser dependencies on node16 image - https://phabricator.wikimedia.org/T316286#10648243 (dcaro) As a note for future reference, you can install chromium headless using the build service, example: https://gitlab.wikimedia.org/dcaro/toolforge-chromium...
[18:07:36] Cloud-VPS (Project-requests): Request creation of futureaudiences VPS project - https://phabricator.wikimedia.org/T389158#10648244 (dcaro) a:dcaro
[18:07:44] Cloud-VPS (Project-requests): Request creation of futureaudiences VPS project - https://phabricator.wikimedia.org/T389158#10648246 (dcaro) Open→In progress
[18:12:21] Cloud-VPS (Project-requests): Request creation of futureaudiences VPS project - https://phabricator.wikimedia.org/T389158#10648255 (derenrich) Hey thanks for the response. I was thinking we could keep the name generic in case this server becomes useful for multiple projects. Yeah we did eventually get ffmp...
[18:22:03] cloud-services-team, Cloud-VPS: Members of https://ldap.toolforge.org/group/project-recommendation-api not added to project-bastion - https://phabricator.wikimedia.org/T387226#10648328 (Andrew) Duplicate→Open I'm not sure this is a duplicate of T379550. @XiaoXiao-WMF, you're talking about a mem...
[18:25:24] cloud-services-team, Cloud-VPS: Members of https://ldap.toolforge.org/group/project-recommendation-api not added to project-bastion - https://phabricator.wikimedia.org/T387226#10648352 (Andrew) Open→Resolved Ah! I was confused by the title linking to ldap. We're talking about the the cloud-vp...
[20:03:08] (PS5) Amire80: WIP Merge lego footer messages into one [labs/tools/Isa] - https://gerrit.wikimedia.org/r/1003574 (https://phabricator.wikimedia.org/T355011)
[20:10:45] cloud-services-team, Cloud-VPS: Upgrade cloud-vps openstack to version 'Dalmation' - https://phabricator.wikimedia.org/T381499#10648661 (Andrew)
[22:57:52] Cloud-VPS (Project-requests): Request creation of futureaudiences VPS project - https://phabricator.wikimedia.org/T389158#10649435 (bd808) >>! In T389158#10648255, @derenrich wrote: > I was thinking we could keep the name generic in case this server becomes useful for multiple projects. > ... > I anticipate...