[00:49:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-45 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[00:59:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-45 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[09:28:26] (approved) fnegri: maintain-harbor: increase memory quota for mbh [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/728 (https://phabricator.wikimedia.org/T389733) (owner: dcaro)
[09:30:30] !log fnegri@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component maintain-harbor (T389733)
[09:30:34] T389733: Increase RAM quota for mbh tool - https://phabricator.wikimedia.org/T389733
[09:42:41] !log fnegri@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-harbor (T389733)
[09:42:45] T389733: Increase RAM quota for mbh tool - https://phabricator.wikimedia.org/T389733
[09:59:31] (update) fnegri: maintain-kubeusers: increase memory quota for mbh [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/728 (https://phabricator.wikimedia.org/T389733) (owner: dcaro)
[09:59:39] (update) fnegri: maintain-kubeusers: increase memory quota for mbh [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/728 (https://phabricator.wikimedia.org/T389733) (owner: dcaro)
[10:00:00] !log fnegri@cloudcumin1001 tools START - Cookbook wmcs.toolforge.component.deploy for component maintain-kubeusers (T389733)
[10:00:04] T389733: Increase RAM quota for mbh tool - https://phabricator.wikimedia.org/T389733
[10:07:27] (merge) fnegri: maintain-kubeusers: increase memory quota for mbh [repos/cloud/toolforge/toolforge-deploy] - https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/728 (https://phabricator.wikimedia.org/T389733) (owner: dcaro)
[10:08:30] Toolforge (Quota-requests), Patch-For-Review: Increase RAM quota for mbh tool - https://phabricator.wikimedia.org/T389733#10685866 (fnegri) Open→Resolved a:fnegri Quota updated: ` tools.mbh@tools-bastion-13:~$ toolforge jobs quota Running jobs Used Lim...
[10:10:57] !log fnegri@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component maintain-kubeusers (T389733)
[10:11:01] T389733: Increase RAM quota for mbh tool - https://phabricator.wikimedia.org/T389733
[11:07:21] Toolforge-standards-committee: Adoption request for "request" tool - https://phabricator.wikimedia.org/T389540#10685966 (Tkarcher) Any update? Or is there anything I can do to expedite the process?
[11:16:52] (open) arthurtaylor: Merge cache data from independent test runs [toolforge-repos/phpunit-results-cache] - https://gitlab.wikimedia.org/toolforge-repos/phpunit-results-cache/-/merge_requests/6 (https://phabricator.wikimedia.org/T388767)
[11:38:28] (open) arthurtaylor: Lint the repository, and add CI job configration to validate PRs [toolforge-repos/phpunit-results-cache] (feat/merge-cache-data-20250328) - https://gitlab.wikimedia.org/toolforge-repos/phpunit-results-cache/-/merge_requests/7 (https://phabricator.wikimedia.org/T388767)
[11:40:56] (update) arthurtaylor: Lint the repository, and add CI job configration to validate PRs [toolforge-repos/phpunit-results-cache] (feat/merge-cache-data-20250328) - https://gitlab.wikimedia.org/toolforge-repos/phpunit-results-cache/-/merge_requests/7 (https://phabricator.wikimedia.org/T388767)
[12:57:07] Toolforge (Quota-requests): Increase RAM quota for mbh tool - https://phabricator.wikimedia.org/T389733#10686326 (MBH) @dcaro is it possible to run one-off job with a high limit? I tried to add --mem:12Gi into `toolforge jobs run` request, but in doesn't work.
[13:36:50] FIRING: ProbeDown: Service tools-static-15:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-15:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[13:46:17] (approved) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/12 (owner: l10n-bot)
[13:46:20] (merge) lucaswerkmeister: Localisation updates from https://translatewiki.net. [toolforge-repos/ranker] - https://gitlab.wikimedia.org/toolforge-repos/ranker/-/merge_requests/12 (owner: l10n-bot)
[14:01:50] RESOLVED: ProbeDown: Service tools-static-15:80 has failed probes (http_tools_static_wmflabs_org_ip4) - https://wikitech.wikimedia.org/wiki/Runbook#tools-static-15:80 - https://grafana.wikimedia.org/d/O0nHhdhnz/network-probes-overview?var-job=probes/custom&var-module=All - https://prometheus-alerts.wmcloud.org/?q=alertname%3DProbeDown
[15:13:03] FIRING: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-21 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[16:40:30] !log fnegri@cloudcumin1001 tools START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-21
[16:45:50] !log fnegri@cloudcumin1001 tools END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-21
[17:08:03] RESOLVED: ToolforgeKubernetesWorkerTooManyDProcesses: Node tools-k8s-worker-nfs-21 has at least 12 procs in D state, and may be having NFS/IO issues - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolforgeKubernetesWorkerTooManyDProcesses - https://grafana.wmcloud.org/d/3jhWxB8Vk/toolforge-general-overview - https://prometheus-alerts.wmcloud.org/?q=alertname%3DToolforgeKubernetesWorkerTooManyDProcesses
[17:21:37] Wikibugs: Filter out changes that only set/removed the reviewer - https://phabricator.wikimedia.org/T390313 (bd808) NEW
[17:41:06] Toolforge-standards-committee: Adoption request for "request" tool - https://phabricator.wikimedia.org/T389540#10687163 (bd808) >>! In T389540#10685966, @Tkarcher wrote: > Any update? Or is there anything I can do to expedite the process?
The software is unlicensed. Historically that has been the stopping p...
[18:30:52] wikitech.wikimedia.org, serviceops-radar, SRE, Patch-For-Review, SRE-Unowned: Redesign wikitech-static - https://phabricator.wikimedia.org/T376400#10687257 (Andrew) I've worked on this a bit more; here's what we have today: **The good** * http://ec2-52-23-161-9.compute-1.amazonaws.com * Tha...
[19:31:54] cloud-services-team, Toolforge (Toolforge iteration 18), Security: toolforge: we may be allowing arbitrary host path mounts in containers - https://phabricator.wikimedia.org/T386921#10687505 (Andrew) *bump*
[19:58:46] Toolforge-standards-committee: Adoption request for "request" tool - https://phabricator.wikimedia.org/T389540#10687564 (bd808) I have sent an email to the #toolforge-standards-committee internal mailing list calling for discussion of this task and the more general case of adoption of any tool found to conta...
[20:00:03] Toolforge-standards-committee: Adoption request for "request" tool - https://phabricator.wikimedia.org/T389540#10687566 (bd808) Open→Stalled p:Triage→Medium Marking as stalled pending further discussion of licensing concerns by the TFSC.
[20:02:33] Toolforge-standards-committee: Adoption request for "request" tool - https://phabricator.wikimedia.org/T389540#10687573 (bd808)