[00:36:09] GitLab (CI & Job Runners), Release-Engineering-Team, Patch-For-Review, User-brennen: GitLab runners: allowed_images patterns need to be loosened to include subdirectories - https://phabricator.wikimedia.org/T310535 (brennen) Cloud runner version of above change: https://gitlab.wikimedia.org/repos...
[10:27:49] joal reported some disk space issue on some gitlab runners
[10:27:58] Hi folks
[10:27:59] :]
[10:28:08] I think jelto will be able to assist with the disk space issue
[10:28:20] Here I join to ask for help
[10:28:28] thanks hashar
[10:28:50] We're facing disk-space issue as hashar mentioned on gitlab pipelines
[10:29:50] joal: and may you please file a task about it?
[10:29:51] :)
[10:29:56] I can do that
[10:31:52] and of course I can't find the monitoring dashboard for the gitlab WMCS instances bah
[10:32:45] GitLab: Experiencing pipeline failure due to disk-space issues - https://phabricator.wikimedia.org/T310593 (JAllemandou)
[10:33:03] here it is --^
[10:33:12] I won't bother more :)
[10:33:39] https://grafana-labs.wikimedia.org/d/000000027/project-health?orgId=1 doesn't load for some reason :D
[10:35:22] GitLab: Experiencing pipeline failure due to disk-space issues - https://phabricator.wikimedia.org/T310593 (JAllemandou)
[10:35:57] tldr there is no more monitoring on WMCS instances
[10:36:08] cause we relied on Diamond which is no more available on bullseye instances
[10:36:16] we have the same issue with the `integration` instances
[10:42:49] :(
[10:48:54] GitLab: Experiencing pipeline failure due to disk-space issues - https://phabricator.wikimedia.org/T310593 (hashar) a:hashar That is more or less a recurring issue as I understand it. The WMCS instances are in the `gitlab-runners` WMCS project, there is no monitoring for them. Diamond has been phased ou...
[10:49:14] joal: so the story is that the gitlab runners accumulate docker volumes
[10:49:44] might be some caching system based on the volume names, then they consumed 25G out of 40G of disk space so I went with a `docker volume prune`
[10:49:51] whole details at https://phabricator.wikimedia.org/T310593#8002019 :]
[10:50:59] Awesome hashar
[10:51:19] given I don't know anything about the runners caching system
[10:51:21] hashar: I assume the pruning could be scheduled at regular intervals
[10:51:23] some other instances will surely fail
[10:51:32] right
[10:51:33] some prune yes
[10:51:38] and a caching system that scales :D
[10:51:52] or at least that is distributed
[10:52:03] but I don't know whether docker volumes can be offloaded to a distributed file system
[10:52:42] hashar: this is a discussion btullis will be interested in :)
[11:00:37] will side track that in -analytics :D
[11:01:13] acl :)
[11:46:13] GitLab: Experiencing pipeline failure due to disk-space issues - https://phabricator.wikimedia.org/T310593 (hashar) Open→Resolved
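
The idea raised above, running `docker volume prune` at regular intervals rather than by hand, could be sketched roughly as follows. This is only an illustration, not what was deployed on the runners: the 60% threshold, the `/var/lib/docker` path, and the function names `should_prune` and `prune_if_needed` are all assumptions made for the example. The only external command used is `docker volume prune --force`, which removes volumes not used by any container without asking for confirmation.

```python
import shutil
import subprocess

# Assumption: prune once the Docker data disk is 60% full. In the log,
# volumes had consumed 25G of a 40G disk (62.5%) before the manual prune.
PRUNE_THRESHOLD = 0.6


def should_prune(used_bytes, total_bytes, threshold=PRUNE_THRESHOLD):
    """Return True when the used fraction of the disk reaches the threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes >= threshold


def prune_if_needed(path="/var/lib/docker", threshold=PRUNE_THRESHOLD,
                    disk_usage=shutil.disk_usage, run=subprocess.run):
    """Prune unused Docker volumes if the disk holding `path` is too full.

    `path` is a hypothetical default; `disk_usage` and `run` are injectable
    so the decision logic can be exercised without touching Docker.
    Returns True if a prune was triggered, False otherwise.
    """
    usage = disk_usage(path)
    if not should_prune(usage.used, usage.total, threshold):
        return False
    # --force skips the interactive confirmation prompt.
    run(["docker", "volume", "prune", "--force"], check=True)
    return True
```

Calling `prune_if_needed()` from a periodic job (a cron entry or a systemd timer, for instance) would give the "regular intervals" behaviour discussed above. It does not address the other point raised in the channel, namely a cache that scales or is distributed across runners.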