[00:36:09] <wikibugs>	 GitLab (CI & Job Runners), Release-Engineering-Team, Patch-For-Review, User-brennen: GitLab runners: allowed_images patterns need to be loosened to include subdirectories - https://phabricator.wikimedia.org/T310535 (brennen) Cloud runner version of above change: https://gitlab.wikimedia.org/repos...
[10:27:49] <hashar>	 joal reported some disk space issue on some gitlab runners
[10:27:58] <joal>	 Hi folks
[10:27:59] <hashar>	 :]
[10:28:08] <hashar>	 I think jelto will be able to assist with the disk space issue
[10:28:20] <joal>	 Here I join to ask for help
[10:28:28] <joal>	 thanks hashar 
[10:28:50] <joal>	 We're facing disk-space issues on gitlab pipelines, as hashar mentioned
[10:29:50] <hashar>	 joal: and could you please file a task about it? 
[10:29:51] <hashar>	 :)
[10:29:56] <joal>	 I can do that
[10:31:52] <hashar>	 and of course I can't find the monitoring dashboard for the gitlab WMCS instances bah
[10:32:45] <wikibugs>	 GitLab: Experiencing pipeline failure due to disk-space issues - https://phabricator.wikimedia.org/T310593 (JAllemandou)
[10:33:03] <joal>	 here it is --^
[10:33:12] <joal>	 I won't bother more :)
[10:33:39] <hashar>	 https://grafana-labs.wikimedia.org/d/000000027/project-health?orgId=1 doesn't load for some reason :D
[10:35:22] <wikibugs>	 GitLab: Experiencing pipeline failure due to disk-space issues - https://phabricator.wikimedia.org/T310593 (JAllemandou)
[10:35:57] <hashar>	 tldr there is no more monitoring on WMCS instances
[10:36:08] <hashar>	 because we relied on Diamond, which is no longer available on bullseye instances
[10:36:16] <hashar>	 we have the same issue with the `integration` instances
[10:42:49] <joal>	 :(
[10:48:54] <wikibugs>	 GitLab: Experiencing pipeline failure due to disk-space issues - https://phabricator.wikimedia.org/T310593 (hashar) a:hashar That is more or less a recurring issue as I understand it.  The WMCS instances are in the `gitlab-runners` WMCS project, there is no monitoring for them. Diamond has been phased ou...
[10:49:14] <hashar>	 joal: so the story is that the gitlab runners accumulate docker volumes
[10:49:44] <hashar>	 might be some caching system based on the volume names; they had consumed 25G out of 40G of disk space, so I went with a `docker volume prune`
[10:49:51] <hashar>	 full details at https://phabricator.wikimedia.org/T310593#8002019 :]
[10:50:59] <joal>	 Awesome hashar 
[10:51:19] <hashar>	 given I don't know anything about the runners caching system
[10:51:21] <joal>	 hashar: I assume the pruning could be scheduled at regular intervals
[10:51:23] <hashar>	 some other instances will surely fail 
[10:51:32] <joal>	 right
[10:51:33] <hashar>	 some prune yes
[10:51:38] <hashar>	 and a caching system that scales :D
[10:51:52] <hashar>	 or at least that is distributed
[10:52:03] <hashar>	 but I don't know whether docker volumes can be offloaded to a distributed file system
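
The scheduled pruning joal suggests could be sketched roughly as below. This is a minimal illustration, not what the runners actually do: the 60% threshold, the `/var/lib/docker` path, and the function names are all assumptions, and the script would still need a cron job or systemd timer to run it at regular intervals.

```python
import shutil
import subprocess

# Hypothetical threshold: prune once more than 60% of the disk is in use.
PRUNE_THRESHOLD = 0.60


def usage_ratio(used_bytes: int, total_bytes: int) -> float:
    """Return the fraction of the disk that is in use."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def should_prune(used_bytes: int, total_bytes: int,
                 threshold: float = PRUNE_THRESHOLD) -> bool:
    """Decide whether disk usage is high enough to justify a prune."""
    return usage_ratio(used_bytes, total_bytes) > threshold


def prune_unused_volumes(path: str = "/var/lib/docker") -> bool:
    """Run `docker volume prune` if the filesystem holding `path` is too full.

    The path is an assumption (Docker's usual default data directory).
    Returns True if a prune was run, False otherwise.
    """
    usage = shutil.disk_usage(path)
    if not should_prune(usage.used, usage.total):
        return False
    # --force skips the interactive confirmation prompt.
    subprocess.run(["docker", "volume", "prune", "--force"], check=True)
    return True
```

With the figures from the log, `should_prune(25 * 1024**3, 40 * 1024**3)` is true under the assumed threshold. Note that pruning drops whatever cache those volumes held, so later jobs may run slower until it is rebuilt.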
[10:52:42] <joal>	 hashar: this is a discussion btullis will be interested in :)
[11:00:37] <hashar>	 will side track that in -analytics :D
[11:01:13] <joal>	 ack :)
[11:46:13] <wikibugs>	 GitLab: Experiencing pipeline failure due to disk-space issues - https://phabricator.wikimedia.org/T310593 (hashar) Open→Resolved