[13:32:51] GitLab (CI & Job Runners), Release-Engineering-Team, serviceops-collab: Migrate GitLab Shared Runners from profile::gitlab::runner to role::gitlab_runner - https://phabricator.wikimedia.org/T322409 (Jelto)
[13:43:25] GitLab (CI & Job Runners), Release-Engineering-Team, serviceops-collab: Self-build and publish buildkit helper images - https://phabricator.wikimedia.org/T321316 (Jelto) In the last IC sync meeting we discussed that it makes more sense to add the self-hosted `dockerfile/copy` image to [production-ima...
[15:06:53] runner-1021.gitlab-runners.eqiad1.wikimedia.cloud appears to be out of space? Seen on https://gitlab.wikimedia.org/repos/releng/cli/-/jobs/28118
[15:07:34] And runner-1029.gitlab-runners.eqiad1.wikimedia.cloud, see on https://gitlab.wikimedia.org/repos/releng/cli/-/jobs/28112
[15:17:40] looking
[15:18:58] jelto, FYI I just ran "docker system prune" on runner-1021. It freed about 3GB but we'll need to figure out how to keep the runner-* volumes in check.
[15:19:16] Also the buildkitd container has a large cache directory inside of it too.
[15:20:47] I run /usr/share/gitlab-runner/clear-docker-cache on all shared Runners. It freed about 15GB on every runner. They should be usable again.
[15:20:49] It seems we have to lower profile::gitlab::runner::clear_interval to something below 24h (see https://gerrit.wikimedia.org/r/c/operations/puppet/+/807103)
[15:21:05] It also depends on what people are doing in their jobs.
[15:21:31] I can prepare a change for that
[15:21:56] It would be nice if we could clear caches less aggressively (not an all or nothing situation)
[15:22:12] LRU ejection
[15:22:12] which cache is that?
[15:22:50] from the custom runners I had for mwcli jobs (I started using the shared runners in the past weeks for lots of things) the main disk space issues came from all of the docker images used to run the jobs
[15:23:21] dancy: I guess that needs bigger disks and more quota for storage in the WMCS project. We are at 600gb of 600gb total quota
[15:23:47] hrm.
[15:24:14] I guess we'll have to live w/ that for the time being.
[15:35:26] I opened https://gerrit.wikimedia.org/r/q/853312 as a intermediate fix. Long term reducing the caching time will not scale :)
[15:35:57] dancy: do you think a dedicated task helps here? I'm not sure about the future of Shared Runners in WMCS
[15:36:27] Yeah, I think it's worthwhile to have a task for this topic.
[15:37:18] ok, I can open that in a sec if that works for you :)
[15:37:32] Works for me.
[15:41:24] GitLab (Administration, Settings & Policy), serviceops-collab: Configure a default cleanup policy for GitLab package registry - https://phabricator.wikimedia.org/T315877 (LSobanski) p:Triage→Low
[15:42:24] GitLab (Administration, Settings & Policy), serviceops-collab: Configure a default cleanup policy for GitLab package registry - https://phabricator.wikimedia.org/T315877 (LSobanski) p:Low→Medium
[15:54:39] GitLab, Data-Engineering, Release-Engineering-Team, serviceops-collab: Experiencing pipeline failure due to disk-space issues - https://phabricator.wikimedia.org/T310593 (Jelto) Resolved→Open CI builds fail again with `No space left on device`. See: https://gitlab.wikimedia.org/repos/rele...
[16:17:10] GitLab (Integrations), Release-Engineering-Team, Wikimedia-GitHub: Mirror repositories hosted on our GitLab to GitHub - https://phabricator.wikimedia.org/T321597 (hashar) > How to stop Gerrit from doing it: does archiving a repo in Gerrit mean it's still mirrored from Gerrit to GitHub if the GitHub r...
[19:38:41] GitLab, Data-Engineering, Release-Engineering-Team, serviceops-collab: Experiencing pipeline failure due to disk-space issues - https://phabricator.wikimedia.org/T310593 (Dzahn) on runner-1021: ` dzahn@runner-1021:~$ systemctl status clear-docker-cache.timer ● clear-docker-cache.timer - Periodi...
[23:06:19] GitLab (CI & Job Runners), Release-Engineering-Team, serviceops-collab: Self-build and publish buildkit helper images - https://phabricator.wikimedia.org/T321316 (dduvall) This may be blocked by {T321316} which is preventing the publishing of multi-platform images. The `dockerfile-copy` image needs t...