[00:29:29] 00:27 <+icinga-wm> PROBLEM - Disk space on gitlab1001 is CRITICAL: DISK CRITICAL - free space: /mnt/gitlab-backup 0 MB (0% inode=99%): https://wikitech.wikimedia.org/wiki/Monitoring/Disk_space https://grafana.wikimedia.org/d/000000377/host-overview?var-server=gitlab1001&var-datasource=eqiad+prometheus/ops
[00:29:34] 00:28 < mutante> ^ arr. checking that
[00:29:37] 00:28 < mutante> it's "just" the backups but we made changes to avoid this
[00:29:40] 00:28 < mutante> the good part is.. it didn't take the service down because that's a dedicated mount
[01:09:40] GitLab, serviceops: gitlab1004 - puppet cert revoked? - https://phabricator.wikimedia.org/T309259 (Dzahn)
[01:09:58] GitLab, SRE, serviceops: gitlab1004 - puppet cert revoked? - https://phabricator.wikimedia.org/T309259 (Dzahn)
[01:12:07] GitLab, Data-Persistence-Backup, serviceops, Patch-For-Review, User-brennen: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (Dzahn)
[01:12:18] GitLab (Infrastructure), SRE, serviceops: gitlab1004 - puppet cert revoked? - https://phabricator.wikimedia.org/T309259 (Dzahn)
[01:12:52] GitLab (Infrastructure), Data-Persistence-Backup, serviceops, Patch-For-Review, User-brennen: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (Dzahn)
[01:13:30] GitLab (Infrastructure), serviceops, Patch-For-Review: gitlab-restore: version detection fail / restore fail - https://phabricator.wikimedia.org/T308089 (Dzahn)
[07:28:08] GitLab (Infrastructure), Data-Persistence-Backup, serviceops, Patch-For-Review, User-brennen: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (jcrespo) Yesterday's gitlab full backup was of only 42KB FYI. I would consider that a backup failure. ` id: 445024, ts: 2022-05-26 05:...
[16:57:14] mutante: thanks for looking at that last night. offhand i have no idea why backup size would have spiked, but i'll poke around...
[17:06:24] /var/opt/gitlab/gitlab-rails/shared/packages is smaller than it was before...
[18:08:57] it kind of looks like we have ~14G of stale backup data in gitlab1001:/srv/gitlab-backup, although i don't expect that would affect size of backups going to /mnt/gitlab-backup.
[18:40:00] brennen: thank you! I will get back into that soon (and also about the status of the Bacula backup and the rsync jobs)
[19:16:24] brennen: today's random thought -- what do we need to add to the account blocking hooks at wikitech to make it lock gitlab accounts similarly to the way that we lock gerrit and phabricator accounts today?
[19:16:50] * bd808 will look for a phab task and make a new one if needed
[19:17:34] i think we have a relevant task, one sec
[21:17:31] GitLab (Infrastructure), SRE, serviceops: gitlab1004 - puppet cert revoked? - https://phabricator.wikimedia.org/T309259 (Dzahn) Open→Resolved a:Dzahn Notice: /Stage[main]/Ferm/Service[ferm]/ensure: ensure changed 'stopped' to 'running' (corrective) Info: /Stage[main]/Ferm/Service[ferm]: U...
[23:14:02] GitLab (Infrastructure), SRE, serviceops: gitlab1004 - puppet cert revoked? - https://phabricator.wikimedia.org/T309259 (Dzahn) Now using this machine for https://gerrit.wikimedia.org/r/c/operations/puppet/+/800308 and setting it active in netbox.
[23:17:58] GitLab (Infrastructure), serviceops, Patch-For-Review: bring new gitlab hardware servers into production - https://phabricator.wikimedia.org/T307142 (Dzahn) set gitlab1004, gitlab-runner1002/1003/1004, gitlab-runner2002/2003/2004 from staged to Active status in netbox. because meanwhile they have act...
[23:22:55] GitLab (Infrastructure), Data-Persistence-Backup, serviceops, Patch-For-Review, User-brennen: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (Dzahn) useful link to see which repos use the most space, provided by Brennen: https://gitlab.wikimedia.org/admin/projects?sort=storag...
[23:26:40] GitLab (Infrastructure), Data-Persistence-Backup, serviceops, Patch-For-Review, User-brennen: Backups for GitLab - https://phabricator.wikimedia.org/T274463 (Dzahn) >>! In T274463#7959305, @jcrespo wrote: > Yesterday's gitlab full backup was of only 42KB FYI. I would consider that a backup fa...