[07:41:23] good morning folks
[07:41:38] elukey: good cumin!
[07:41:42] deneb's root is almost full, there are some home dirs that are big, can you review them?
[07:42:02] Cc: ottomata, akosiaris, godog, moritzm (top 4 in the list :)
[07:42:15] marostegui: good cumin cumin to you!
[07:43:27] elukey: I can free up 12M if needed
[07:53:32] ahahha <3
[07:53:39] (need to run an errand, bbl)
[07:58:41] elukey: will do
[07:59:23] I can free a whopping ... 2%
[08:00:33] something allocates about 8% once a week though
[08:01:59] https://grafana.wikimedia.org/d/000000377/host-overview?viewPanel=12&orgId=1&var-server=deneb&var-datasource=thanos&var-cluster=misc&from=now-30d&to=now that is
[08:10:22] are we still rebuilding the docker images every week?
[08:13:41] elukey: I'll clean some bits up now, but in general we're soon moving to a new build2001 host running bullseye :-)
[08:54:40] <_joe_> legoktm: yes, why?
[08:55:05] <_joe_> we always create a weekly version of the base images and of the production images
[08:55:10] That's probably the 8% every week go.dog was pointing out
[08:55:43] <_joe_> legoktm: yeah, I think I wrote a job to prune images afterwards but I never pushed it
[08:56:10] <_joe_> running docker image prune -a
[08:57:11] <_joe_> space is getting freed indeed
[08:57:53] <_joe_> build hosts should have terabytes of disk space FWIW
[08:58:27] <_joe_> Total reclaimed space: 88.74GB
[09:03:05] don't fill it all at once :)
[09:03:24] * Emperor sometimes misses Sanger-scale storage systems
[10:54:27] hello, could someone please puppet-merge a couple of patches I have for the WMCS integration project? I have applied them both and they work as expected: https://gerrit.wikimedia.org/r/c/operations/puppet/+/755713 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/755948/
[15:10:58] ^ done by akosiaris ;) thank you
[15:11:36] herron: merged your puppet change: switch legacy elk LVS entries to state: lvs_setup (1509b0bfd6) too
[15:11:42] akosiaris: ty!
[15:11:46] hashar: you're welcome!
[17:08:34] heads-up - the restbase reimage has run into the same problem on a few hosts that others in SRE have hit, where reimaging is impossible until the firmware is updated. There are 6 hosts that are now out of puppet but still running their base OS, with restbase/cassandra online and still pooled https://phabricator.wikimedia.org/T299652
[17:37:22] <_joe_> hnowlan: you can re-add them to puppet btw
[17:38:34] <_joe_> not sure you want to leave puppet disabled there beyond Monday
[18:07:23] btullis: puppet is broken on the prometheus servers: https://gerrit.wikimedia.org/r/c/operations/puppet/+/755971 `Error while evaluating a Resource Statement, Prometheus::Class_config[matomo_mysql_eqiad]: expects a value for parameter 'port'`
[20:20:49] btullis: I went ahead and reverted the patch to get puppet running again: https://gerrit.wikimedia.org/r/c/operations/puppet/+/755998
[21:02:17] cwhite: Oh, thanks for reverting. I'm sorry about that. I should have run a PCC against the prometheus servers.
[21:03:51] No worries, have a good weekend :)
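
A minimal sketch of the kind of weekly cleanup job _joe_ describes above (run after the weekly image builds, but never pushed). The script itself, the weekly cron/timer idea, and the 168h retention filter are assumptions for illustration, not the actual Wikimedia job; the only command taken from the log is docker image prune.

    #!/bin/bash
    # Hypothetical post-build image cleanup for the docker build host.
    # Meant to run once a week (e.g. from a cron entry or systemd timer)
    # after the weekly base/production image builds finish.
    set -euo pipefail

    # Remove images that no container references and that are older than a week,
    # so the freshly built weekly images survive. --force skips the confirmation prompt.
    docker image prune --all --force --filter "until=168h"

Run manually without the filter, as in the log above (docker image prune -a), it reclaims everything unused at once, which is what freed the 88.74GB on deneb.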