[06:35:42] <_joe_> ottomata: I don't want to know what image requires 50 GB of space to build
[08:40:07] <_joe_> moritzm, jbond build2001 fails to clean up the package building chroots it seems
[08:40:13] <_joe_> CRITICAL - degraded: The following units failed: package_builder_Clean_up_build_directory.service
[08:40:19] <_joe_> firing since 17 days
[08:42:22] I'll have a look in a bit
[08:51:05] prospector is failing to run for me on operations/cookbooks (any branch), anyone has the same issue? (TypeError: string indices must be integers)
[08:51:10] ^it makes ci fail too
[08:54:56] dcaro: checking
[08:57:04] probably pylint release
[08:57:55] yep, pinning pylint to 2.15.5 works
[08:58:32] yes that's clearly the newest pylint in prospector
[08:58:37] 2.15.6 works too
[08:59:10] dcaro: you're lucky: "... released this 18 minutes ago"
[08:59:13] the prospector release
[08:59:29] xd
[08:59:57] let me pin prospector to 1.7.7 and we wait that they fix all the issues in 1.8
[09:02:54] ah no, they relaxed the deps so it's not prospector's fault this time
[09:03:09] <_joe_> volans: I will avoid kicking you while you're down
[09:03:10] so less lucky dcaro , released 2 days ago :D
[09:03:34] _joe_: it's pylint this time... so you can't :D
[09:03:37] <_joe_> dcaro: don't believe his lies, I have experienced that every time I tried to patch either cookbooks or spicerack
[09:04:10] <_joe_> it's 15 different linters, with picky but different dependencies, changing rules constantly
[09:04:18] give me nightly CI builds and I'll fix them before you hit them
[09:04:40] <_joe_> you can write a script that does that
[09:04:41] and brittle shared deps xd
[09:04:58] <_joe_> so at any given time you can have 5 failure modes unrelated to your change that need fixing
[09:05:07] <_joe_> that surely encourages contributing to spicerack for instance!
[09:05:38] so, if this was avoid kicking me...
[09:07:56] <_joe_> volans: I was avoiding until you provoked me with pylint :P
[09:08:47] <_joe_> I mean I gave you this feedback in serious form too, but given how much you love all your precious linters, it's unavoidable to turn to sarcasm :D
[09:10:27] dcaro: https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/862830
[09:15:24] dcaro: merged, sorry for the trouble, I'll be opening a bug upstream as I don't see one right now
[09:27:26] np, thanks!
[09:52:17] dcaro, _joe_: so pylint was released 2 days ago with a change to a private method and prospector uses that private method (both repos are in the same GH organization). Prospector has fixed that bug in master and also released 1.8.0 with the fix, but it's not yet on PyPI.
[09:56:50] (to be precise prospector extends a pylint class and overrides a private method that has changed signature)
[10:06:17] I'm seeking kind souls for a quick review of https://gerrit.wikimedia.org/r/c/operations/puppet/+/862838 -- thank you!
[10:07:30] looking
[10:08:59] cheers
[10:10:17] host is basically ready for decom, I'm giving it some grace until next week in case issues pop up
[10:11:53] +1d
[10:15:04] thank you!
[10:41:29] volans _joe_ FYI I'd like to go ahead with setting thanos-web service to production with https://gerrit.wikimedia.org/r/c/operations/puppet/+/862843
[10:41:58] <_joe_> godog: I think it would be wise to wait for monday :P
[10:42:05] <_joe_> jokes aside, thanks for the heads up :)
[10:42:16] same
[10:42:54] haha! cheers, will go ahead
[10:42:57] "I think it would be wise to wait for the new guy to be on-call" THANKS.
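Note on the 09:52 / 09:56 explanation above: a minimal sketch of how a subclass that overrides a private method can break when the upstream class changes what it passes to that method. Every class and function name here is hypothetical; this is not pylint's or prospector's real code, only an illustration of the general failure mode that surfaces as a `TypeError` like the one quoted at 08:51.

```python
class UpstreamV1:
    """Hypothetical 'old' upstream class: passes a dict to its private hook."""

    def report(self, msg_id, text):
        return self._render({"id": msg_id, "text": text})

    def _render(self, message):
        return f'{message["id"]}: {message["text"]}'


class UpstreamV2:
    """Hypothetical 'new' upstream class: the private hook now gets a string."""

    def report(self, msg_id, text):
        return self._render(f"{msg_id}: {text}")

    def _render(self, line):
        return line


def make_wrapper(base):
    """Build a downstream subclass that overrides the private hook.

    The override was written against UpstreamV1 and assumes a dict argument.
    """

    class Wrapper(base):
        def _render(self, message):
            return message["text"].upper()

    return Wrapper


def try_report(base):
    """Run the downstream wrapper against a given upstream version."""
    try:
        return make_wrapper(base)().report("W0001", "unused import")
    except TypeError as exc:
        # Indexing a str with a str key raises TypeError.
        return f"broken: {type(exc).__name__}"


print(try_report(UpstreamV1))
print(try_report(UpstreamV2))
```

Because the hook is private, upstream has no obligation to keep it stable, which is why pinning the upstream version (as done at 08:57) unblocks things until the downstream override is updated.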
[10:43:09] * volans turns off the pager
[10:43:19] lol
[10:43:45] ok merging, hold my probe I'm going in
[10:43:48] Y'all disgust me, I'm getting a coffee
[10:53:13] all good btw, I'll go ahead with https://gerrit.wikimedia.org/r/c/operations/dns/+/862356 too
[12:02:22] If someone has time to give a quick +1 for a new periodic job on mwmaint: https://gerrit.wikimedia.org/r/c/operations/puppet/+/861813/
[12:10:01] Can I get a quick review https://gerrit.wikimedia.org/r/c/operations/puppet/+/862860?
[12:11:53] marostegui: +1ed, assuming you got the key securely ;)
[12:12:07] volans: Still working on that, not pushing it until it is done :)
[12:12:28] :)
[13:22:58] _joe_: i'm not sure either, it fails pretty early on in the apt-get update and install steps, even after I prune everything (maybe I didn't prune correctly?) The resulting images are < 1G
[14:00:03] <_joe_> ottomata: what are you doing, exactly?
[14:02:43] _joe_: still very very WIP, still learning things, so be nice OKAY?!?!?
[14:02:43] https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356
[14:03:08] <_joe_> me? nice?
[14:03:10] <_joe_> aha
[14:03:15] hhah
[14:03:51] <_joe_> oh dear that's a lot of stuff, where does this fail specifically?
[14:04:45] well it's not failing anymore since I increased my docker disk size, but IIRC it was failing at this apt_install step https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/6/images/flink/Dockerfile.template#15
[14:05:10] <_joe_> uh that doesn't seem probable
[14:05:25] <_joe_> ok I must ask
[14:05:27] yeah, doesn't seem like a particularly bad apt_install,
[14:05:32] <_joe_> why buster and not bullseye?
[14:05:37] (btw I am in a meeting rn...)
[14:06:16] <_joe_> ohhh sorry
[14:20:38] _joe_: when i started i had some issue with pulling bullseye, but i was probably just doing something wrong. but also, i think pyflink works better with python 3.7...but if we can do bullseye would def prefer that
[14:20:59] <_joe_> ack makes sense to start with buster
[14:21:12] <_joe_> also, ugh, a python package that is not forward compatible?
[14:21:14] will revisit and try bullseye again
[14:21:21] i think it might work with 3.9
[14:21:22] ?
[14:22:46] ya, it does say 3.9 is fine: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/python/installation/, which i think is what bullseye has so maybe it's fine?
[14:24:05] _joe_: btw your patch failure... that's what you get after your earlier sarcasm ;) https://github.com/PyCQA/prospector/issues/545
[14:24:59] <_joe_> volans: I just got a line over 120 chars, because that file is not formatted with my standard formatter (black)
[14:25:12] <_joe_> and I'm not used to check myself anymore :P
[14:25:19] and prospector failure ;)
[14:26:00] <_joe_> ah yeah well, it means you will have to merge that patch when prospector is fixed, unless... I just don't care and remove the stupid CI vote
[14:33:35] <_joe_> volans: seriously, it's not acceptable that it happens so often that it's impossible to merge changes in these repositories
[14:33:41] <_joe_> I joke about it but it's just wrong
[14:44:48] yes it's unpleasant at times but I don't think it's fair to say that it is impossible to merge things, there have been 95 commits merged into this repo in the last 3 months.
[14:45:27] I don't have the bandwidth to freeze all test deps for each repo I manage and then every month go over all of them to see what needs to be updated at the moment
[14:45:31] happy to hear better alternatives
[14:45:47] <_joe_> remove prospector.
[14:46:00] <_joe_> for that repo specifically, I think it adds about zero.
[14:46:51] it runs pylint
[14:47:06] that does much more than flake8 and I think it's useful
[14:47:33] ofc we can setup pylint separately, but pylint is a good source of the issues
[14:48:30] anyway I'm sending a patch to unblock
[14:48:44] <_joe_> we can just ignore CI
[14:48:49] <_joe_> we know that patch would pass
[14:59:03] all done and rebased your patch
[17:17:03] jbond: did you change anything on the idp manifests? (puppet started failing on the cloud one complaining about apereo_cas.production.oidc_endpoint)
[17:17:36] dcaro: yes sorry let me send a quick fix
[17:17:53] 👍
[17:17:56] thanks!
[17:18:32] this looks like it's missing a quote? https://gerrit.wikimedia.org/g/operations/puppet/+/bac601c5edd4eae7e4bd1c533e5ce81192742b7b/hieradata/cloud/eqiad1/sso/common.yaml#8
[17:22:59] dcaro: nice spot, here is the full fix https://gerrit.wikimedia.org/r/c/operations/puppet/+/863005
[18:08:56] small fix for a UBN issue on the api gateway if anyone has a sec https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/863013/ (envoy config but very simple to grok)
[18:20:01] hnowlan: just saw this now but yep, fix LGTM
[18:20:24] rzl: thanks!
[21:38:09] hmmm, decom cookbook is hiccupping on mw1312: https://phabricator.wikimedia.org/P42203
[21:38:34] confirmed I can't talk to mw1312.mgmt via ipmitool, but ipmi-chassis on the host works fine
[21:43:07] ah I see https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Reclaim_to_Spares_OR_Decommission mentions the workaround, I'll just continue the cookbook and then proceed that way
[22:09:52] Hello!! I'm working on a change to add a user and I get the following error in the CI:
[22:09:52] The following users are members of a group but don't exist: abartov
[22:09:53] I just reviewed the production branch and the user 'abartov' is there, I also managed to validate that their ID matches what we have on LDAP.
[22:09:53] Do you know why this error could be happening??
[22:10:05] This is the output of my change. :)
[22:10:05] https://integration.wikimedia.org/ci/job/operations-puppet-tests-buster-docker/55482/console
[22:13:46] BTW, this is my change and everything looks correct to me 🙈: https://gerrit.wikimedia.org/r/c/operations/puppet/+/860132
[22:14:37] denisse: your change accidentally moved the `ldap_only_users:` line above abartov
[22:15:22] rzl: I can't believe I missed that, my bad.
[22:15:41] happens all the time :)
[22:16:35] * denisse loves CI. <3
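Note on the 22:09 CI error: a rough sketch of why moving the `ldap_only_users:` line above an entry can produce "members of a group but don't exist". It assumes the user data is a mapping with `groups`, `users` and `ldap_only_users` sections; only `ldap_only_users` is named in the log, so the other key names, the group name, the uid values and the check function are hypothetical, and the real CI check may well work differently.

```python
def missing_group_members(data):
    """Return group members that are not defined under the 'users' section."""
    known = set(data.get("users", {}))
    missing = set()
    for group in data.get("groups", {}).values():
        missing.update(m for m in group.get("members", []) if m not in known)
    return sorted(missing)


# Intended layout: abartov sits under 'users' (uids are placeholders).
good = {
    "groups": {"example-group": {"members": ["abartov"]}},
    "users": {"abartov": {"uid": 1001}},
    "ldap_only_users": {"someone": {"uid": 1002}},
}

# After the section header moves up one entry, abartov is parsed as part
# of 'ldap_only_users' instead, even though the entry itself is unchanged.
bad = {
    "groups": {"example-group": {"members": ["abartov"]}},
    "users": {},
    "ldap_only_users": {"abartov": {"uid": 1001}, "someone": {"uid": 1002}},
}

print(missing_group_members(good))
print(missing_group_members(bad))
```

The user's entry still looks correct in the diff, which is why the mistake is easy to miss: only the section it belongs to has changed.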