[04:49:19] _joe_: I've left an analysis on T346971 and T349376. Would appreciate feedback on the approach and/or other ideas you might have.
[04:49:20] T346971: Uncaught ConfigException: Failed to load configuration from etcd - https://phabricator.wikimedia.org/T346971
[04:49:20] T349376: EtcdConfig using stale data: lost lock in /srv/mediawiki/php-1.42.0-wmf.1/includes/config/EtcdConfig.php on line 218 - https://phabricator.wikimedia.org/T349376
[06:17:23] <_joe_> Krinkle: thanks, that's excellent work (I only read the analysis on the latter)
[08:04:28] Emperor: moved the discussion to https://phabricator.wikimedia.org/T350658
[08:22:48] godog: cool, will update there in a bit, but I think my conclusion from yesterday's discussion was a) file bug against apt b) if a suitable CI variable (say BACKPORTS=yes) is set, adjust priority of -backports to 500
[08:23:06] on-call notwithstanding, can hopefully get you an MR to look at today
[08:27:18] ack
[08:38:11] I'm going to start merging a few netbox script related changes, please hold on any re-image if possible in the next 10min (not a big deal if you can't)
[09:24:26] that's done btw
[09:24:38] now I'm removing some old IPs and the dns cookbook might complain for a bit
[09:53:01] Emperor: re: backports in wmf-debci, I was thinking since the suite picking mechanism is based on changelog entries now, something intuitive would be to keep that and have e.g. buster-backports images so using 'buster-backports-wikimedia' (or buster-backports) in changelog would do the right thing, what do you think ?
[09:54:11] <_joe_> godog: I'm not sure that maps 1:1 what we do now
[09:55:08] <_joe_> we do have things that are in another suite that do need to use backports; I'd like to be able to control the behaviour using an env variable in my build declaration
[09:55:38] mhh do you have an example?
[09:55:56] the build declaration in this case is the CI config, i.e. it would apply to all suites
[09:59:54] enabling backports regardless of suite might be ok though, not a huge deal I'd imagine
[10:07:28] godog: I think that would be wrong - we don't in our apt repo have a wikimedia-bullseye-backports suite (for example)
[10:07:45] and the suite specified in d/changelog should match the suite it's being built for
[10:08:16] We have wikimedia-bullseye (regardless of whether the package is a backport, a local-build, ...)
[10:10:19] that's fair yeah
[10:10:46] If you wanted more control we could have suite-specific variables to specify backports use, but that seems like more complexity than we want? [open to argument on that point, changing the code to look for if [ ${SUITE}-BACKPORTS ] isn't hard, but then your variable is called bookworm-BACKPORTS which is a bit ugly]
[10:11:44] ok let's go with USEBACKPORTS, seems the path of least resistance
[10:15:38] kicked off a new build at https://gitlab.wikimedia.org/repos/sre/alerts-triage/-/pipelines/30821
[10:16:07] if it turns out we do want suite-specific as well we can do that subsequently.
[10:17:18] I think the shell used is bash, so we could even do ${SUITE^^}-BACKPORTS. I think.
[10:18:15] godog: Hm, something still not happy
[10:18:50] [did you set a suitable variable? I don't think the ui tells me]
[10:20:27] Emperor: I did set it in the project variables, though it defaulted as protected, I've unprotected it and kicked off another build
[10:21:02] https://gitlab.wikimedia.org/repos/sre/alerts-triage/-/pipelines/30822 that is
[10:23:22] which also didn't work, the variable I've set shows up in 'variables' at https://gitlab.wikimedia.org/repos/sre/alerts-triage/-/settings/ci_cd
[10:25:22] it might be a case of PEBKAC and I'm not seeing it
[10:26:09] FWIW I'm not blocked by this at the moment, I've built the package on build2001 for the moment
[10:29:06] godog: I think the right answer is to use WIP.yml and shove a bunch of debugging in to figure out what apt is unhappy about. Shall I have a look at that and see what I find, or do you want to?
[10:29:25] [I can repro the failure, it's not an error in your variable setting]
[10:30:24] Emperor: ack, thank you, happy to assist though currently I don't have the bandwidth to debug further
[11:20:42] ah, got it - echo was defaulting to not expanding \n
[11:26:18] MR with fix now available :)
[11:30:49] g.odog: I'll put the CI/CD for alerts-triage back to builddebs.yml once that MR is merged
[11:56:22] Emperor: nice, thank you for the debugging
[11:56:51] this is probably the universe telling me I should be using printf instead of echo in future...
[14:59:37] Finally latency has recovered https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red?orgId=1&from=now-24h&to=now&refresh=1m
[15:14:09] apropos the deficiency in apt's priority system, I've filed https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1055507 (perhaps too much detail there, but I hope it's clear)
[15:23:56] btullis: https://phabricator.wikimedia.org/T347938 in case you missed it - as you've been working on similar tickets today
[15:24:25] marostegui: Excellent, thanks.
[16:08:32] <_joe_> cwhite: I have a series of patches to install the statsd exporter on k8s as a sidecar; starting at https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/972340/2
[16:13:51] <_joe_> I'd like to merge them tomorrow, so your mediawiki-config patch can also be tested on k8s
[16:15:50] not sure why but my puppet-merge on puppetmaster1001.eqiad.wmnet is timing out, I'm investigating...
[16:20:23] never mind, pebkac problem, I ran puppet-merge on puppetserver1001, which doesn't work
[16:20:45] it timed out instead of yelling at you? :)
[16:21:05] yes, and that should probably be fixed
[16:21:24] _joe_: thanks! patch series looks good to me :)
[17:02:35] jhathaway: re puppet-merge - https://gerrit.wikimedia.org/r/c/operations/puppet/+/972423
[17:03:12] thanks jbond!
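Editorial aside on the echo issue reported at [11:20:42] and the printf remark at [11:56:51]: a minimal sketch of how a backports pin could be written with printf so that the newlines are expanded regardless of the shell's echo behaviour. The function name backports_pin, the SUITE default, and the choice of matching on codename (n=) are assumptions for illustration only; this is not the actual wmf-debci fix.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from wmf-debci): print an apt preferences
# stanza that raises <suite>-backports to priority 500.

backports_pin() {
    local suite="$1"
    # printf always interprets \n in its format string, whereas echo's
    # handling of backslash escapes differs between shells (bash's builtin
    # echo leaves \n unexpanded unless -e or xpg_echo is in effect).
    printf 'Package: *\nPin: release n=%s-backports\nPin-Priority: 500\n' "$suite"
}

# Only emit the stanza when the CI variable discussed above is set.
# SUITE is assumed to be provided by the build environment.
if [ -n "${USEBACKPORTS:-}" ]; then
    backports_pin "${SUITE:-bookworm}"
fi
```

With USEBACKPORTS set, the output could be redirected to a file under /etc/apt/preferences.d/ before the build installs its dependencies; the exact file name and the stage at which that happens are left open here.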
[17:10:17] np
[17:50:56] I have a question for those that maintain wmf python packages about upgrading them to bookworm
[17:51:16] I am having issues maintaining compatibility with CI, given it runs on buster images
[17:51:35] (or at least, it does by default)
[17:52:26] I can update my tox to run ok locally, but I am unsure how to maintain compatibility at the same time with buster
[17:53:59] if the problem is just tox it's easy to maintain compatibility even between tox 3 and tox 4
[17:54:22] if you're talking about the compatibility of your python code, depends on which python version you make tox run
[17:54:26] volans: could you give an example on how you did it, I would love to read it
[17:55:05] I mostly copied and pasted what you did 0:-D
[17:55:16] AFAIK https://gerrit.wikimedia.org/r/plugins/gitiles/operations/cookbooks/+/refs/heads/master/tox.ini works fine in CI and I have tox4 locally
[17:55:39] same for spicerack, etc.. (I don't recall if I have migrated them all already)
[17:55:50] sorry gotta run right now, I can reply eventually later on
[17:57:51] technically it is not tox, but to tooling I am using may be quite old
[17:57:55] *the