[00:34:57] * bd808 off
[09:09:27] morning
[09:10:01] o/
[09:10:34] * arturo waves from kubecon
[09:36:01] dcaro: hey, blancadesal just found that we seem to be missing charts/images in harbor for current versions of toolforge-deploy
[09:36:19] they seem to be superseded by new charts/images from MRs
[09:36:38] can you elaborate?
[09:36:48] we may need to tune the storage policy for artifacts
[09:36:56] a bit more?
[09:37:11] slavina says we are only keeping 10 artifacts in harbor for a given project (i.e. jobs-api)
[09:37:15] as in, for production, everything should have the exact same version that's defined in toolforge-deploy
[09:37:31] and should be coming from tools-harbor
[09:37:46] for toolsbeta, we should have a similar thing, but it should be coming from toolsbeta-harbor
[09:38:06] lima-kilo pulls from toolsbeta, and is not finding the latest chart version there
[09:38:09] for local, it can be anything (I try to leave it as toolsbeta has it, but sometimes it isn't there)
[09:38:53] okok, that's way less scary than "we are missing images in harbor" xd
[09:39:10] can you open a task with the details?
[09:40:57] hmm, now that we don't need to manually change the toolforge-deploy code to deploy MRs in lima-kilo, we can have the same images in toolsbeta and local (no need to put MR images anywhere)
[09:43:24] can you elaborate?
[09:45:53] dcaro: when setting up lima-kilo from scratch, it's trying to get the chart version from toolforge-deploy, but as we're only keeping the last 10 artifacts, that chart may or may not be there any longer (because each MR adds its own artifacts)
[09:46:36] blancadesal: which project? that means that the local.yaml* was not bumped correctly by the script that creates the MR
[09:46:59] jobs
[09:47:09] I redeployed the whole lima-kilo yesterday without issues (iirc)
[09:47:16] let me look
[09:48:35] arturo: there are three value files: local, toolsbeta and tools. tools uses tools-harbor and the image from the latest merge to the project, toolsbeta uses the same thing but coming from toolsbeta-harbor, and local should be using the same as toolsbeta (I'd be easy to convince that it should use tools-harbor instead)
[09:49:10] this is the one it tries to download: 0.0.269-20240315104203-49ed38c0
[09:49:11] dcaro: I think the problem is toolforge-deploy references charts/images that no longer exist in harbor
[09:49:12] until not long ago, the way to deploy an MR was to manually change the local.yaml file to pull the MR-generated image/chart, which might mean creating a commit with the change
[09:49:57] now we have toolforge_mr_deploy that does it on the fly, so there's no need to commit MR-specific changes in the local files
[09:50:04] blancadesal: from toolsbeta-harbor?
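The failure described above boils down to "the chart version pinned in a toolforge-deploy values file has been expired from Harbor by the retention policy". A minimal sketch of how one could verify that from a script is below; it assumes a values-file layout with a top-level `chartVersion` key, a Harbor project called `toolforge` and a repository called `jobs-api/chart`, and the toolsbeta-harbor URL — none of those names are confirmed by the log, only the Harbor v2 REST API paths are standard.

```python
#!/usr/bin/env python3
"""Hedged sketch: check that the chart version pinned in a toolforge-deploy
values file is still present in Harbor. The file layout and project/repository
names are illustrative assumptions, not the real ones."""
import sys
import urllib.parse

import requests
import yaml

HARBOR_API = "https://toolsbeta-harbor.wmcloud.org/api/v2.0"  # assumed URL


def chart_exists(project: str, repository: str, tag: str) -> bool:
    # Harbor's v2 API serves artifacts under
    # /projects/{project}/repositories/{repo}/artifacts/{reference};
    # slashes in the repository name must be double URL-encoded.
    repo = urllib.parse.quote(urllib.parse.quote(repository, safe=""), safe="")
    resp = requests.get(
        f"{HARBOR_API}/projects/{project}/repositories/{repo}/artifacts/{tag}",
        timeout=10,
    )
    return resp.status_code == 200


def main() -> int:
    # Assumed values-file layout: a top-level "chartVersion" key.
    with open(sys.argv[1]) as fd:
        values = yaml.safe_load(fd)
    tag = values["chartVersion"]
    if chart_exists("toolforge", "jobs-api/chart", tag):
        print(f"chart {tag} found")
        return 0
    print(f"chart {tag} is MISSING (likely expired by the retention policy)")
    return 1


if __name__ == "__main__":
    sys.exit(main())
```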
[09:50:42] https://www.irccloud.com/pastebin/s0F19sN3/
[09:51:12] exactly
[09:51:19] that chart version doesn't exist in the repo
[09:51:22] https://usercontent.irccloud-cdn.com/file/jJOJ6oBv/image.png
[09:51:34] it's missing the tag immutability rules
[09:51:35] https://usercontent.irccloud-cdn.com/file/TMoigRam/image.png
[09:53:55] imo, the problem is that we're retaining only the 10 latest artifacts, and right now for jobs that's only the currently open MRs
[09:54:41] with immutability, we would be keeping the latest 10 MR artifacts + the 10 last merged artifacts (like prod), maintain-harbor takes care of that
[09:55:13] ok
[09:55:16] there's always going to be a number of MRs that will displace the merged artifacts, unless you treat them differently
[09:55:38] that makes sense
[09:56:06] so, could we maintain those cleanup rules in code somewhere?
[09:57:29] maintain-harbor would be the place, though those policies only apply to the toolforge project
[09:58:40] (it currently handles policies for tool projects)
[09:59:51] yeah, maybe configure those policies at harbor deploy time?
[10:00:06] francesco is suggesting we use opentofu hehe
[10:00:19] i'll create a task
[10:03:00] T360509
[10:03:01] T360509: [harbor] Find a way to manage toolforge project policies with code - https://phabricator.wikimedia.org/T360509
[10:03:52] arturo: the problem with doing it at deploy time is that you can't change the policies later (e.g. change the quota of a project, the number of artifacts retained, etc.)
[10:13:43] I wonder if we can do it via puppet
[10:14:43] BTW what do you think about updating the harbor public FQDN? from tools-harbor.wmcloud.org to `harbor.svc.toolforge.org`?
[10:17:18] it needs some changes in the harbor config (it has the external name defined there), and a restart
[10:17:28] otherwise you get weird errors when trying to log in
[10:17:36] (anonymous still works iirc though)
[10:17:57] what would be the toolsbeta one?
[10:21:53] blancadesal: for now I've manually pushed the missing tags to toolsbeta (so lima-kilo should work again)
[10:22:19] dcaro: The toolsbeta project uses *.svc.beta.toolforge.org for an equivalent purpose. (from https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/DNS#*.svc.toolforge.org)
[10:23:14] They should not be removed again https://www.irccloud.com/pastebin/DTQj9RqF/
[10:24:20] would be nice to have *.svc.toolsbeta.org (easier to remember and makes a nice parallel) ¯\_(ツ)_/¯
[10:24:44] dcaro: thanks :)
[10:33:51] I just bought the toolsbeta.org domain, just in case
[10:34:57] :))
[10:44:22] now resell it to the WMF for $1K USD -- good business eye :-P
[10:46:55] I'm not sure if I'd be allowed to donate it either xd, I can change the NS entries at least
[11:07:57] taavi: can I merge https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/60 ? I want to start merging things as PRs start piling up
[11:08:21] dcaro: please do, sorry about forgetting that
[11:08:28] np
[12:16:55] * dcaro lunch
[13:43:32] andrewbogott: I am seeing puppet failures due to this:
[13:43:33] Error: /Stage[main]/Puppetmaster::Scripts/File[/usr/local/bin/puppet-facts-export]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/profile/puppetserver/scripts/puppet-facts-export-nodb.sh
[13:43:39] seemingly related to your patch?
[13:44:14] yes, probably related. Looking...
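For the "manage toolforge project policies with code" idea tracked in T360509, the sketch below shows one possible shape: ensuring a tag-immutability rule exists on the `toolforge` Harbor project via the Harbor v2 REST API. The selector payload, the `X-Is-Resource-Name` header usage, the credentials and the suggestion that maintain-harbor would drive this are assumptions for illustration; they do not describe what maintain-harbor does today.

```python
#!/usr/bin/env python3
"""Hedged sketch of policy-as-code for T360509: make sure the "toolforge"
Harbor project has a tag-immutability rule covering every repository."""
import requests

HARBOR_API = "https://tools-harbor.wmcloud.org/api/v2.0"  # assumed URL
PROJECT = "toolforge"
AUTH = ("robot$policies", "...")  # placeholder credentials

# Assumed rule: match every repository and every tag in the project.
WANTED_RULE = {
    "disabled": False,
    "scope_selectors": {
        "repository": [
            {"kind": "doublestar", "decoration": "repoMatches", "pattern": "**"}
        ]
    },
    "tag_selectors": [
        {"kind": "doublestar", "decoration": "matches", "pattern": "**"}
    ],
}


def ensure_immutability_rule() -> None:
    base = f"{HARBOR_API}/projects/{PROJECT}/immutabletagrules"
    # Tells Harbor the path parameter is the project name rather than its id.
    headers = {"X-Is-Resource-Name": "true"}
    existing = requests.get(base, auth=AUTH, headers=headers, timeout=10)
    existing.raise_for_status()
    if existing.json():
        print("an immutability rule is already present, nothing to do")
        return
    resp = requests.post(
        base, json=WANTED_RULE, auth=AUTH, headers=headers, timeout=10
    )
    resp.raise_for_status()
    print("created immutability rule")


if __name__ == "__main__":
    ensure_immutability_rule()
```

Keeping this kind of idempotent "ensure" logic in a tool run on a schedule (rather than only at deploy time) would also sidestep the concern raised above about not being able to change quotas or retention counts later.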
[13:44:25] on cloud-puppetmaster-03.cloudinfra.eqiad.wmflabs for example
[13:45:03] yeah, the puppetmaster profile pulls that file from the puppetserver subdir -- I thought there were two copies.
[13:45:06] I'll just restore the file
[13:45:20] sgtm
[16:01:02] neat, my cookbook worked correctly on the first try :D
[16:06:09] \o/
[17:19:56] * dcaro off
[17:19:59] cya tomorrow
[17:53:36] * bd808 lunch
[23:01:54] My reading of https://redis.com/blog/redis-adopts-dual-source-available-licensing/ is that we need to rip Redis out of Toolforge when the current versions we can get from Debian go EOL.
[23:02:12] "Under the new license, cloud service providers hosting Redis offerings will no longer be permitted to use the source code of Redis free of charge."
[23:03:08] unless someone does a fork, yes. :///
[23:07:21] "Third, changing the license term to protect one's brand and IP has become a natural part of the evolution of many open source projects in order for commercial entities which back those technologies to survive and thrive as businesses." -- barf
[23:09:14] Do yinz think the same would be needed for quarry?
[23:11:04] At some point yes, if only because neither of the source-available licenses they have chosen is OSI approved. So at some point in the future there will not be a release compatible with the WMCS TOU.
[23:11:42] Quarry uses Redis as a shared queue if I remember correctly?
[23:12:22] I'm not sure what a shared queue is. It's used for login information. Not sure if it is used beyond that
[23:12:36] At any rate I'll open a ticket
[23:13:11] ah, I thought I remembered that the worker fan out in Quarry was Redis backed. Front end pushes to a queue, backends pop and process.
[23:13:38] It might do that, I've never looked at that bit of it
[23:15:32] We have a lot of time before the current Debian released Redis versions under the BSD license are EOL. A new solution will arise; it's just annoying at this point to keep seeing the same rug pull over and over and over
[23:17:25] To their credit the Redis folks are clearly stating that they are not Open Source software anymore because they have broken from OSI approved licensing. 500 demerits, but one credit back
[23:25:19] https://github.com/Snapchat/KeyDB claims to be a drop in replacement -- "KeyDB maintains full compatibility with the Redis protocol, modules, and scripts."
[23:26:07] >A fork allows us to explore this new development path and implement features which may never be a part of Redis.
[23:26:11] heh, they definitely won't be now
[23:26:17] >KeyDB keeps in sync with upstream Redis changes, and where applicable we upstream bug fixes and changes.
[23:26:18] Or that
[23:28:34] https://github.com/Snapchat/KeyDB/issues/420
[23:28:35] >https://github.com/Snapchat/KeyDB/issues/420
[23:28:36] >can you please confirm (or deny and if whats the timeline) KeyDB is dead?
[23:59:16] * bd808 runs a local overnight test of wikibugs backed by keydb
[23:59:25] * bd808 off
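A minimal sketch of the kind of drop-in test mentioned at the end of the log: because KeyDB speaks the Redis protocol, an unmodified redis-py client should work against it by only changing the host/port it connects to. The port (6380) and the idea of running KeyDB alongside an existing Redis are assumptions for the example; the queue operations just stand in for what a consumer like wikibugs might exercise.

```python
#!/usr/bin/env python3
"""Smoke-test a KeyDB instance with the standard redis-py client."""
import redis

# Point the ordinary Redis client at a locally running KeyDB instance
# (assumed to be listening on port 6380 so it does not clash with Redis).
client = redis.Redis(host="localhost", port=6380, decode_responses=True)

# Exercise the list-as-queue primitives a producer/consumer setup relies on.
client.rpush("test:queue", "hello", "world")
assert client.lpop("test:queue") == "hello"
assert client.lpop("test:queue") == "world"
print("basic queue operations work against KeyDB")
```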