[09:36:21] hi
[09:36:43] tools-legacy-redirector-2.tools.eqiad1.wikimedia.cloud has been emailing about puppet issues for a while
[09:42:05] that would be david, working on puppetization
[09:44:33] that's me yep, I can enable puppet as currently it does not really touch that config file xd
[10:31:28] quick review when someone has a moment between tasks: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1126511
[10:55:31] another one https://gerrit.wikimedia.org/r/c/operations/puppet/+/1126520
[10:57:37] dhinus: for the k8s 1.29 golang cli updates, you plan to deploy the MRs before the upgrade, right? (so I test with k8s 1.28)
[10:58:46] yes, I would test all of them with 1.28
[10:59:21] ack
[11:00:03] if everything works, I would deploy all of them to toolsbeta and then tools while still on 1.28, but let me know if you have any concerns
[11:02:36] not really, I think it'll go ok, just wanted to test in lima-kilo
[11:29:22] in -cloud there is a request to rename a developer account, what is the current policy about that? is it even a WMCS thing anymore?
[11:36:43] I have no idea, but I suspect that it is not possible to rename an account
[11:39:35] https://wikitech.wikimedia.org/wiki/SRE/LDAP/Renaming_users
[11:41:51] there you go
[11:42:48] I feel like we should have a help/FAQ page for end users saying this
[11:44:01] we may want to add a note here: https://www.mediawiki.org/wiki/Developer_account
[12:07:14] added a note there ^
[12:11:39] thanks
[12:17:29] of course I messed up the translate tags :/
[13:47:13] dhinus, hey, if you have a few spare moments, we could use your help
[13:47:53] turns out, chuckonwu is having some difficulties getting inside the toolsbeta bastion by ssh, and it seems their macOS system is somehow refusing to resolve the bastion FQDN
[13:48:12] hey, sure, let me see if I can reproduce
[13:48:18] dig @8.8.8.8 +short toolsbeta-bastion-6.toolsbeta.eqiad1.wikimedia.cloud
[13:48:24] this works in their terminal
[13:48:32] works on mine too
[13:48:57] dig toolsbeta-bastion-6.toolsbeta.eqiad1.wikimedia.cloud
[13:48:58] and I can ssh to that hostname
[13:49:00] what about that one?
[13:49:14] toolsbeta-bastion-6.toolsbeta.eqiad1.wikimedia.cloud. 60 IN A 172.16.7.63
[13:49:35] maybe it is their network provider
[13:49:49] weird
[13:50:18] you could try going to the macos settings, network, wifi, details, DNS
[13:50:36] and add 8.8.8.8 as a DNS server there
[13:50:48] this should override whatever DNS your provider uses
[13:52:13] chuckonwu: ^^^
[16:26:24] dhinus: fyi. just received the reprepro emails with the new 1.29 k8s packages being imported :)
[16:30:02] dcaro: ack, I did run the update script manually, not sure if they would have updated automatically
[16:30:19] I think they do yep (only for updates, not removals)
[16:30:24] ack
[17:07:39] * arturo offline
[17:32:05] quick review https://gerrit.wikimedia.org/r/c/operations/puppet/+/1126597
[17:37:09] dcaro: I've deployed all "bump to..." patches in toolsbeta using the deploy cookbook
[17:37:21] nice
[17:37:35] it's a bit tricky to keep track of all the versions, maybe it's easier to deploy only one component, then deploy the same component to tools, then merge the MR
[17:38:21] otherwise now I have a bunch of updated components, but there are also other bumped versions that were unrelated to my changes and I'm not sure what combination I'm testing
[17:39:10] oh, so something was merged while you deployed in toolsbeta?
[17:39:39] hmm I'm not sure, that's my point :) how do I check?
[17:39:44] hmm, yep, you'd have to try to do that test "atomically" kind of, or it gets complicated
[17:39:55] git log on toolforge-deploy (or merged MRs in the ui)
[17:40:10] toolforge_get_versions will compare "one branch" of toolforge-deploy, so I will always see a lot of yellow/red
[17:40:26] yep, if you deployed a lot of things from different branches
[17:41:32] yep, so I'm thinking maybe I should only do one at a time. but even there, what if you're testing a separate update to jobs-api for example, and you run the deploy script on "your" bump branch at the same time as my deploy script?
[17:41:48] I guess in that case seeing something red in toolforge_get_versions would at least show the "conflict"
[17:42:00] and I could check more easily
[17:42:31] the deploy cookbook will complain if someone is deploying at the same time (but it will not if they finished already)
[17:42:37] or you mean using ./deploy.sh?
[17:43:01] no I mean with the cookbook, good to know it will complain
[17:44:09] hmm, builds-builder seems to be in a very old version in toolsbeta
[17:44:45] yep I was also confused by that!
[17:44:52] should I force a deployment from main?
[17:45:00] wmcs-metrics is also old
[17:45:57] helm is weird sometimes, like with calico there, but it should have updated the version of this one, it has changes and everything
[17:46:47] maybe helm was not run on that component for a while?
[17:47:28] we usually run it with the deploy cookbook, but you can merge something without running the cookbook
[17:47:28] this suggests otherwise :/
[17:47:29] https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/701#note_129346
[17:47:48] hm
[17:48:28] can you try to deploy it manually from main? (either with the cookbook or ./deploy.sh from a clone of toolforge-deploy)
[17:48:33] on toolsbeta I mean
[17:49:03] yep let me try
[17:51:16] what's the right way to run toolforge_get_versions manually? I have a clone of toolforge-deploy in toolsbeta-test-k8s-control-10
[17:51:44] utils/toolforge_get_versions.sh
[17:51:54] I tried running as root /home/fnegri/toolforge-deploy/utils/toolforge_get_versions.sh and it's not working
[17:52:03] you run it as your user
[17:52:10] it uses `helm-sudo`
[17:52:18] ah ok
[17:52:26] root does not have kubectl configured iirc
[17:52:31] as my user: Get "http://localhost:8080/version": dial tcp 127.0.0.1:8080: connect: connection refused
[17:52:39] because I also don't have kubectl configured
[17:52:41] hmm, where?
[17:52:45] toolsbeta bastion?
[17:53:00] no control, maybe that's the issue
[17:53:06] toolsbeta-test-k8s-control-10
[17:53:15] yep, you have to run it from the bastion
[17:53:19] (same as the tests)
[17:53:39] ok
[17:54:07] it's working
[17:54:10] now the cookbook
[17:54:19] sudo cookbook wmcs.toolforge.component.deploy --cluster-name toolsbeta --component builds-builder --git-branch main
[17:54:48] sounds good
[17:55:13] it's still on the old version
[17:55:21] hmm, what's the output?
[17:56:18] INFO: deployed builds-builder on toolsbeta from branch main
[17:58:00] let me do a manual ./deploy.sh, though I suspect it will not do much
[17:58:23] maybe helm fails and the error is swallowed up?
[17:58:29] yep
[17:58:32] https://www.irccloud.com/pastebin/tdfrgqo7/
[17:59:07] I think it works, as some config changes are actually being applied (in previous deploys), otherwise the new stuff would fail
[17:59:40] hmm, I think helm just ignores upgrades that have no content, and this one might not have had any new content for a while
[18:00:55] yep, it had no changes to the contents of the chart, just the harbor setup scripts (not used in the chart)
[18:01:04] so it's ok
[18:01:07] annoying though
[18:01:34] (just checked the last few releases https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-builder/-/releases)
[18:02:07] hmm, so we had no content changes since november... surprising
[18:04:06] ah I see
[18:04:16] so helm is just doing a no-op
[18:04:24] yep, and not recording it anywhere
[18:04:34] (helm history does not show it either)
[18:04:37] annoying
[18:05:16] I have to log off, let's catch up tomorrow
[18:06:47] 👍
[18:06:50] me too, cya!
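
Note on the helm no-op above: one way to notice that a deploy changed nothing might be to snapshot `helm history` before and after the run and compare the latest revision numbers. The sketch below is only an illustration under stated assumptions, not what the deploy cookbook or `deploy.sh` does. The function names are hypothetical, and it assumes the `--output json` form of `helm history`, whose entries carry a `revision` field.

```python
import json
import subprocess


def helm_history(release: str, namespace: str) -> str:
    """Fetch a release's history as JSON text (needs helm and cluster access).

    check=True raises CalledProcessError when helm exits non-zero,
    so a failure is not silently swallowed.
    """
    result = subprocess.run(
        ["helm", "history", release, "--namespace", namespace, "--output", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def latest_revision(history_json: str) -> int:
    """Return the highest revision number in the history, or 0 if it is empty."""
    entries = json.loads(history_json)
    return max((int(entry["revision"]) for entry in entries), default=0)


def was_noop(before_json: str, after_json: str) -> bool:
    """True when no new revision was recorded between the two snapshots."""
    return latest_revision(after_json) <= latest_revision(before_json)
```

Usage would look roughly like: call `helm_history(release, namespace)` before the deploy, run the deploy, call it again, and pass both results to `was_noop`. The release name and namespace are whatever the component actually uses; none are assumed here.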