[08:00:20] Cteam: welcome to today 🦄! Don’t forget to post your update in thread.
[08:00:20] Feel free to include:
[08:00:20] 1. 🕫 Anything you'd like to share about your work
[08:00:20] 2. ☏ Anything you'd like to get help with
[08:00:20] 3. ⚠ Anything you're currently blocked on
[08:00:20] (this message is from a toolforge job under the admin project)
[08:48:14] I have gone through a bunch of phab tickets related to CLIs, and also making APIs easier to call, and left some comments with thoughts
[08:48:14] Of note, this led to me adding "curl" commands to the builds, envvars and jobs commands, so you can make arbitrary requests
[08:48:14] I also made a kubernetes command that allows arbitrary authenticated requests to the k8s api too
[08:48:14] e.g. locally I can do: tf k curl --tool wikicrowd /api/v1 --json
[08:48:57] I'll likely start getting CI to build binaries in the coming days so that it's easier for folks to give the CLI a go
[13:42:09] Done:
[13:42:11] * resolved T362868 toolforge k8s upgrade
[13:42:13] ** upgraded lima-kilo
[13:42:15] ** updated the wiki at https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Ongoing_Efforts/Toolforge_Upgrade_Workgroup/Upgrades_Overview
[13:42:17] * T383723 clouddumps1001 temperature issues: attempted shut down
[13:42:19] ** created subtask T391369 to track the issues I observed
[13:42:21] * jobs-api code reviews (!153 is now merged)
[13:42:23] Working on:
[13:42:25] * more jobs-api code reviews
[13:42:27] * restarting work on some lower-priority tasks:
[13:42:29] ** T374953 replace wmcs-wikireplica-dns.py with tofu
[13:42:31] ** T385885 [toolsdb] Remove apt pinning
[13:50:55] Yesterday:
[13:50:55] * Worked more on magnum flakiness; it's working reasonably well for me in codfw1dev
[13:50:55] * Replaced cloudcontrol1005 with cloudcontrol1011
[13:50:55] Today:
[13:50:55] * Reviewing hardware tasks/spreadsheets
[13:50:55] * Getting new test ceph drive online T390134
[21:20:19] andrewbogott: random datapoint for you--I destroyed and recreated the deployment-prep Magnum cluster yesterday. Destroying was an adventure that actually needed manual bits via Horizon; Tofu kept timing out in several of the delete operations. Recreating "just worked" using the values previously coded into my Magnum template. I didn't try bumping to any newer version.
[21:22:07] bd808: My current theory is that those intermittent failures were because of running out of DB connections. I got some good clean runs after merging this silly change: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1135451
[21:22:48] Also, interesting, possibly good news: the magnum devs have given up on heat and written an entirely new driver, which is available in the next release.