[04:24:58] Done:
[04:24:58] • attended to all code reviews
[04:24:58] Doing:
[04:24:59] • resumed work on https://phabricator.wikimedia.org/T359650 ([jobs-api] save business models in a DB)
[04:24:59] Blockers:
[04:24:59] • https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-cli/-/merge_requests/81
[04:25:00] • https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/136
[04:25:00] • https://gitlab.wikimedia.org/repos/cloud/toolforge/jobs-api/-/merge_requests/119
[08:00:23] Cteam: welcome to today 🦄! Don’t forget to post your update in thread.
[08:00:23] Feel free to include:
[08:00:23] 1. 🕫 Anything you'd like to share about your work
[08:00:23] 2. ☏ Anything you'd like to get help with
[08:00:23] 3. ⚠ Anything you're currently blocked on
[08:00:23] (this message is from a toolforge job under the admin project)
[09:26:04] Done:
[09:26:04] * [alerts] Fixed the integration between metricsinfra/production karma, for some reason the network between metricsinfra prometheus and one of the alertmanagers was not working (blocked port 9093, but not 9001, a restart of the alertmanager vm did the trick) - T384200
[09:26:04] * [lima-kilo] Fixed the runs with the latest limactl (>1.0) and the LIMA_DATA_* warnings
[09:26:04] Doing:
[09:26:04] * [components-cli] Working on getting it installed by default on lima-kilo, I have to rename the packages (tools-components-cli vs toolforge-components-cli)
[09:26:04] * [toolforge] gave a small review to the toolforge planning miro board, still on it https://miro.com/app/board/uXjVLti6_Lw=/
[09:26:05] * [jobs-api] gave some thought to the refactor with Raymond_Ndibe (https://docs.google.com/document/d/1hkUMuJ3JFszCWqGkA1vfLoQKnae4YND29oORppN7lKo/edit?tab=t.0)
[09:26:06] * [ceph,QoS] have to finish up the patches for persisting the config (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1109454), and see if we can add a new host to test the load
[09:26:06] * [jobs-api] started looking into the 137 errors that were reported (T382865), it's due to the workers getting out of disk space locally, fixed manually for now and spawned a couple tasks to fix permanently
[09:26:06] * [jobs-emailer] merge and deploy the patches adding monitoring, there's more fixes needed after
[09:26:07] * [toolforge,bastion container] I have to retake this, now that we can load configs from the environment for toolforge clis
[09:26:08] * [planning] will review the Q3 planning doc + Q1/Q2 boards once the toolforge plans are a bit clearer
[09:26:21] * [ceph,grafana/gnmi] have to migrate the last graph, we might not have the data for drops on cloudsws
[09:26:27] Blockers:
[09:26:32] * [cookbooks,ceph] sent a patch to allow answering "all" after a bit when restarting osd daemons https://gerrit.wikimedia.org/r/c/cloud/wmcs-cookbooks/+/1112754
[09:26:42] * Time 🕐
[12:16:29] Working on:
[12:16:31] * T382961 Kernel error metrics
[12:16:33] * T384293 [wmcs-cookbooks] Add owner property
[12:16:35] Added to backlog (feel free to take it):
[12:16:37] * T384296 [wmcs-cookbooks] Remove redundant SAL logging