[01:19:15] * bd808 off
[09:14:49] morning
[09:17:11] morning
[09:18:14] o/
[09:31:08] morning
[09:50:14] what do you think about these 2 patches? https://gerrit.wikimedia.org/r/c/operations/puppet/+/1005764
[09:59:41] lgtm, can you run pcc on the last one?
[09:59:46] just to make sure there's no changes
[10:00:20] * dcaro is getting lost in playing with prometheus/grafana stuff
[10:12:40] I think there is already one PCC run? the experimental one
[10:14:30] I did not notice xd
[10:17:44] got a question, otherwise looks ok
[10:19:13] hmm, maybe the requires of the file directly could be replaced by requiring the ceph class that creates the file instead?
[10:19:37] that way there's no indirect dependency
[10:20:23] (if it already does that's ok, I'm not sure just reading the patches, will have to check the repo)
[10:22:14] mmm, that may be indeed a good idea, let me explore that
[10:25:06] I think that currently the common node is in the roles directly (so maybe adding the dependency there?) role::wmcs::openstack::*::virt*
[10:25:51] it's where we have the rbd_libvirt profile and the profile::openstack::*::nova::compute::service
[10:26:18] is it allowed to declare a dependency like that at the role level?
[10:26:21] maybe not, as it should be inside profile::openstack::eqiad1::nova::compute::service that it does depend on the ceph::config somehow
[10:26:49] (so if we use it somewhere else it will complain or include it)
[10:27:08] I think it might not be yep xd
[10:47:48] thanks, will send an update shortly
[12:12:45] arturo: are you still planning to upgrade the last k8s control plane node today?
[12:13:03] taavi: no, my plan is doing it next monday
[12:13:08] ok
[12:13:16] I don't want to disturb the control plane on friday
[12:13:19] gotcha
[12:13:35] do you want to talk about the 1.24 upgrade now or wait until the control plane os has been upgraded?
[12:13:52] happy to talk now
[12:13:53] s/talk/think/
[12:13:58] happy to think now
[12:14:10] cool
[12:14:29] (cc Raymond_Ndibe who wanted to follow along too)
[12:14:53] https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes/Upgrading_Kubernetes is fairly up-to-date, I recommend starting from reading it
[12:16:41] * arturo reading
[12:16:43] you should start from reading the k8s changelog and checking that all the software deployed in the cluster supports 1.24
[12:18:51] the phabricator task about this upgrade is T307651
[12:18:52] T307651: Upgrade Toolforge Kubernetes to version 1.24 - https://phabricator.wikimedia.org/T307651
[12:19:03] ack
[12:24:57] I think there are 2 new cookbooks compared to the last time I did this
[12:25:00] will read them
[12:28:03] yes. the prepare update is new new, the upgrade worker one is a conversion of your script in puppet.git to a proper cookbook
[12:28:39] ok
[12:37:46] ok, I've read all the code, and I think I understand what is going on
[12:52:09] 1.24 added this `suspend` feature to the jobs API https://github.com/kubernetes/enhancements/issues/2232
[12:52:21] may be interesting for us in the toolforge jobs framework
[12:59:27] so I guess we could explore the upgrade to 1.24 in toolsbeta next week
[14:15:55] andrewbogott: if you have a moment, https://gitlab.wikimedia.org/repos/cloud/toolforge/disable-tool/-/merge_requests/15
[14:16:03] (sent a few patches to that repo, but this one is the most urgent)
[14:47:56] * dcaro back
[14:50:35] what are the use cases for the `suspend` feature?
[14:51:14] it seems quite low-level ("to be used by future orchestrators")
[15:39:05] dcaro: if `suspend` is set to true, then all pods for a job are deleted. But the job stays defined. This allows some higher-level abstraction (i.e., toolforge jobs framework) to introduce additional orchestration semantics
[15:39:20] I think it is just a low level primitive, to allow higher level semantics
[15:39:56] that's what I meant yes, so maybe not something we should expose to users unless there's some use-case for them
[15:40:04] I don't think I have seen a lot of people requesting this anyway
[15:40:22] might be used by tekton though xd
[15:41:15] s/a lot/any/s even
[16:34:01] * dcaro off
[16:34:03] cya on monday
[16:34:42] o/
[17:01:47] * arturo offline
[19:18:00] * bd808 lunch