[07:40:10] moritzm: if you don't mind a sanity check on https://gerrit.wikimedia.org/r/c/operations/puppet/+/927970 so it unblocks a pending commit on the private repo?
[07:40:52] I believe icinga parsing will fail if that is not deployed first
[07:41:21] thanks, +1d
[07:41:33] sorry for involving you in this
[07:45:36] no problem at all :-)
[07:48:21] I committed the private patch, in case you want to review that too, will make sure the icinga puppet run still works properly
[07:48:52] being careful as last time I broke some notifications by unintentionally adding an extra \n
[07:50:20] the commit in the private repo also looks good!
[07:50:48] thank you for the double check!
[07:51:40] FYI I'm going ahead shortly with bumping the cadvisor rollout to 20% of eqiad/codfw https://gerrit.wikimedia.org/r/c/operations/puppet/+/927972
[07:52:34] diff on alert1001 also looking good, so expecting no more delivery failures now
[07:58:52] 10
[07:58:55] err :)
[07:58:57] morning :)
[08:02:59] Just bumped the retention for webrequest_sampled_live to 15 days; I checked and the 8 days seemed to work perfectly
[08:03:10] if you see any issue please report it in T337460
[08:03:10] T337460: Increase webrequest_sampled_live Druid datasource's retention - https://phabricator.wikimedia.org/T337460
[08:06:06] FYI, there's a regression in the Debian LTS update for ruby2.5 which breaks Puppet on Buster hosts, Taavi opened https://phabricator.wikimedia.org/T338294
[08:07:04] in production we don't have unattended-upgrades, but this would affect potential reimages of buster systems, I'm currently building a package which unbreaks this for apt.wikimedia.org
[08:07:27] buster reimages are rare at this point, but if you plan for some, please wait a bit
[08:09:14] moritzm: ha. yes, we just got caught by that in fr-tech land.
[08:13:58] does fr-tech use apt.wikimedia.org or is that not an option due to the PCI stuff? I'll upload ruby 2.5.5-3+deb10u5+wmf1 to buster-wikimedia in a bit
[08:14:43] we don't.
[08:16:12] i'm not certain that we couldn't, just know that we don't currently.
[08:17:17] I also reported the regression to the LTS folks, a fixed package should also appear via buster-security today
[08:17:29] great.
[08:17:40] we'll be on the lookout for it.
[08:17:56] but for now, i should be looking for sleep. have a good one.
[08:24:52] ruby 2.5.5-3+deb10u5+wmf1 uploaded to apt.wikimedia.org, so Buster reimages will work fine again
[08:29:30] <_joe_> Emperor: do you happen to remember what the timeout is in the 404 handler that swift uses to get thumbnails from thumbor?
[08:51:51] I'm not sure one is explicitly configured, let me eyeball the code again
[08:56:32] _joe_: the default python3 timeout for urllib, which I think is "no timeout"
[08:56:45] <_joe_> Emperor: thanks
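A minimal sketch of the urllib timeout behaviour mentioned in the exchange above, assuming a hypothetical thumbor URL; this is not the actual code path of swift's 404 handler. Without an explicit timeout, urlopen() falls back to the global socket default, which is unset unless socket.setdefaulttimeout() has been called, i.e. effectively "no timeout".

```python
import urllib.request

# Hypothetical thumbor endpoint, for illustration only -- not the real
# URL or rewrite middleware used by swift's 404 handler.
THUMB_URL = "https://thumbor.example.org/thumb/Foo.jpg/200px-Foo.jpg"

def fetch_thumb(url, timeout=None):
    """Fetch a thumbnail over HTTP.

    With timeout=None, urlopen() uses the global socket default, which is
    "no timeout" unless socket.setdefaulttimeout() was called, so the call
    can block indefinitely on an unresponsive backend.
    """
    if timeout is None:
        return urllib.request.urlopen(url).read()
    # An explicit per-request timeout (in seconds) bounds how long the
    # fetch may hang before a timeout error is raised.
    return urllib.request.urlopen(url, timeout=timeout).read()

if __name__ == "__main__":
    try:
        fetch_thumb(THUMB_URL, timeout=10)
    except OSError as exc:  # urllib.error.URLError and socket.timeout are OSError subclasses
        print(f"thumbnail fetch failed: {exc}")
```

If the handler really does rely on the urllib default, an unresponsive thumbor could tie up a request indefinitely, which is presumably why the question came up.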
[13:07:15] duesen: when would you like to deploy 927758?
[13:08:31] I will be afk after 16:00Z, but anyone from serviceops being around should be enough
[13:09:12] effie: I'm in a meeting until the half hour
[13:09:42] cheers, ping me
[13:16:49] _joe_: are you going to be around for monitoring/reverting later?
[13:27:21] effie: o/
[13:27:25] let's do it?
[13:29:12] yeah go ahead please
[13:52:04] duesen: joe won't be but I'm here and I think so is akosiaris
[13:53:39] I'll be around
[14:01:42] fyi: I'm (still) planning to upgrade Cassandra on sessionstore in codfw today (after de-pooling it of course). if no one objects, I'll start in an hour(?) /cc cwhite claime jynus
[14:02:20] urandom: No obvious problems foreseen?
[14:02:43] ok, no issue. there seems to be some jobqueue deployment thingies right now
[14:02:47] claime: none that are obvious :)
[14:03:29] jynus: do you think we should wait until there are no more thingies?
[14:03:47] not at the moment, but let's see how that goes
[14:03:53] 👍
[14:04:01] I think it should be very independent in any case
[15:00:02] Ok, I'm going to move forward with the sessionstore Cassandra upgrade
[15:00:17] (if there are no objections...)
[15:00:49] I don't think it's going to conflict with the situation we're monitoring anyways
[15:01:05] yeah
[15:01:35] 👍
[15:01:44] the question is whether you have enough backup, as technically I finished my oncall now
[15:02:45] yesterday some people in the next tz were a bit ill
[15:05:49] feel free to call me if you need more hands
[15:08:30] knock on wood, but even if things go horribly wrong, it shouldn't have impact (we're de-pooled), and if things go too unexpectedly south, I'll just roll back
[15:11:54] yeah, not worried about the changes you are doing
[15:12:15] but about leaving my workmates on call on their own with too much (potential) work
[15:16:53] thanks, I appreciate that!
[16:16:52] Ok, the upgrade of sessionstore in codfw is done (very anticlimactic), but I'm going to leave it de-pooled for a bit while I generate some test traffic against it.
[16:41:38] moritzm: while you're thinking about puppet/ruby packaging things, there's a missing dependency in the puppet package for bookworm: https://phabricator.wikimedia.org/T338195#8906978
[16:41:57] I can make a proper bug someplace if you like :)
[16:56:01] andrewbogott: in bookworm, it's just a transitional package that Depends: puppet-agent (which in turn Depends: on that ruby package), I think? https://packages.debian.org/bookworm/puppet
[16:57:32] Emperor: yeah, puppet-agent does. But I think we're going to be using the 'puppet' package for a while... it's what cloud-init installs, for example
[16:59:59] OK, but correct dependency resolution should have pulled in ruby-sorted-set
[17:00:16] because puppet Depends: puppet-agent, which Depends: ruby-sorted-set
[17:00:31] so if it didn't end up installed, the package install system has a bug
[17:17:39] ok, I'll recheck when I have a chance
[19:13:15] I'm repooling codfw sessionstore now
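A rough sketch of the "generate some test traffic before repooling" step mentioned at 16:16, under heavy assumptions: the endpoint below is invented (it is not sessionstore's/Kask's real API), and the actual depool/repool is done through the usual conftool workflow rather than anything shown here; the point is simply to hammer the de-pooled service with bounded-timeout requests and count failures before declaring it safe.

```python
import time
import urllib.request

# Invented endpoint for illustration -- not sessionstore's (Kask) real API.
URL = "https://sessionstore.svc.codfw.example.org/healthz"

def smoke_test(url, attempts=100, timeout=5, max_failures=0):
    """Issue a burst of GETs with explicit timeouts and count failures."""
    failures = 0
    for _ in range(attempts):
        try:
            urllib.request.urlopen(url, timeout=timeout).read()
        except OSError:  # covers URLError, HTTPError and socket timeouts
            failures += 1
        time.sleep(0.05)  # gentle pacing: this is a smoke test, not a load test
    return failures <= max_failures

if __name__ == "__main__":
    print("looks OK to repool" if smoke_test(URL) else "do NOT repool yet")
```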