[13:00:10] Krinkle: mailman is owned by sre collab but yeah the rest is I/F
[13:27:25] hello oncallers
[13:27:47] due to https://phabricator.wikimedia.org/T383114 I'd need to stop puppet fleetwide for a bit
[13:27:58] lemme know if it is a good time, or if there are any blockers
[13:33:27] proceeding
[13:36:01] jayme: o/
[13:36:09] is --^ going to impact your work?
[13:36:48] hmm...I'm doing eqiad only currently. But jelto is doing work in codfw
[13:38:19] it is invasive I know but the postgres replication is broken atm :(
[13:38:26] elukey: should be fine, my cookbooks are finished
[13:38:42] okok
[13:38:45] thanks!
[13:39:04] elukey: yeah...go ahead. reimages break for all kinds of reasons all the time - at least this is a proper reason :)
[13:41:18] Hmmm. apt is telling me that the "puppet" package is unneeded on a host and could be removed. I presume this is not in error?
[13:42:57] it seems an error, yes
[13:43:22] puppet-agent is its own package, so I wasn't sure
[13:43:43] mmm yes it is a transitional package, but we have it everywhere afaics
[13:43:51] is it due to the rocm cleanup?
[13:44:17] only a bit. I basically removed all packages that had files in /opt/rocm*
[13:44:29] But eg. the fakelib packages are separate
[13:44:46] So I ran apt --no-act autoremove and got a bunch of packages including puppet
[13:45:32] https://phabricator.wikimedia.org/P71826
[13:46:02] e.g. the libdrm stuff can probably be removed, but as I said I dunno about puppet
[13:46:21] Description-en: transitional dummy package This is a transitional dummy package. It can safely be removed.
[13:46:33] didn't know it, so I guess you are good
[13:46:41] alright, will give the list another once-over and proceed
[13:47:11] ah yes also Version: 5.5.22-2+deb12u4
[13:47:15] yes yes go ahead
[13:47:30] ---
[13:47:39] backup from puppetdb1003 in progress
[13:51:12] there is no indicator of completion time, so not sure what stage we are at
[13:54:36] postgres up on 2003
[13:55:22] the alert seems gone
[13:56:45] klausman: yeah, on puppet 7 hosts it can be safely removed, we'll clean this out via puppet once the last puppet 5 nodes are gone
[13:59:47] Aye, cap'n
[14:00:36] moritzm: I've run puppet on puppetdb2003 and it worked fine, replication seems ok, I'll re-enable puppet if you are ok
[14:01:59] (I think we are safe to go, re-enabling)
[14:06:07] ack, seems ok to re-enable, I just tested the puppetdb backend of Cumin and it also works fine
[14:06:30] aand done
[14:09:52] elukey: So puppet is enabled again and I can proceed with reimaging hosts :)?
[14:10:03] yep!
[14:10:34] great, thanks for the quick fix
[16:14:48] Hello folks. Heads up: As part of https://phabricator.wikimedia.org/T382953, which runs a reconciliation process, the Data Eng team intends to bump up our Action API usage for the next 2 weeks or so. What is a reasonable rps target for temporary heavy internal use? Would 10K rps be ok? This would mostly be hitting enwiki, commonswiki and wikidatawiki APIs.
[16:18:54] xcollazo: 10k rps is like 3 times the usual traffic patterns for mw-api-int (see grafana dashboard at https://grafana.wikimedia.org/d/t7EiVbdVk/mw-api-int?orgId=1&viewPanel=10&from=now-7d&to=now&var-dc=codfw%20prometheus%2Fk8s&var-service=mediawiki&var-namespace=mw-api-int&var-release=main&var-container_name=All)
[16:19:09] go for an order of magnitude less I'd say and it's probably ok
[16:23:45] akosiaris: ack, thanks.
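Rough arithmetic behind the 16:19 answer above, as a minimal sketch: the only inputs are the requested 10k rps, the "about 3 times the usual traffic" figure, and the "order of magnitude less" suggestion from the conversation; nothing here is measured from the dashboard.

    # Requested burst vs. what was suggested for mw-api-int:
    # usual traffic is roughly a third of the requested 10k rps, and the
    # suggestion was an order of magnitude below the request.
    echo "usual traffic  ~ $((10000 / 3)) rps"    # ~3333 rps baseline
    echo "suggested cap  ~ $((10000 / 10)) rps"   # ~1000 rps temporary target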
[17:25:28] elukey: what's a good default value for profile::puppetdb::database::wal_keep_segments for my cloud-vps puppetdb hosts?
[17:31:04] * andrewbogott opts for a modest 64
[17:49:47] if you have replicas I'd personally set it higher, it'll use a maximum of wal_keep_segments * 16 MB of space and it's definitely a case where having too few will be painful and having too many will be something you never notice
[17:55:34] looks like it was defaulting to 128 before so I'm going with that until we have a problem
[17:55:36] thx
[18:45:04] andrewbogott: o/ I thought I had taken care of defaults via the profile's defaults, but I forgot that I needed to set it for cloud as well. Sorry, hope I haven't broken too many things
[18:45:30] Definitely not a big deal, I just added that value to a few projects.
[18:47:01] https://gerrit.wikimedia.org/r/c/operations/puppet/+/1108798 seems also fine
[18:47:09] so it defaults to 128
[18:47:20] cc: dcaro: (thanks for the fix)
[19:33:07] inflatador: I'm trying to grasp the status of the cloudelastic cluster; it looks like you've had some recent knowledge/involvement, are you still the one to ask?
[19:33:36] I see that cloudelastic1005 and cloudelastic1006 are due for decom but still installed and running... are they now obsolete, or did we accidentally forget to order replacement hardware, or...?
[19:41:09] ah! The new hardware is https://phabricator.wikimedia.org/T378368 with no assignee :)
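For the wal_keep_segments discussion at 17:25-17:55 above, the disk-usage reasoning works out as below. A minimal sketch: the 16 MB-per-segment figure and the 64/128 candidate values come straight from the conversation; the actual segment size can differ on clusters initialised with a non-default WAL segment size.

    # Retained WAL is capped at roughly wal_keep_segments * 16 MB:
    echo "64 segments  -> $((64 * 16)) MB"    # 1024 MB, ~1 GB ceiling
    echo "128 segments -> $((128 * 16)) MB"   # 2048 MB, ~2 GB ceiling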