[05:57:30] sukhe: 10+ to the motd of doh hosts :D
[06:12:36] <_joe_> 🤦
[06:12:52] <_joe_> I approve indeed
[06:19:00] trying to fine tune some main-codfw topics, eqiad.change-prop.transcludes.resource-change gets 500 msg/s and runs with a single partition, I'd spread the load to at least 3
[06:19:23] (4th high traffic topic afaics)
[06:25:54] (will do it in a sec)
[06:26:32] <_joe_> elukey: there is no reason to really fine-tune things unless we can automate that
[06:29:17] _joe_ I think that we could adopt the rule that after a certain threshold (like 500 msg/s) topics must have at least 3 partitions, it is a starting point to balance the brokers (2002, the partition leader, still gets way more inbound traffic than the others). Automating that may be possible, or even creating an alarm that pings us when topics reach a certain critical mass
[06:29:43] <_joe_> sure
[08:34:22] work on kafka main-eqiad completed for today, the two new brokers are getting more traffic (but we'll need to follow up with more partition moves during the next days)
[08:36:45] elukey: well done and thanks <3
[08:37:08] I know this is payment for the work I did on the helm stuff, consider that debt paid :)
[08:39:43] ahahha thanks
[08:40:15] some payment (I call it karma points :D) also goes to jayme
[08:41:21] I'd rather like mine paid in future istio upgrades, please :-p
[08:45:52] I'm planning to ditch metrics in graphite not touched for >= 3y in https://gerrit.wikimedia.org/r/c/operations/puppet/+/730427
[08:56:30] jayme: you are an expensive shop :D
[10:01:54] godog: looking at Swift replication again (everyone needs a hobby); dispertion in both codfw and eqiad is 100%, ms-fe2005 reports completions 09:50-10:00 today, so that's all good and I can add more load to ms-be2045; ms-fe1005 has 08:30 on the 11th to now (and the oldest host has syslog output from swift-object-replicator recently, so presumably still working), so that cluster is still rebalancing. Is that right? How far behind is worrying?
[10:05:47] Emperor: yeah that's correct, eqiad has finished with dispersion but not replication cycle yet, re: worrying probably in the order of 10d I'd say, depending on how big the rebalance was too
[10:07:01] TY, I'll put some more weight onto ms-be2045 shortly
[10:07:45] cheers
[10:17:19] * effie read desperation vs dispertion
[10:48:06] elukey: thanks :D
[12:59:38] * topranks errand to run, back in an hour or so