[09:03:52] new version of karma! thanks @denisse (and company) :)
[14:02:45] yw dcaro
[16:27:17] (LogstashKafkaConsumerLag) firing: Too many messages in logging-eqiad for group logstash7-eqiad - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag
[16:32:04] ^^ looks like a spike in log ingest - monitoring
[16:32:17] (LogstashKafkaConsumerLag) resolved: Too many messages in logging-eqiad for group logstash7-eqiad - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag
[16:37:17] (LogstashKafkaConsumerLag) firing: Too many messages in logging-eqiad for group logstash7-eqiad - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag
[16:57:17] (LogstashKafkaConsumerLag) resolved: Too many messages in logging-eqiad for group logstash7-eqiad - https://wikitech.wikimedia.org/wiki/Logstash#Kafka_consumer_lag - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashKafkaConsumerLag
[20:09:39] cwhite I noticed you have a repo for Curator ( https://gitlab.wikimedia.org/repos/sre/curator ), I was going to cut a Debian package using curator v7.0.1 ... any interest and/or advice on the process?
[20:19:28] inflatador: is this repackaging of curator for a spicerack dependency?
[20:20:34] cwhite Y, it will unblock T345337, but we also need it because our version of curator is really old and has been broken on prod ES since before my time
[20:20:35] T345337: spicerack: tox fails to install PyYAML using python 3.11 on bookworm - https://phabricator.wikimedia.org/T345337
[20:21:58] That repo is our fork, which hacked in OpenSearch support. It does this by choosing which client library to use based on an env var setting. It wasn't made with library usage in mind.
[20:23:05] No problem, was just checking to see what's out there already. Back to the "fun" of making a deb pkg, I guess ;P
[20:23:40] you have interesting definitions of fun, inflatador ;-)
[20:28:11] Knowing personally how not-fun it was to extend and build our fork, I wish you good luck. Hopefully the situation has improved since mid-2022.
[20:36:47] inflatador: you might consider the elasticsearch-curator packages from PyPI unless a deb is a hard requirement. It seems that in 2023 Elastic started pushing packages for each major version.
[21:03:24] lmata what can I say, I'm a sick man ;P
[21:06:52] cwhite I think it's a hard req for our installation at least. Might not be for spicerack; I'll check w/ v-olans tomorrow
[21:16:50] (ThanosQueryInstantLatencyHigh) firing: Thanos Query Frontend has high latency for queries. - https://wikitech.wikimedia.org/wiki/Thanos#Alerts - https://grafana.wikimedia.org/d/aa7Rx0oMk/thanos-query-frontend - https://alerts.wikimedia.org/?q=alertname%3DThanosQueryInstantLatencyHigh
[21:17:28] ^ Taking a look.
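As an illustration of the env-var-based client selection described at 20:21:58, here is a minimal sketch of that pattern. It is not the actual code in the sre/curator fork, and the CURATOR_CLIENT variable name is hypothetical.

```python
import os


def get_client(hosts):
    """Pick the search client library based on an environment variable.

    Illustrative only: the real fork's variable name and wiring may differ.
    """
    if os.environ.get("CURATOR_CLIENT", "elasticsearch") == "opensearch":
        from opensearchpy import OpenSearch  # opensearch-py client
        return OpenSearch(hosts=hosts)
    from elasticsearch import Elasticsearch  # elasticsearch-py client
    return Elasticsearch(hosts=hosts)
```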
[21:19:25] I don't see any mitigation steps in our Thanos runbooks for ThanosQueryInstantLatencyHigh alerts.
[21:20:37] However, looking at the graphs, I notice that the p99 latency of requests to thanos-query-frontend is decreasing, so it may resolve on its own.
[21:21:21] The querier cache gets vs. misses graph also shows a decrease in the number of missed requests.
[21:33:20] (ThanosQueryInstantLatencyHigh) resolved: Thanos Query Frontend has high latency for queries. - https://wikitech.wikimedia.org/wiki/Thanos#Alerts - https://grafana.wikimedia.org/d/aa7Rx0oMk/thanos-query-frontend - https://alerts.wikimedia.org/?q=alertname%3DThanosQueryInstantLatencyHigh
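A hedged note on the 21:20:37 observation: one way to spot-check the p99 latency of thanos-query-frontend outside Grafana is to query the standard Prometheus HTTP API. The endpoint URL, job label, and metric name below are assumptions for illustration, not values taken from this log.

```python
import requests

# Assumed Thanos query endpoint and metric/label names -- adjust to the real setup.
THANOS_URL = "https://thanos-query.example.org"
PROMQL = (
    "histogram_quantile(0.99, sum by (le) ("
    'rate(http_request_duration_seconds_bucket{job="thanos-query-frontend"}[5m])))'
)

resp = requests.get(f"{THANOS_URL}/api/v1/query", params={"query": PROMQL}, timeout=10)
resp.raise_for_status()
for series in resp.json()["data"]["result"]:
    ts, value = series["value"]  # instant-vector sample: [timestamp, value-as-string]
    print(f"p99 latency: {float(value):.3f}s")
```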