[03:10:40] FIRING: LogstashIndexingFailures: Logstash Elasticsearch indexing errors - https://wikitech.wikimedia.org/wiki/Logstash#Indexing_errors - https://grafana.wikimedia.org/d/000000561/logstash?viewPanel=40&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashIndexingFailures
[03:15:40] RESOLVED: LogstashIndexingFailures: Logstash Elasticsearch indexing errors - https://wikitech.wikimedia.org/wiki/Logstash#Indexing_errors - https://grafana.wikimedia.org/d/000000561/logstash?viewPanel=40&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashIndexingFailures
[03:19:40] FIRING: LogstashIndexingFailures: Logstash Elasticsearch indexing errors - https://wikitech.wikimedia.org/wiki/Logstash#Indexing_errors - https://grafana.wikimedia.org/d/000000561/logstash?viewPanel=40&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashIndexingFailures
[03:24:40] RESOLVED: LogstashIndexingFailures: Logstash Elasticsearch indexing errors - https://wikitech.wikimedia.org/wiki/Logstash#Indexing_errors - https://grafana.wikimedia.org/d/000000561/logstash?viewPanel=40&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashIndexingFailures
[14:39:27] FIRING: ThanosCompactHalted: Thanos Compact has failed to run and is now halted. - https://wikitech.wikimedia.org/wiki/Thanos#Alerts - https://grafana.wikimedia.org/d/651943d05a8123e32867b4673963f42b/thanos-compact - https://alerts.wikimedia.org/?q=alertname%3DThanosCompactHalted
[14:52:33] Feb 26 14:33:43 titan2001 thanos-compact[1673644]: level=error ts=2025-02-26T14:33:43.92888568Z caller=compact.go:487 msg="critical error detected; halting" err="compaction: group 300000@15871545881801114525: compact blocks [/srv/thanos-compact/compact/300000@15871545881801114525/01JMPP46C7B4YA17SD486FYECH /srv/thanos-compact/compact/300000@15871545881801114525/01JM5936FMPW3HXHWCCRXG63A0]: 3 errors: populate block: write chunks:
[14:52:33] preallocate: no space left on device; sync /srv/thanos-compact/compact/300000@15871545881801114525/01JN141M38D4683ZZYQN8JNJDG.tmp-for-creation/chunks/000329: file already closed; write /srv/thanos-compact/compact/300000@15871545881801114525/01JN141M38D4683ZZYQN8JNJDG.tmp-for-creation/index_tmp_p: no space left on device"
[14:52:43] although at the moment the fs looks 50% free
[15:11:40] FIRING: LogstashIndexingFailures: Logstash Elasticsearch indexing errors - https://wikitech.wikimedia.org/wiki/Logstash#Indexing_errors - https://grafana.wikimedia.org/d/000000561/logstash?viewPanel=40&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashIndexingFailures
[15:16:40] RESOLVED: LogstashIndexingFailures: Logstash Elasticsearch indexing errors - https://wikitech.wikimedia.org/wiki/Logstash#Indexing_errors - https://grafana.wikimedia.org/d/000000561/logstash?viewPanel=40&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashIndexingFailures
[17:11:40] FIRING: LogstashIndexingFailures: Logstash Elasticsearch indexing errors - https://wikitech.wikimedia.org/wiki/Logstash#Indexing_errors - https://grafana.wikimedia.org/d/000000561/logstash?viewPanel=40&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashIndexingFailures
[17:15:20] Hi o11y folks, I'd like your help migrating thanos-query and thanos-web LB VIPs to IPIP encapsulation and maglev as part of the ongoing migration to liberica. Relevant tasks are T387291 && T387292. i've already submitted the CRs https://gerrit.wikimedia.org/r/q/topic:%22T387291%22, I'd need your help reviewing those CRs and coordinating the migration with me, I can happily take care of all the on-hands work
[17:15:20] T387291: Migrate thanos-query LB VIPs to IPIP encapsulation - https://phabricator.wikimedia.org/T387291
[17:15:21] T387292: Migrate thanos-web LB VIPs to IPIP encapsulation - https://phabricator.wikimedia.org/T387292
[17:16:40] RESOLVED: [2x] LogstashIndexingFailures: Logstash Elasticsearch indexing errors - https://wikitech.wikimedia.org/wiki/Logstash#Indexing_errors - https://alerts.wikimedia.org/?q=alertname%3DLogstashIndexingFailures
[17:49:44] Thanks vgutierrez , I'll look at the tasks and patches.
[17:50:08] thx
[17:59:38] I'll proceed tomorrow eu morning after pinging you folks. Thx for the review denisse
[18:39:27] FIRING: ThanosCompactHalted: Thanos Compact has failed to run and is now halted. - https://wikitech.wikimedia.org/wiki/Thanos#Alerts - https://grafana.wikimedia.org/d/651943d05a8123e32867b4673963f42b/thanos-compact - https://alerts.wikimedia.org/?q=alertname%3DThanosCompactHalted
[18:54:27] RESOLVED: ThanosCompactHalted: Thanos Compact has failed to run and is now halted. - https://wikitech.wikimedia.org/wiki/Thanos#Alerts - https://grafana.wikimedia.org/d/651943d05a8123e32867b4673963f42b/thanos-compact - https://alerts.wikimedia.org/?q=alertname%3DThanosCompactHalted
[21:58:27] FIRING: ThanosCompactHalted: Thanos Compact has failed to run and is now halted. - https://wikitech.wikimedia.org/wiki/Thanos#Alerts - https://grafana.wikimedia.org/d/651943d05a8123e32867b4673963f42b/thanos-compact - https://alerts.wikimedia.org/?q=alertname%3DThanosCompactHalted
[22:24:43] FIRING: BenthosKafkaConsumerLag: Too many messages in jumbo-eqiad for group benthos-webrequest-sampled-live-franz - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=jumbo-eqiad&var-datasource=eqiad%20prometheus/ops&var-consumer_group=benthos-webrequest-sampled-live-franz - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag
[22:29:43] RESOLVED: BenthosKafkaConsumerLag: Too many messages in jumbo-eqiad for group benthos-webrequest-sampled-live-franz - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=jumbo-eqiad&var-datasource=eqiad%20prometheus/ops&var-consumer_group=benthos-webrequest-sampled-live-franz - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag