[16:19:25] FIRING: [2x] SystemdUnitFailed: burrow-logging-codfw.service on kafkamon2003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:19:51] ^ This may be related to the burrow change I was working on, taking a look.
[16:29:25] FIRING: [3x] SystemdUnitFailed: burrow-jumbo-eqiad.service on kafkamon1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[16:33:29] I sent a patch to fix the issue, and am merging and applying it.
[16:39:49] My hotfix didn't work, I'm reverting both patches to end the outages while I continue debugging the issue.
[16:44:25] RESOLVED: [3x] SystemdUnitFailed: burrow-jumbo-eqiad.service on kafkamon1003:9100 - https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state - https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status - https://alerts.wikimedia.org/?q=alertname%3DSystemdUnitFailed
[19:11:43] FIRING: BenthosKafkaConsumerLag: Too many messages in logging-eqiad for group benthos-mw-accesslog-metrics - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag
[19:16:43] FIRING: [2x] BenthosKafkaConsumerLag: Too many messages in logging-eqiad for group benthos-mw-accesslog-metrics - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag
[21:46:43] FIRING: [2x] BenthosKafkaConsumerLag: Too many messages in logging-eqiad for group benthos-mw-accesslog-metrics - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag
[21:56:43] FIRING: [2x] BenthosKafkaConsumerLag: Too many messages in logging-eqiad for group benthos-mw-accesslog-metrics - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag
[23:21:43] FIRING: [2x] BenthosKafkaConsumerLag: Too many messages in logging-eqiad for group benthos-mw-accesslog-metrics - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag
[23:31:43] FIRING: [2x] BenthosKafkaConsumerLag: Too many messages in logging-eqiad for group benthos-mw-accesslog-metrics - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=logging-eqiad&var-datasource=eqiad%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag