[00:16:47] The network probes issue seems to have resolved around 17:07Z, coinciding with the idp Tomcat restart.
[00:52:40] FIRING: LogstashIndexingFailures: Logstash Elasticsearch indexing errors - https://wikitech.wikimedia.org/wiki/Logstash#Indexing_errors - https://grafana.wikimedia.org/d/000000561/logstash?viewPanel=40&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashIndexingFailures
[00:57:40] RESOLVED: LogstashIndexingFailures: Logstash Elasticsearch indexing errors - https://wikitech.wikimedia.org/wiki/Logstash#Indexing_errors - https://grafana.wikimedia.org/d/000000561/logstash?viewPanel=40&var-datasource=codfw%20prometheus/ops - https://alerts.wikimedia.org/?q=alertname%3DLogstashIndexingFailures
[08:19:29] hey, people, I am observing some weird behaviour
[08:20:14] I see prometheus100[56] attempting to connect to tcp ganeti1017:1811, but that port is not open
[08:20:44] https://logstash.wikimedia.org/goto/6101c60e2e292ab1520cd9091b534c42
[08:21:26] This was causing probe alerts - but I see those are not happening anymore (?)
[08:24:23] maybe this is not obs, but a service owner's misconfiguration
[08:24:40] but maybe you can give me some pointers
[08:25:48] jynus: https://phabricator.wikimedia.org/rOPUPde21a79eedbba78093a37d71f9574aa44a53029a it's being decommissioned or worked on by moritzm
[08:25:57] I see
[08:26:09] same for cassandra: https://phabricator.wikimedia.org/T380236
[08:26:16] (restbase)
[08:26:55] I'm going to downtime those probes for a few hours
[08:29:09] ganeti1017 is some alert spam which happens if a Ganeti node is removed from the active cluster(s) for decom; they resolve within a few minutes
[08:30:31] ok, no worries. I was just checking those because yesterday other probes had worse consequences
[08:30:58] So I thought they could be false alerts
[13:20:43] FIRING: BenthosKafkaConsumerLag: Too many messages in jumbo-eqiad for group benthos-webrequest-sampled-live-franz - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=jumbo-eqiad&var-datasource=eqiad%20prometheus/ops&var-consumer_group=benthos-webrequest-sampled-live-franz - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag
[14:30:43] RESOLVED: BenthosKafkaConsumerLag: Too many messages in jumbo-eqiad for group benthos-webrequest-sampled-live-franz - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=jumbo-eqiad&var-datasource=eqiad%20prometheus/ops&var-consumer_group=benthos-webrequest-sampled-live-franz - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag
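For reference, the kind of failing probe discussed at 08:20 (Prometheus connecting to a closed TCP port) can be confirmed directly against the Prometheus HTTP query API; probe_success is the standard blackbox-exporter metric. The sketch below is illustrative only: the prometheus.example.org endpoint and the instance pattern are placeholder assumptions, not the actual production hosts.

import json
import urllib.parse
import urllib.request

PROMETHEUS = "http://prometheus.example.org:9090"  # placeholder endpoint, not the real host

def failing_probes(instance_pattern: str) -> list:
    """Return label sets of blackbox probes that are currently failing (probe_success == 0)."""
    query = 'probe_success{instance=~"%s"} == 0' % instance_pattern
    url = PROMETHEUS + "/api/v1/query?" + urllib.parse.urlencode({"query": query})
    with urllib.request.urlopen(url) as resp:
        result = json.load(resp)
    return [series["metric"] for series in result["data"]["result"]]

# Example: list failing probes for the host discussed above.
for labels in failing_probes("ganeti1017.*"):
    print(labels.get("instance"), labels.get("module"))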
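Likewise, "downtiming those probes for a few hours" (08:26:55) amounts to creating a temporary silence in Alertmanager. A minimal sketch against the standard Alertmanager v2 silences API follows; the alertmanager.example.org endpoint, the ProbeDown alert name, and the matcher labels are assumptions for illustration, not the exact production values.

import datetime
import json
import urllib.request

ALERTMANAGER = "http://alertmanager.example.org:9093"  # placeholder endpoint, not the real host

def silence_probes(instance_pattern: str, hours: int, author: str, comment: str) -> str:
    """Create a temporary Alertmanager silence for probe alerts matching the given instances."""
    now = datetime.datetime.now(datetime.timezone.utc)
    payload = {
        "matchers": [
            {"name": "alertname", "value": "ProbeDown", "isRegex": False},    # assumed alert name
            {"name": "instance", "value": instance_pattern, "isRegex": True}, # assumed label
        ],
        "startsAt": now.isoformat(),
        "endsAt": (now + datetime.timedelta(hours=hours)).isoformat(),
        "createdBy": author,
        "comment": comment,
    }
    req = urllib.request.Request(
        ALERTMANAGER + "/api/v2/silences",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["silenceID"]

# Example: silence probe alerts for the hosts discussed above for a few hours.
print(silence_probes("ganeti1017.*", hours=4, author="jynus", comment="decom in progress / T380236"))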