[12:13:50] folks, nothing important, but I was trying to move the "RPKI" dashboard in Grafana into the "SRE Netops" folder; every time I do it, it seems to have worked, but when I refresh it's back in "General"
[12:29:43] FIRING: BenthosKafkaConsumerLag: Too many messages in jumbo-eqiad for group benthos-webrequest-sampled-live-franz - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=jumbo-eqiad&var-datasource=eqiad%20prometheus/ops&var-consumer_group=benthos-webrequest-sampled-live-franz - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag
[12:41:03] !restarting gnmic.service on netflow1002
[13:38:29] topranks: not sure what's going on off the top of my head, I'm looking at something else right now and will take a look later
[13:38:59] godog: thanks
[13:39:01] I'm investigating the BenthosKafkaConsumerLag alert above; looks like centrallog2002 has been high on system CPU usage for a couple of days
[13:39:35] btw I'm looking at the gnmic stuff since I merged my patch. it's working fine in magru but stats stopped in eqiad, not exactly sure what's wrong (a manual scrape is taking 13 seconds, so that should be ok)
[13:39:44] if I can't work it out shortly I'll revert my patch
[13:40:53] topranks: ack
[13:45:08] looks like mtail freaked out at some point around Jan 22 00:40, and the excessive CPU was slowing down benthos, causing the lag
[13:45:18] should be recovering shortly
[13:59:43] RESOLVED: BenthosKafkaConsumerLag: Too many messages in jumbo-eqiad for group benthos-webrequest-sampled-live-franz - TODO - https://grafana.wikimedia.org/d/000000484/kafka-consumer-lag?var-cluster=jumbo-eqiad&var-datasource=eqiad%20prometheus/ops&var-consumer_group=benthos-webrequest-sampled-live-franz - https://alerts.wikimedia.org/?q=alertname%3DBenthosKafkaConsumerLag