[11:35:29] I've put together this dashboard to showcase the cadvisor per-unit mem/cpu metrics, let me know what you think https://grafana.wikimedia.org/d/lxIVOKq4k/units-resource-usage-overview
[11:59:30] Very cool
[12:56:13] folks, restarts of kafka logging clusters completed!
[12:56:14] https://grafana.wikimedia.org/d/000000027/kafka?forceLogin&from=now-3h&orgId=1&to=now&var-datasource=thanos&var-kafka_cluster=logging-eqiad&var-cluster=logstash&var-kafka_broker=All&var-disk_device=All&viewPanel=75
[12:56:37] the above shows a nice improvement in idleness of kafka worker threads
[13:39:07] neat, thank you for your help elukey <3
[13:39:52] indeed, thank you Luca!
[22:11:29] Hey 0lly friends! I'm building a new Zookeeper cluster that will be owned by the new Data Platform SRE team, any idea if I should use the 'ops' or 'analytics' prometheus instance? ref: https://gerrit.wikimedia.org/r/c/operations/puppet/+/940243/comments/3782cdc8_5141da89
[22:21:53] inflatador: Based on the current distribution of zookeepers, I would guess "analytics" because druid and an-conf live on that prometheus instance.