[00:31:05] PROBLEM - SSH on analytics1077.mgmt is CRITICAL: CRITICAL - Socket timeout after 10 seconds https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[03:35:09] RECOVERY - SSH on analytics1077.mgmt is OK: SSH OK - OpenSSH_7.4 (protocol 2.0) https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
[19:37:19] PROBLEM - eventgate-analytics-external validation error rate too high on alert1001 is CRITICAL: 5.634 gt 2 https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-service=eventgate-analytics-external&var-stream=All&var-kafka_broker=All&var-kafka_producer_type=All&var-dc=thanos
[19:49:21] RECOVERY - eventgate-analytics-external validation error rate too high on alert1001 is OK: (C)2 gt (W)1 gt 0.8591 https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-service=eventgate-analytics-external&var-stream=All&var-kafka_broker=All&var-kafka_producer_type=All&var-dc=thanos
[21:03:53] PROBLEM - eventgate-analytics-external validation error rate too high on alert1001 is CRITICAL: 6.274 gt 2 https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-service=eventgate-analytics-external&var-stream=All&var-kafka_broker=All&var-kafka_producer_type=All&var-dc=thanos
[21:25:31] RECOVERY - eventgate-analytics-external validation error rate too high on alert1001 is OK: (C)2 gt (W)1 gt 0.9561 https://wikitech.wikimedia.org/wiki/Event_Platform/EventGate https://grafana.wikimedia.org/d/ZB39Izmnz/eventgate?orgId=1&refresh=1m&var-service=eventgate-analytics-external&var-stream=All&var-kafka_broker=All&var-kafka_producer_type=All&var-dc=thanos
[22:27:07] Data-Engineering: Failed to find any Kerberos tgt - https://phabricator.wikimedia.org/T318063 (bmansurov)