[08:46:49] !log admin [codfw1dev] restart rabbitmq @ codfw1dev T374002
[08:46:55] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Admin/SAL
[08:46:55] T374002: codfw1dev: rabbitmq is not working because some auth failures - https://phabricator.wikimedia.org/T374002
[19:39:22] if any network wizards fancy a challenge: https://phabricator.wikimedia.org/T374152
[19:39:27] I think I’ve reached the limit of my own debugging skills there :/
[19:52:32] I've found that adding a health check script increases reliability by like 200%, and that's just from scaring the pod into behaving properly
[19:54:19] I haven't had a stuck connection but I did get recurring connection reset by peer from eventstreams
[20:34:49] *nods* it’s probably ultimately a good idea either way
[20:34:57] I just wouldn’t mind solving the underlying problem first ;)
[22:06:14] hello