[00:17:07] <icinga-wm>	 RECOVERY - Check unit status of monitor_refine_eventlogging_analytics on an-launcher1002 is OK: OK: Status of the systemd unit monitor_refine_eventlogging_analytics https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[00:30:23] <icinga-wm>	 PROBLEM - Check unit status of monitor_refine_eventlogging_analytics on an-launcher1002 is CRITICAL: CRITICAL: Status of the systemd unit monitor_refine_eventlogging_analytics https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_timers
[03:01:37] <wikibugs>	 (03CR) 10AntiCompositeNumber: "recheck" [analytics/quarry/web] - 10https://gerrit.wikimedia.org/r/752034 (owner: 10EpicPupper)
[10:41:56] <elukey>	 !log restart hive daemons on an-coord1002 (after my last upgrade/rollback of packages the prometheus agent settings were not picked up, so no metrics)
[10:41:58] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log
[10:46:29] <icinga-wm>	 PROBLEM - Hive Server on an-coord1002 is CRITICAL: PROCS CRITICAL: 0 processes with command name java, args org.apache.hive.service.server.HiveServer2 https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive
[10:49:57] <elukey>	 uff
[10:51:05] <icinga-wm>	 RECOVERY - Hive Server on an-coord1002 is OK: PROCS OK: 1 process with command name java, args org.apache.hive.service.server.HiveServer2 https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hive
[10:51:21] <elukey>	 !log start hive-server2 on an-coord1002 - failed to connect to the metastore due to restart
[10:51:23] <stashbot>	 Logged the message at https://www.mediawiki.org/wiki/Analytics/Server_Admin_Log