[00:04:38] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.04, 21.36, 18.18
[00:05:42] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 13.35, 17.02, 19.82
[00:06:38] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 17.99, 20.06, 18.11
[00:36:26] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all
[00:37:03] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 9.83, 19.01, 23.36
[00:38:27] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [Inlet Temp = Critical, 274 system event log (SEL) entries present]
[00:43:03] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 29.34, 24.51, 24.30
[01:51:03] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 18.50, 19.23, 23.58
[01:54:26] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all
[01:56:27] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [Inlet Temp = Critical, 276 system event log (SEL) entries present]
[01:56:43] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.72, 18.83, 17.53
[02:00:39] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 19.98, 19.22, 17.96
[02:09:03] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 25.12, 22.40, 21.76
[02:11:03] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 21.08, 21.77, 21.61
[02:23:03] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 25.14, 20.81, 20.93
[02:25:03] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 22.40, 20.79, 20.88
[02:33:03] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 29.03, 24.36, 22.20
[02:35:03] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 19.40, 23.12, 22.08
[02:39:03] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 15.77, 18.37, 20.30
[02:41:57] [ssl] WikiTideSSLBot pushed 1 commit to master [+1/-0/±1] https://github.com/miraheze/ssl/compare/522c074a9c5b...ed802074be9c
[02:42:00] [ssl] WikiTideSSLBot ed80207 - Bot: Add SSL cert for fanojo.wiki
[02:50:26] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all
[02:52:27] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [Inlet Temp = Critical, 278 system event log (SEL) entries present]
[02:55:03] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 30.05, 22.54, 19.91
[02:57:03] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 22.62, 21.80, 19.94
[03:01:03] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 17.23, 19.80, 19.58
[03:17:03] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 22.27, 20.74, 19.50
[03:19:03] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 24.16, 21.39, 19.85
[03:19:40] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.44, 21.48, 18.69
[03:21:36] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 23.48, 21.59, 19.04
[03:23:33] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.16, 22.06, 19.47
[03:29:22] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.72, 23.93, 21.36
[03:33:14] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.30, 23.51, 21.73
[03:37:07] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.06, 23.24, 22.12
[03:39:04] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.47, 22.96, 22.09
[03:40:52] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 16.13, 23.42, 23.81
[03:41:00] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.42, 21.39, 21.61
[03:41:55] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all
[03:43:52] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [280 system event log (SEL) entries present]
[03:50:42] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 17.02, 19.10, 20.38
[03:50:52] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 24.65, 21.66, 22.12
[03:52:52] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 22.40, 21.74, 22.09
[03:54:37] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.34, 20.38, 20.63
[03:56:33] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.87, 22.53, 21.40
[03:58:29] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 17.03, 20.41, 20.77
[03:58:52] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 24.03, 21.81, 21.89
[04:00:26] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 14.85, 18.78, 20.15
[04:00:52] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 20.88, 21.12, 21.63
[04:06:52] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 18.67, 18.64, 20.32
[04:29:46] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 19.41, 20.57, 19.36
[04:33:42] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 17.44, 19.17, 19.05
[04:39:16] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all
[04:41:13] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [281 system event log (SEL) entries present]
[04:45:28] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 25.62, 19.38, 17.89
[04:47:26] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 21.89, 19.92, 18.27
[04:51:22] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 15.83, 19.53, 18.67
[05:02:14] [Grafana] !tech FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[05:06:39] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.20, 3.20, 1.51
[05:07:14] [Grafana] !tech RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[05:08:38] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 3.11, 2.94, 1.60
[05:18:20] [Grafana] !tech FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[05:19:37] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.42, 3.74, 2.52
[05:21:37] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.82, 3.07, 2.41
[05:23:20] [Grafana] !tech RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1
[05:27:07] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all
[05:29:09] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [282 system event log (SEL) entries present]
[05:34:36] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 23.19, 19.38, 17.63
[05:36:34] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 26.02, 21.48, 18.58
[05:40:30] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 20.24, 22.34, 19.71
[05:44:26] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 14.75, 19.41, 19.25
[05:52:14] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 25.38, 21.83, 20.36
[05:56:10] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 22.01, 23.66, 21.56
[06:11:54] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 27.60, 23.11, 21.90
[06:13:52] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 22.85, 23.23, 22.12
[06:19:42] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.71, 20.68, 19.46
[06:21:07] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all
[06:21:39] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 17.73, 19.16, 19.03
[06:23:09] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [283 system event log (SEL) entries present]
[06:27:30] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.89, 20.55, 19.78
[06:29:35] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 19.48, 18.61, 20.14
[06:31:23] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 19.84, 20.40, 19.91
[06:39:21] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 24.82, 21.02, 20.43
[06:41:19] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 21.25, 20.95, 20.48
[06:42:32] PROBLEM - ns2 NTP time on ns2 is UNKNOWN: check_ntp_time: Invalid hostname/address - time.cloudflare.comUsage: check_ntp_time -H [-4|-6] [-w ] [-c ] [-v verbose] [-o