[00:00:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.36, 23.31, 22.93 [00:02:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.66, 22.58, 22.73 [00:02:51] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.18, 4.22, 3.73 [00:06:41] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.91, 3.93, 3.77 [00:10:53] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - franchise.franchising.org.ua All nameservers failed to answer the query. [00:12:31] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.16, 2.57, 3.24 [00:19:20] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 15.26, 18.44, 20.21 [00:24:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.01, 3.83, 3.37 [00:24:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.25, 19.53, 20.33 [00:27:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.69, 22.35, 21.28 [00:29:20] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 13.36, 19.33, 20.33 [00:30:32] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.38, 3.05, 3.16 [00:34:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.52, 3.49, 3.37 [00:34:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.33, 21.43, 20.68 [00:35:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.78, 21.46, 20.84 [00:36:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.93, 4.22, 3.63 [00:37:20] PROBLEM - 
mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.55, 21.12, 20.78 [00:38:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.80, 3.60, 3.45 [00:41:20] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 18.12, 19.81, 20.37 [00:42:31] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.23, 2.97, 3.29 [00:46:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.60, 3.74, 3.54 [00:46:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 15.98, 18.85, 20.00 [00:50:31] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.66, 2.77, 3.20 [00:54:00] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [00:55:40] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [00:56:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.97, 3.67, 3.43 [00:57:34] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.065 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [01:00:32] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.43, 3.03, 3.21 [01:03:49] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.82, 21.93, 20.79 [01:05:44] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 23.33, 21.74, 20.82 [01:09:00] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:10:18] PROBLEM - os162 Current Load on os162 is CRITICAL: LOAD CRITICAL - total load average: 8.09, 7.60, 7.43 [01:10:53] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for franchise.franchising.org.ua could not be found [01:11:22] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.36, 4.03, 3.50 [01:12:18] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.37, 7.51, 7.42 [01:13:17] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.88, 3.52, 3.37 [01:15:12] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.47, 3.53, 3.36 [01:15:24] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.49, 19.51, 19.17 [01:17:07] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.37, 3.40, 3.36 [01:17:21] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 18.66, 19.02, 19.02 [01:20:59] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.65, 3.80, 3.53 [01:21:20] PROBLEM - mw181 Current Load on mw181 is 
CRITICAL: LOAD CRITICAL - total load average: 25.47, 22.75, 21.77 [01:22:54] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.93, 3.35, 3.41 [01:23:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.57, 21.85, 21.57 [01:24:49] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.64, 3.80, 3.57 [01:25:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.40, 23.18, 22.08 [01:26:44] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.47, 3.90, 3.65 [01:26:58] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.33, 21.11, 20.16 [01:28:13] PROBLEM - wiki.buryland.net - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.buryland.net' expires in 15 day(s) (Mon 02 Sep 2024 01:02:22 AM GMT +0000). [01:28:26] [ssl] WikiTideSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/27b3d8b007d0...c34a715a6a15 [01:28:28] [ssl] WikiTideSSLBot c34a715 - Bot: Update SSL cert for wiki.buryland.net [01:28:55] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 17.06, 20.04, 19.90 [01:29:04] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 21.05, 19.89, 18.07 [01:29:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.75, 22.53, 22.12 [01:30:33] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.94, 3.12, 3.36 [01:31:02] RECOVERY - mw172 Current Load on mw172 is OK: LOAD OK - total load average: 17.30, 18.61, 17.81 [01:35:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.05, 22.32, 22.09 [01:36:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load 
average: 24.26, 21.49, 20.43 [01:39:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.95, 22.74, 22.36 [01:40:21] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 24.19, 21.46, 18.69 [01:40:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.64, 21.53, 20.80 [01:42:21] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 19.67, 20.50, 18.67 [01:44:21] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 15.93, 19.01, 18.36 [01:45:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.11, 22.80, 22.37 [01:47:01] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.24, 3.05, 2.90 [01:47:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.58, 22.67, 22.38 [01:49:32] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:51:29] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [01:54:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:54:43] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.92, 3.25, 3.24 [01:54:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 18.82, 19.46, 20.35 [01:57:27] RECOVERY - wiki.buryland.net - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.buryland.net' will expire on Fri 15 Nov 2024 12:29:50 AM GMT +0000. 
[01:58:36] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.45, 4.15, 3.59 [01:59:00] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:05:20] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 19.34, 19.18, 20.33 [02:08:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.53, 3.53, 3.70 [02:09:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.79, 20.22, 20.45 [02:14:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.21, 3.85, 3.72 [02:31:37] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [02:31:38] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [02:34:00] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:34:34] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 13 minutes ago with 0 failures [02:36:36] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). 
[02:39:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.01, 23.13, 21.90 [02:41:02] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 22.21, 18.74, 16.38 [02:41:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.48, 22.57, 21.85 [02:42:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.45, 2.96, 3.86 [02:42:32] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.26, 20.48, 18.77 [02:43:00] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 19.89, 19.01, 16.77 [02:44:28] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.02, 22.10, 19.56 [02:46:25] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.45, 21.80, 19.77 [02:46:32] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.12, 3.35, 3.78 [02:46:54] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 20.85, 19.85, 17.60 [02:50:50] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 15.37, 18.95, 17.87 [02:52:15] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 16.93, 19.90, 19.77 [02:59:54] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:03:19] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [03:04:00] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:10:19] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [03:11:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 27.81, 24.29, 22.48 [03:13:17] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [03:13:18] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [03:13:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.15, 23.16, 22.27 [03:15:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.74, 23.81, 22.62 [03:16:26] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.90, 20.97, 20.16 [03:17:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.96, 23.65, 22.72 [03:18:37] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 28 minutes ago with 0 failures [03:18:40] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [03:21:52] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 1.470 second response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [03:22:16] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.92, 20.07, 20.10 [03:24:00] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:28:59] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.72, 21.15, 20.53 [03:30:01] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [03:30:40] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:31:56] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.222 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [03:32:37] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [03:34:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.87, 20.24, 20.36 [03:40:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.70, 20.69, 20.64 [03:43:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.43, 22.92, 22.52 [03:45:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 17.65, 21.75, 22.20 [03:46:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 18.45, 19.87, 20.40 [03:48:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.41, 2.49, 3.76 [03:50:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.46, 20.12, 20.42 [03:52:32] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.33, 3.45, 3.82 [03:52:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.39, 19.94, 20.33 [03:54:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.94, 
3.06, 3.65 [04:00:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.36, 3.40, 3.63 [04:02:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.09, 3.09, 3.48 [04:03:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.38, 22.24, 22.10 [04:04:31] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.66, 2.79, 3.34 [04:05:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.18, 21.49, 21.84 [04:09:00] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:09:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.54, 23.51, 22.56 [04:13:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 23.98, 23.50, 22.76 [04:17:23] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.78, 4.02, 3.53 [04:19:18] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.85, 3.47, 3.39 [04:19:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.44, 24.22, 23.20 [04:21:13] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.92, 3.25, 3.34 [04:25:05] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.91, 4.59, 3.87 [04:28:43] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [04:28:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.35, 19.57, 17.96 [04:29:00] [Grafana] 
RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:29:10] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:29:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.88, 22.71, 23.24 [04:30:51] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 18.47, 18.50, 17.73 [04:31:09] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [04:31:21] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:34:47] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.084 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [04:36:21] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:36:54] PROBLEM - aryavratpedia.co - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'aryavratpedia.co' expires in 15 day(s) (Mon 02 Sep 2024 04:15:46 AM GMT +0000). 
[04:37:05] [ssl] WikiTideSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/c34a715a6a15...b4efb9b5f3ce [04:37:08] [ssl] WikiTideSSLBot b4efb9b - Bot: Update SSL cert for aryavratpedia.co [04:37:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.68, 22.69, 22.88 [04:43:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 18.43, 22.52, 23.02 [04:53:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.02, 21.21, 21.80 [04:55:41] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.14, 20.61, 18.65 [04:57:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.67, 21.38, 21.82 [04:57:38] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.81, 19.95, 18.63 [04:59:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.08, 22.72, 22.24 [05:01:28] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.74, 22.20, 19.87 [05:02:13] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [05:03:25] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.93, 22.78, 20.34 [05:05:48] RECOVERY - aryavratpedia.co - LetsEncrypt on sslhost is OK: OK - Certificate 'aryavratpedia.co' will expire on Fri 15 Nov 2024 03:38:29 AM GMT +0000. [05:05:50] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [05:07:18] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.75, 22.59, 20.86 [05:10:22] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.299 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [05:10:44] RECOVERY - prometheus151 APT on prometheus151 is OK: APT OK: 51 packages available for upgrade (0 critical updates). [05:11:11] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.23, 24.01, 21.77 [05:13:08] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.99, 23.27, 21.77 [05:13:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.88, 23.53, 23.49 [05:15:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 28.21, 25.05, 24.03 [05:16:41] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [05:17:01] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.03, 23.83, 22.25 [05:18:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:20:55] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.91, 22.32, 22.07 [05:21:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.56, 23.10, 23.62 [05:23:00] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:23:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 28.17, 25.16, 24.32 [05:24:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.39, 22.70, 22.22 [05:28:26] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 21.91, 20.10, 17.75 [05:28:28] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 22.05, 20.69, 18.32 [05:28:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.44, 22.98, 22.51 [05:30:26] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 14.37, 17.64, 17.12 [05:30:28] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 16.71, 19.03, 17.99 [05:36:32] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.52, 2.91, 3.74 [05:38:26] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 22.29, 20.62, 18.43 [05:38:27] PROBLEM - mw172 Current Load on mw172 is CRITICAL: LOAD CRITICAL - total load average: 25.00, 20.23, 18.11 [05:38:28] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 20.81, 19.23, 18.27 [05:40:25] RECOVERY - mw172 Current Load on mw172 is OK: LOAD OK - total load average: 17.00, 18.78, 17.84 [05:40:26] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 17.64, 19.70, 18.38 [05:40:28] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 13.81, 17.16, 17.63 [05:42:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.03, 4.24, 4.03 [05:43:51] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures 
[05:54:51] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.94, 23.48, 22.27 [05:56:51] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.77, 22.63, 22.11 [05:58:31] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.79, 3.15, 3.87 [06:00:31] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.62, 3.69, 3.97 [06:04:32] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:08:00] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:08:51] [ssl] WikiTideSSLBot pushed 1 commit to master [+1/-0/±1] https://github.com/miraheze/ssl/compare/b4efb9b5f3ce...2121c993060e [06:08:54] [ssl] WikiTideSSLBot 2121c99 - Bot: Add SSL cert for wiki.sletat.tech [06:11:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.62, 22.72, 23.93 [06:11:45] [dns] Reception123 pushed 1 commit to master [+1/-0/±0] https://github.com/miraheze/dns/compare/b8adfb0e0c7b...befac0d831e3 [06:11:46] [dns] Reception123 befac0d - Create kingdomway.wiki [06:12:15] [ssl] WikiTideSSLBot pushed 1 commit to master [+1/-0/±1] https://github.com/miraheze/ssl/compare/2121c993060e...f4a863d84f92 [06:12:17] [ssl] WikiTideSSLBot f4a863d - Bot: Add SSL cert for wiki.stag.lol [06:12:36] [ssl] WikiTideSSLBot pushed 1 commit to master [+1/-0/±1] https://github.com/miraheze/ssl/compare/f4a863d84f92...795703876dde [06:12:37] [ssl] WikiTideSSLBot 
7957038 - Bot: Add SSL cert for wiki.potatotransit.xyz [06:13:20] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.14, 23.01, 23.88 [06:13:37] [dns] Reception123 pushed 1 commit to master [+1/-0/±0] https://github.com/miraheze/dns/compare/befac0d831e3...a144df82c89b [06:13:40] [dns] Reception123 a144df8 - Create oxboxtra.wiki [06:14:30] [ssl] WikiTideSSLBot pushed 1 commit to master [+1/-0/±1] https://github.com/miraheze/ssl/compare/795703876dde...d89c02c3fd21 [06:14:32] [ssl] WikiTideSSLBot d89c02c - Bot: Add SSL cert for wiki.namegames.dev [06:14:45] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.071 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:15:20] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.68, 22.33, 23.51 [06:18:00] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:18:59] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.02, 23.54, 23.68 [06:22:20] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.80, 3.13, 3.79 [06:22:52] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.50, 23.33, 23.62 [06:23:00] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.34, 22.44, 21.85 [06:24:15] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.19, 3.53, 3.86 [06:24:48] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.70, 24.26, 23.93 [06:24:58] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 21.96, 19.35, 17.62 [06:24:58] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.62, 21.97, 21.78 [06:26:08] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 20.82, 19.14, 17.56 [06:26:10] PROBLEM - os162 Current Load on os162 is CRITICAL: LOAD CRITICAL - total load average: 8.16, 7.59, 7.38 [06:26:45] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.90, 23.27, 23.61 [06:26:57] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 18.33, 18.74, 17.59 [06:28:03] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.71, 3.81, 3.97 [06:28:04] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 16.65, 18.12, 17.39 [06:28:09] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.61, 7.56, 7.39 [06:30:00] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL 
- total load average: 7.30, 5.09, 4.42 [06:32:34] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.89, 24.16, 23.80 [06:35:48] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:37:33] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:37:47] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [06:38:24] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.36, 23.75, 23.81 [06:39:33] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.087 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:40:46] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.39, 23.10, 22.43 [06:44:14] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.31, 22.60, 23.11 [06:46:10] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 18.73, 22.18, 22.96 [06:46:42] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.28, 23.07, 22.76 [06:46:52] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 21.31, 19.49, 18.26 [06:47:32] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[06:48:49] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 15.46, 17.82, 17.79 [06:53:55] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.77, 22.88, 22.92 [06:56:33] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.09, 22.78, 22.15 [07:00:33] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.74, 23.05, 22.43 [07:02:33] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.44, 23.88, 22.83 [07:03:00] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:04:33] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.58, 23.81, 22.97 [07:08:00] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:09:58] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 0.18, 2.16, 3.69 [07:11:57] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 0.20, 1.50, 3.26 [07:13:50] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.03, 23.20, 23.69 [07:22:33] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.15, 22.66, 22.30 [07:24:33] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.83, 21.98, 22.12 [07:25:50] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.57, 23.58, 23.33 [07:29:50] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 23.83, 23.64, 23.41 [07:30:33] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.39, 23.03, 22.38 [07:32:33] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.23, 22.11, 22.16 [07:35:50] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.79, 23.66, 23.36 [07:38:00] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:39:26] PROBLEM - os162 Current Load on os162 is CRITICAL: LOAD CRITICAL - total load average: 8.24, 7.65, 7.45 [07:39:50] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.78, 23.78, 23.54 [07:41:25] PROBLEM - os162 Current Load on os162 is WARNING: LOAD WARNING - total load average: 7.42, 7.58, 7.45 [07:42:25] [Grafana] FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:44:49] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last 
run 1 minute ago with 0 failures [07:47:25] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:53:34] PROBLEM - ns2 NTP time on ns2 is UNKNOWN: check_ntp_time: Invalid hostname/address - time.cloudflare.com Usage: check_ntp_time -H [-4|-6] [-w ] [-c ] [-v verbose] [-o