[00:04:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.64, 7.57, 6.53 [00:06:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.55, 7.38, 6.57 [00:10:16] RECOVERY - puppet21 Current Load on puppet21 is OK: LOAD OK - total load average: 5.35, 6.43, 6.38 [01:09:06] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.36, 6.91, 6.48 [01:13:01] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.27, 7.47, 6.80 [01:14:59] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 6.16, 6.99, 6.70 [01:16:56] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.47, 7.41, 6.89 [01:22:49] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 6.66, 7.67, 7.35 [01:34:34] RECOVERY - puppet21 Current Load on puppet21 is OK: LOAD OK - total load average: 6.30, 6.44, 6.80 [01:40:34] RECOVERY - rosettacode.org - reverse DNS on sslhost is OK: SSL OK - rosettacode.org reverse DNS resolves to cp5.wikitide.net - NS RECORDS OK [01:44:20] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.92, 7.28, 6.93 [02:00:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. [02:01:53] RECOVERY - www.rosettacode.org - reverse DNS on sslhost is OK: SSL OK - www.rosettacode.org reverse DNS resolves to cp5.wikitide.net - NS RECORDS OK [02:30:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.405 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out. [02:46:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.27, 7.62, 7.92 [02:48:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.62, 8.00, 8.02 [02:50:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.78, 7.92, 7.99 [02:52:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.11, 7.97, 8.00 [02:54:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 5.83, 7.35, 7.78 [02:58:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.50, 7.80, 7.86 [03:00:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. [03:01:37] RECOVERY - db21 Backups SQL on db21 is OK: FILE_AGE OK: /var/log/backup-logs/sql-backup.log is 95 seconds old and 273 bytes [03:30:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.404 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out. [04:00:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. 
[04:30:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.405 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out. [05:00:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. [06:00:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.403 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out. [06:08:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 6.74, 7.41, 7.97 [06:14:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.34, 7.85, 7.95 [06:16:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.53, 7.70, 7.88 [06:18:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.76, 7.84, 7.90 [06:22:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.54, 7.85, 7.89 [06:24:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.08, 7.85, 7.88 [06:30:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.14, 7.62, 7.81 [06:34:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 10.18, 8.34, 8.00 [06:44:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.61, 7.84, 7.96 [06:52:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 9.10, 7.69, 7.72 [07:00:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. 
[07:40:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.10, 7.64, 7.95 [07:46:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.03, 7.87, 7.94 [07:48:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.35, 7.73, 7.88 [08:00:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.51, 7.78, 7.74 [08:06:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 6.73, 7.72, 7.85 [08:12:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.03, 7.88, 7.85 [08:14:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.41, 7.84, 7.84 [08:16:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.69, 8.06, 7.91 [08:32:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 6.94, 7.65, 7.91 [08:38:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.14, 7.86, 7.93 [09:00:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.404 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out. [09:26:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 6.44, 7.52, 7.97 [09:30:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. [09:40:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 9.61, 7.67, 7.49 [09:42:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.94, 7.53, 7.45 [09:46:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.37, 7.98, 7.64 [09:48:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.64, 7.93, 7.67 [10:00:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 8.30, 7.13, 7.21 [10:00:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.405 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out. [10:06:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 7.07, 7.66, 7.50 [10:08:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 7.96, 8.01, 7.67 [10:10:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 6.81, 7.69, 7.60 [10:14:16] PROBLEM - puppet21 Current Load on puppet21 is CRITICAL: LOAD CRITICAL - total load average: 9.16, 8.12, 7.76 [11:00:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. 
[11:30:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.405 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out. [12:36:56] PROBLEM - mw21 MediaWiki Rendering on mw21 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 2813 bytes in 0.060 second response time [12:37:00] PROBLEM - jobrunner21 MediaWiki Rendering on jobrunner21 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 863 bytes in 0.072 second response time [12:37:10] PROBLEM - mw22 MediaWiki Rendering on mw22 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 863 bytes in 0.062 second response time [12:39:01] PROBLEM - matomo21 HTTPS on matomo21 is CRITICAL: HTTP CRITICAL: HTTP/2 500 - 426 bytes in 0.032 second response time [12:39:18] PROBLEM - db21 MariaDB Connections on db21 is UNKNOWN: PHP Fatal error: Uncaught mysqli_sql_exception: Connection refused in /usr/lib/nagios/plugins/check_mysql_connections.php:47Stack trace:#0 /usr/lib/nagios/plugins/check_mysql_connections.php(47): mysqli_real_connect(Object(mysqli), 'db21.wikitide.n...', 'icinga', Object(SensitiveParameterValue), NULL, NULL, NULL, false)#1 {main} thrown in /usr/lib/nagios/plugins/check_mysql_c [12:39:18] line 47Fatal error: Uncaught mysqli_sql_exception: Connection refused in /usr/lib/nagios/plugins/check_mysql_connections.php:47Stack trace:#0 /usr/lib/nagios/plugins/check_mysql_connections.php(47): mysqli_real_connect(Object(mysqli), 'db21.wikitide.n...', 'icinga', Object(SensitiveParameterValue), NULL, NULL, NULL, false)#1 {main} thrown in /usr/lib/nagios/plugins/check_mysql_connections.php on line 47 [12:39:47] PROBLEM - phorge21 phorge-static.wikitide.org HTTPS on phorge21 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: HTTP/1.1 500 Internal Server Error [12:39:57] PROBLEM - db21 MariaDB on db21 is CRITICAL: Can't connect to server on 'db21.wikitide.net' (115) [12:40:12] PROBLEM - phorge21 issue-tracker.wikitide.org HTTPS on phorge21 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 4328 bytes in 0.018 second response time [12:40:16] PROBLEM - puppet21 Current Load on puppet21 is WARNING: LOAD WARNING - total load average: 3.71, 6.05, 7.65 [12:42:11] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 62% [12:42:49] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 54% [12:44:48] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 33% [12:46:04] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 24% [12:46:16] RECOVERY - puppet21 Current Load on puppet21 is OK: LOAD OK - total load average: 4.33, 4.99, 6.63 [12:53:53] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 60% [12:55:49] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 40% [12:57:46] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 75% [12:59:43] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 53% [13:00:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. 
[13:01:39] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 81% [13:03:36] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 55% [13:13:19] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 68% [13:15:16] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 16% [13:19:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 62% [13:24:00] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.12, 3.44, 3.98 [13:30:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 60% [13:31:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 37% [13:32:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 23% [13:38:10] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 70% [13:42:00] RECOVERY - mem21 Current Load on mem21 is OK: LOAD OK - total load average: 2.79, 3.12, 3.38 [13:44:51] [02WikiTideOrg/puppet] 07AgentIsai pushed 031 commit to 03master [+24/-23/±22] 13https://github.com/WikiTideOrg/puppet/compare/a79e3528e33e...ff511ab36a50 [13:44:52] [02WikiTideOrg/puppet] 07AgentIsai 03ff511ab - Update back to cloud1 [13:46:30] [02WikiTideOrg/dns] 07AgentIsai pushed 031 commit to 03master [+0/-0/±1] 13https://github.com/WikiTideOrg/dns/compare/06132ea24696...8d23b5a5cb9a [13:46:31] [02WikiTideOrg/dns] 07AgentIsai 038d23b5a - Add bast1-public [13:50:00] PROBLEM - mem21 Current Load on mem21 is WARNING: LOAD WARNING - total load average: 3.64, 3.39, 3.36 [13:51:47] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 10% [13:52:00] RECOVERY - mem21 Current Load on mem21 is OK: LOAD OK - total load average: 2.82, 3.20, 3.30 [13:54:57] PROBLEM - db21 MariaDB on db21 is UNKNOWN: [13:56:55] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 52% [13:57:21] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 65% [13:57:48] PROBLEM - mail21 HTTPS on mail21 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Connection timed out after 10003 milliseconds [13:58:00] PROBLEM - mail21 IMAP on mail21 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:58:10] PROBLEM - mail21 webmail.wikitide.net HTTPS on mail21 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:58:49] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 68% [13:59:18] PROBLEM - mail21 SMTP on mail21 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:59:19] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 58% [14:00:44] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 52% [14:01:35] PROBLEM - cp6 Varnish Backends on cp6 is CRITICAL: 1 backends are down. mw21 [14:02:38] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 11% [14:03:08] PROBLEM - cp4 Varnish Backends on cp4 is CRITICAL: 1 backends are down. 
mw21 [14:03:18] PROBLEM - mw21 HTTPS on mw21 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Connection timed out after 10003 milliseconds [14:05:15] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 39% [14:05:28] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 41% [14:06:52] PROBLEM - mem21 memcached on mem21 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:07:25] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 93% [14:08:19] PROBLEM - puppet21 HTTPS on puppet21 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Connection timed out after 10004 milliseconds [14:08:21] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 46% [14:08:29] RECOVERY - puppet21 Puppet on puppet21 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [14:09:22] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 19% [14:10:01] PROBLEM - puppet21 puppetserver on puppet21 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:10:10] PROBLEM - puppet21 puppetdb on puppet21 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:12:34] PROBLEM - bast21 Puppet on bast21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:14:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 64% [14:15:20] PROBLEM - os21 Puppet on os21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:15:36] PROBLEM - cp4 Puppet on cp4 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:16:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 39% [14:17:15] PROBLEM - jobchron21 Puppet on jobchron21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:17:41] PROBLEM - swiftproxy21 Puppet on swiftproxy21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:19:08] PROBLEM - phorge21 Puppet on phorge21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:19:45] PROBLEM - mon21 Puppet on mon21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:21:22] PROBLEM - ldap21 Puppet on ldap21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:21:38] PROBLEM - cloud4 Puppet on cloud4 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:23:01] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 46% [14:23:29] PROBLEM - matomo21 Puppet on matomo21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:24:26] PROBLEM - jobrunner21 Puppet on jobrunner21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:24:30] PROBLEM - cp6 Puppet on cp6 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. 
It might be a dependency cycle. [14:24:55] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 33% [14:25:28] PROBLEM - mw22 Puppet on mw22 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:25:40] PROBLEM - db21 Puppet on db21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:25:47] PROBLEM - os21 Current Load on os21 is WARNING: LOAD WARNING - total load average: 1.36, 1.60, 1.98 [14:26:42] PROBLEM - ldap21 Current Load on ldap21 is WARNING: LOAD WARNING - total load average: 1.54, 1.62, 1.97 [14:27:37] PROBLEM - mail21 Puppet on mail21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:28:52] PROBLEM - jobchron21 Current Load on jobchron21 is WARNING: LOAD WARNING - total load average: 1.57, 1.62, 1.93 [14:29:09] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 51% [14:30:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.402 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out. [14:32:28] PROBLEM - mw21 Puppet on mw21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:33:09] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 17% [14:35:24] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 60% [14:37:16] PROBLEM - mem21 Puppet on mem21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:37:18] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 29% [14:38:02] PROBLEM - bots21 Puppet on bots21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:39:48] PROBLEM - os21 Current Load on os21 is CRITICAL: LOAD CRITICAL - total load average: 2.17, 1.80, 1.81 [14:40:11] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 42% [14:40:29] PROBLEM - puppet21 Puppet on puppet21 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [14:40:42] PROBLEM - ldap21 Current Load on ldap21 is CRITICAL: LOAD CRITICAL - total load average: 2.15, 1.83, 1.82 [14:40:52] PROBLEM - www.farthestfrontier.wiki - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['cleo.ns.cloudflare.com.', 'may.ns.cloudflare.com.'], 'CNAME': 'cf-lb.wikitide.org.'} [14:40:52] RECOVERY - Host mw1 is UP: PING OK - Packet loss = 0%, RTA = 0.33 ms [14:40:52] PROBLEM - cp5 conntrack_table_size on cp5 is CRITICAL: CHECK_NRPE: Error - Could not connect to 15.235.167.159: Connection reset by peer [14:40:52] RECOVERY - www.rosettacode.org - LetsEncrypt on sslhost is OK: OK - Certificate 'rosettacode.org' will expire on Sat 20 Jan 2024 06:52:33 AM GMT +0000. 
[14:40:52] RECOVERY - cp6 HTTPS on cp6 is OK: HTTP OK: HTTP/2 200 - 2973 bytes in 1.258 second response time [14:40:52] PROBLEM - jobchron21 Current Load on jobchron21 is CRITICAL: LOAD CRITICAL - total load average: 2.37, 1.88, 1.84 [14:40:53] PROBLEM - swiftproxy1 HTTPS on swiftproxy1 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: HTTP/1.1 401 Unauthorized [14:40:53] PROBLEM - cp4 APT on cp4 is CRITICAL: CHECK_NRPE: Error - Could not connect to 146.59.44.171: Connection reset by peer [14:40:55] RECOVERY - Host cp1 is UP: PING OK - Packet loss = 0%, RTA = 0.35 ms [14:40:56] PROBLEM - mon1 NTP time on mon1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.111: Connection reset by peer [14:40:56] RECOVERY - test1 SSH on test1 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u1 (protocol 2.0) [14:40:59] RECOVERY - cp5 HTTPS on cp5 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 1.990 second response time [14:40:59] RECOVERY - Host phorge1 is UP: PING OK - Packet loss = 0%, RTA = 0.34 ms [14:40:59] PROBLEM - cp4 ferm_active on cp4 is CRITICAL: CHECK_NRPE: Error - Could not connect to 146.59.44.171: Connection reset by peer [14:41:00] PROBLEM - cp2 conntrack_table_size on cp2 is CRITICAL: CHECK_NRPE: Error - Could not connect to 51.79.55.151: Connection reset by peer [14:41:01] RECOVERY - mw1 nutcracker process on mw1 is OK: PROCS OK: 1 process with UID = 110 (nutcracker), command name 'nutcracker' [14:41:03] PROBLEM - swiftac1 Puppet on swiftac1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.120: Connection reset by peer [14:41:03] PROBLEM - jobrunner1 MediaWiki Rendering on jobrunner1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.348 second response time [14:41:06] PROBLEM - cp4 conntrack_table_size on cp4 is CRITICAL: CHECK_NRPE: Error - Could not connect to 146.59.44.171: Connection reset by peer [14:41:06] PROBLEM - phorge1 Current Load on phorge1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.108: Connection reset by peer [14:41:06] RECOVERY - test1 nutcracker port on test1 is OK: TCP OK - 0.000 second response time on 127.0.0.1 port 11212 [14:41:06] RECOVERY - mail1 HTTPS on mail1 is OK: HTTP OK: HTTP/2 301 - 227 bytes in 0.006 second response time [14:41:08] RECOVERY - jobrunner1 HTTPS on jobrunner1 is OK: HTTP OK: HTTP/2 200 - 369 bytes in 0.016 second response time [14:41:09] PROBLEM - cp2 Nginx Backend for phorge1 on cp2 is CRITICAL: CHECK_NRPE: Error - Could not connect to 51.79.55.151: Connection reset by peer [14:41:09] RECOVERY - test1 JobChron Service on test1 is OK: PROCS OK: 1 process with args 'redisJobChronService' [14:41:10] PROBLEM - cp4 Nginx Backend for mw2 on cp4 is CRITICAL: CHECK_NRPE: Error - Could not connect to 146.59.44.171: Connection reset by peer [14:41:10] PROBLEM - cp3 Puppet on cp3 is CRITICAL: CHECK_NRPE: Error - Could not connect to 51.75.170.66: Connection reset by peer [14:41:10] PROBLEM - mem1 Puppet on mem1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.106: Connection reset by peer [14:41:11] PROBLEM - phorge1 php-fpm on phorge1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.108: Connection reset by peer [14:41:12] PROBLEM - cp5 Nginx Backend for puppet1 on cp5 is CRITICAL: CHECK_NRPE: Error - Could not connect to 15.235.167.159: Connection reset by peer [14:41:16] PROBLEM - phorge1 phd on phorge1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.108: Connection reset by peer [14:41:16] RECOVERY - test1 HTTPS on test1 is OK: 
HTTP OK: HTTP/2 200 - 364 bytes in 0.058 second response time [14:41:17] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CHECK_NRPE: Error - Could not connect to 51.75.170.66: Connection reset by peer [14:41:18] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2973 bytes in 0.612 second response time [14:41:18] RECOVERY - ns1 Auth DNS on ns1 is OK: DNS OK: 0.079 seconds response time. wikitide.net returns 2607:5300:205:200::2aa8,51.79.55.151 [14:41:19] RECOVERY - www.wikitide.org - reverse DNS on sslhost is OK: SSL OK - www.wikitide.org reverse DNS resolves to cp2.wikitide.net - NS RECORDS OK [14:41:19] PROBLEM - cp5 PowerDNS Recursor on cp5 is CRITICAL: CHECK_NRPE: Error - Could not connect to 15.235.167.159: Connection reset by peer [14:41:20] RECOVERY - mon1 monitoring.wikitide.net HTTPS on mon1 is OK: HTTP OK: HTTP/1.1 302 Found - 298 bytes in 0.089 second response time [14:41:21] PROBLEM - phorge1 Backups Phorge Static on phorge1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.108: Connection reset by peer [14:41:21] RECOVERY - cp2 HTTPS on cp2 is OK: HTTP OK: HTTP/2 200 - 2973 bytes in 0.225 second response time [14:41:21] RECOVERY - www.rosettacode.org - reverse DNS on sslhost is OK: SSL OK - www.rosettacode.org reverse DNS resolves to cp2.wikitide.net - NS RECORDS OK [14:41:21] RECOVERY - wc.wikitide.org on sslhost is OK: OK - Certificate 'wikitide.org' will expire on Tue 20 Feb 2024 12:20:10 PM GMT +0000. [14:41:22] RECOVERY - Host matomo1 is UP: PING OK - Packet loss = 0%, RTA = 0.43 ms [14:41:22] PROBLEM - cp3 GDNSD Datacenters on cp3 is CRITICAL: CHECK_NRPE: Error - Could not connect to 51.75.170.66: Connection reset by peer [14:41:24] PROBLEM - mon1 IRCEcho on mon1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.111: Connection reset by peer [14:41:24] RECOVERY - cp3 Auth DNS on cp3 is OK: DNS OK: 0.232 seconds response time. wikitide.net returns 2607:5300:205:200::2aa8,51.79.55.151 [14:41:25] PROBLEM - cp6 Varnish Backends on cp6 is CRITICAL: CHECK_NRPE: Error - Could not connect to 139.99.236.151: Connection reset by peer [14:41:26] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.139 second response time [14:41:26] RECOVERY - puppet1 puppetserver on puppet1 is OK: TCP OK - 0.001 second response time on 10.0.0.100 port 8140 [14:41:27] RECOVERY - matomo1 conntrack_table_size on matomo1 is OK: OK: nf_conntrack is 0 % full [14:41:27] PROBLEM - cp5 NTP time on cp5 is CRITICAL: CHECK_NRPE: Error - Could not connect to 15.235.167.159: Connection reset by peer [14:41:28] PROBLEM - puppet1 HTTPS on puppet1 is WARNING: HTTP WARNING: HTTP/2 403 - 113 bytes in 0.012 second response time [14:41:29] RECOVERY - cp4 HTTPS on cp4 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 0.717 second response time [14:41:29] RECOVERY - mon1 grafana.wikitide.net HTTPS on mon1 is OK: HTTP OK: HTTP/1.1 200 OK - 43178 bytes in 0.028 second response time [14:41:31] RECOVERY - test1 NTP time on test1 is OK: NTP OK: Offset 0.004355311394 secs [14:41:32] PROBLEM - jobchron1 Puppet on jobchron1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[14:41:33] PROBLEM - mem1 ferm_active on mem1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.106: Connection reset by peer [14:41:36] RECOVERY - test1 conntrack_table_size on test1 is OK: OK: nf_conntrack is 0 % full [14:41:40] PROBLEM - mem1 Disk Space on mem1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.106: Connection reset by peer [14:41:51] PROBLEM - bots1 APT on bots1 is CRITICAL: APT CRITICAL: 30 packages available for upgrade (3 critical updates). [14:41:53] PROBLEM - swiftac1 conntrack_table_size on swiftac1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.120: Connection reset by peer [14:41:57] PROBLEM - prometheus1 Disk Space on prometheus1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.118: Connection reset by peer [14:41:59] PROBLEM - services1 NTP time on services1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.115: Connection reset by peer [14:42:02] PROBLEM - prometheus1 PowerDNS Recursor on prometheus1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.118: Connection reset by peer [14:42:09] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 24% [14:42:13] PROBLEM - services1 conntrack_table_size on services1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.115: Connection reset by peer [14:42:22] PROBLEM - graylog1 APT on graylog1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.117: Connection reset by peer [14:42:25] PROBLEM - mem1 NTP time on mem1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.106: Connection reset by peer [14:42:25] PROBLEM - jobchron1 APT on jobchron1 is CRITICAL: APT CRITICAL: 50 packages available for upgrade (3 critical updates). [14:42:26] PROBLEM - puppet1 NTP time on puppet1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.100: Connection reset by peer [14:42:26] PROBLEM - puppet1 Backups SSLKeys on puppet1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.100: Connection reset by peer [14:42:31] PROBLEM - swiftac1 APT on swiftac1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.120: Connection reset by peer [14:42:31] RECOVERY - test1 nutcracker process on test1 is OK: PROCS OK: 1 process with UID = 110 (nutcracker), command name 'nutcracker' [14:42:31] RECOVERY - matomo1 PowerDNS Recursor on matomo1 is OK: DNS OK: 0.029 seconds response time. wikitide.org returns 2607:5300:205:200::2aa8,51.79.55.151 [14:42:32] PROBLEM - mem1 APT on mem1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.106: Connection reset by peer [14:42:34] PROBLEM - services1 PowerDNS Recursor on services1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.115: Connection reset by peer [14:42:34] PROBLEM - puppet1 PowerDNS Recursor on puppet1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.100: Connection reset by peer [14:42:35] PROBLEM - swiftac1 PowerDNS Recursor on swiftac1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.120: Connection reset by peer [14:42:36] RECOVERY - matomo1 HTTPS on matomo1 is OK: HTTP OK: HTTP/2 200 - 553 bytes in 0.441 second response time [14:42:36] RECOVERY - mw2 SSH on mw2 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u1 (protocol 2.0) [14:42:36] RECOVERY - mw1 PowerDNS Recursor on mw1 is OK: DNS OK: 0.033 seconds response time.
wikitide.org returns 2607:5300:205:200::2aa8,51.79.55.151 [14:42:36] RECOVERY - test1 php-fpm on test1 is OK: PROCS OK: 8 processes with command name 'php-fpm8.2' [14:42:36] RECOVERY - test1 Disk Space on test1 is OK: DISK OK - free space: / 17792MiB (44% inode=67%); [14:42:37] PROBLEM - swiftac1 ferm_active on swiftac1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.120: Connection reset by peer [14:42:38] PROBLEM - mon1 Disk Space on mon1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.111: Connection reset by peer [14:42:38] PROBLEM - puppet1 Current Load on puppet1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.100: Connection reset by peer [14:42:39] PROBLEM - swiftac1 NTP time on swiftac1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.120: Connection reset by peer [14:42:40] PROBLEM - swiftac1 Disk Space on swiftac1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.120: Connection reset by peer [14:42:40] PROBLEM - os1 APT on os1 is CRITICAL: APT CRITICAL: 31 packages available for upgrade (3 critical updates). [14:42:41] PROBLEM - swiftproxy1 APT on swiftproxy1 is CRITICAL: APT CRITICAL: 29 packages available for upgrade (3 critical updates). [14:42:41] RECOVERY - phorge1 issue-tracker.wikitide.org HTTPS on phorge1 is OK: HTTP OK: HTTP/1.1 200 OK - 110931 bytes in 0.143 second response time [14:42:41] PROBLEM - phorge1 Disk Space on phorge1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.108: Connection reset by peer [14:42:41] RECOVERY - test1 ferm_active on test1 is OK: OK ferm input default policy is set [14:42:41] PROBLEM - phorge1 conntrack_table_size on phorge1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.108: Connection reset by peer [14:42:42] PROBLEM - graylog1 Current Load on graylog1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.117: Connection reset by peer [14:42:43] PROBLEM - mem1 PowerDNS Recursor on mem1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.106: Connection reset by peer [14:42:46] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.106 second response time [14:42:48] PROBLEM - mem1 conntrack_table_size on mem1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.106: Connection reset by peer [14:42:51] PROBLEM - phorge1 PowerDNS Recursor on phorge1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.108: Connection reset by peer [14:42:51] PROBLEM - phorge1 APT on phorge1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.108: Connection reset by peer [14:42:51] PROBLEM - phorge1 NTP time on phorge1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.108: Connection reset by peer PROBLEM - phorge1 ferm_active on phorge1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.108: Connection reset by peer [14:42:52] PROBLEM - graylog1 NTP time on graylog1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.117: Connection reset by peer [14:42:52] PROBLEM - bast1 APT on bast1 is CRITICAL: APT CRITICAL: 30 packages available for upgrade (3 critical updates). 
[14:42:55] PROBLEM - mon1 ferm_active on mon1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.111: Connection reset by peer [14:42:55] PROBLEM - puppet1 Backups Private on puppet1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.100: Connection reset by peer [14:42:56] PROBLEM - prometheus1 conntrack_table_size on prometheus1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.118: Connection reset by peer [14:42:57] PROBLEM - puppet1 ferm_active on puppet1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.100: Connection reset by peer [14:42:59] PROBLEM - mon1 php-fpm on mon1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.111: Connection reset by peer [14:43:01] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 46% [14:43:02] PROBLEM - mail1 APT on mail1 is CRITICAL: APT CRITICAL: 60 packages available for upgrade (3 critical updates). [14:43:04] PROBLEM - prometheus1 APT on prometheus1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.118: Connection reset by peer [14:43:05] PROBLEM - ldap1 APT on ldap1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.109: Connection reset by peer [14:43:07] PROBLEM - prometheus1 ferm_active on prometheus1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.118: Connection reset by peer [14:43:07] PROBLEM - prometheus1 Current Load on prometheus1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.118: Connection reset by peer [14:43:07] PROBLEM - graylog1 ferm_active on graylog1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.117: Connection reset by peer [14:43:08] PROBLEM - mem1 Current Load on mem1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.106: Connection reset by peer [14:43:09] PROBLEM - services1 ferm_active on services1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.115: Connection reset by peer [14:43:11] PROBLEM - prometheus1 NTP time on prometheus1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.118: Connection reset by peer [14:43:13] PROBLEM - services1 Disk Space on services1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.115: Connection reset by peer [14:43:13] PROBLEM - swiftac1 Current Load on swiftac1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.120: Connection reset by peer [14:43:15] PROBLEM - services1 Current Load on services1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.115: Connection reset by peer [14:43:16] RECOVERY - matomo1 SSH on matomo1 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u1 (protocol 2.0) [14:43:16] PROBLEM - puppet1 conntrack_table_size on puppet1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.100: Connection reset by peer [14:43:16] PROBLEM - swiftobject1 APT on swiftobject1 is CRITICAL: APT CRITICAL: 29 packages available for upgrade (3 critical updates). [14:43:18] PROBLEM - graylog1 PowerDNS Recursor on graylog1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.117: Connection reset by peer [14:43:21] PROBLEM - graylog1 conntrack_table_size on graylog1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.117: Connection reset by peer [14:43:22] PROBLEM - db1 APT on db1 is CRITICAL: APT CRITICAL: 29 packages available for upgrade (3 critical updates). [14:43:26] PROBLEM - ns1 APT on ns1 is CRITICAL: APT CRITICAL: 30 packages available for upgrade (3 critical updates). 
[14:44:47] PROBLEM - ldap1 Current Load on ldap1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.109: Connection reset by peer [14:44:55] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 70% [14:45:00] PROBLEM - ldap1 NTP time on ldap1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.109: Connection reset by peer [14:45:01] PROBLEM - ldap1 Disk Space on ldap1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.109: Connection reset by peer [14:45:07] PROBLEM - ldap1 conntrack_table_size on ldap1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.109: Connection reset by peer [14:45:19] PROBLEM - ldap1 PowerDNS Recursor on ldap1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.109: Connection reset by peer [14:45:24] PROBLEM - ldap1 ferm_active on ldap1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.109: Connection reset by peer [14:46:10] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 78% [14:46:24] PROBLEM - swiftobject1 ferm_active on swiftobject1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.121: Connection reset by peer [14:46:31] PROBLEM - mail1 php-fpm on mail1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.110: Connection reset by peer [14:46:37] PROBLEM - mail1 PowerDNS Recursor on mail1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.110: Connection reset by peer [14:46:43] PROBLEM - mail1 ferm_active on mail1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.110: Connection reset by peer [14:46:49] PROBLEM - swiftobject1 conntrack_table_size on swiftobject1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.121: Connection reset by peer [14:46:49] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 39% [14:46:56] PROBLEM - mail1 NTP time on mail1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.110: Connection reset by peer [14:46:58] PROBLEM - mail1 conntrack_table_size on mail1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.110: Connection reset by peer [14:46:59] PROBLEM - mail1 Disk Space on mail1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.110: Connection reset by peer [14:46:59] PROBLEM - swiftobject1 PowerDNS Recursor on swiftobject1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.121: Connection reset by peer [14:47:02] PROBLEM - swiftobject1 NTP time on swiftobject1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.121: Connection reset by peer [14:47:02] PROBLEM - swiftobject1 Disk Space on swiftobject1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.121: Connection reset by peer [14:47:09] PROBLEM - swiftobject1 Current Load on swiftobject1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.121: Connection reset by peer [14:47:23] PROBLEM - mail1 Current Load on mail1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.110: Connection reset by peer [14:48:08] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 38% [14:49:01] PROBLEM - jobchron1 Disk Space on jobchron1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.105: Connection reset by peer [14:49:09] PROBLEM - jobchron1 Current Load on jobchron1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.105: Connection reset by peer [14:49:20] PROBLEM - jobchron1 PowerDNS Recursor on jobchron1 is CRITICAL: CHECK_NRPE: Error - Could not 
connect to 10.0.0.105: Connection reset by peer [14:49:34] PROBLEM - jobchron1 NTP time on jobchron1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.105: Connection reset by peer [14:50:32] PROBLEM - bots1 conntrack_table_size on bots1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.123: Connection reset by peer [14:50:37] PROBLEM - jobchron1 conntrack_table_size on jobchron1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.105: Connection reset by peer [14:50:43] PROBLEM - jobchron1 JobChron Service on jobchron1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.105: Connection reset by peer [14:50:46] PROBLEM - bots1 Disk Space on bots1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.123: Connection reset by peer [14:50:51] PROBLEM - jobchron1 ferm_active on jobchron1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.105: Connection reset by peer [14:50:54] PROBLEM - bots1 PowerDNS Recursor on bots1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.123: Connection reset by peer [14:50:57] PROBLEM - jobchron1 poolcounter process on jobchron1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.105: Connection reset by peer [14:51:01] PROBLEM - bots1 Current Load on bots1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.123: Connection reset by peer [14:51:03] PROBLEM - bots1 ferm_active on bots1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.123: Connection reset by peer [14:51:09] PROBLEM - bots1 IRC Log Bot on bots1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.123: Connection reset by peer [14:51:26] PROBLEM - bots1 NTP time on bots1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.123: Connection reset by peer [14:52:09] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 68% [14:52:14] PROBLEM - db1 Backups SQL on db1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.103: Connection reset by peer [14:52:17] PROBLEM - matomo1 PowerDNS Recursor on matomo1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.119: Connection reset by peer [14:52:19] PROBLEM - db1 Backups SQL wtglobal on db1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.103: Connection reset by peer [14:52:33] PROBLEM - db1 conntrack_table_size on db1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.103: Connection reset by peer [14:52:34] PROBLEM - matomo1 php-fpm on matomo1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.119: Connection reset by peer [14:52:38] PROBLEM - cp3 Puppet on cp3 is WARNING: WARNING: Puppet last ran 1 hour ago [14:52:40] PROBLEM - db1 ferm_active on db1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.103: Connection reset by peer [14:52:43] PROBLEM - db1 PowerDNS Recursor on db1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.103: Connection reset by peer [14:52:50] PROBLEM - matomo1 HTTPS on matomo1 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10002 milliseconds with 0 bytes received [14:52:57] PROBLEM - matomo1 ferm_active on matomo1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.119: Connection reset by peer [14:52:58] PROBLEM - matomo1 Disk Space on matomo1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.119: Connection reset by peer [14:53:07] PROBLEM - matomo1 NTP time on matomo1 is CRITICAL: CHECK_NRPE: Error - Could 
not connect to 10.0.0.119: Connection reset by peer [14:53:08] PROBLEM - matomo1 Current Load on matomo1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.119: Connection reset by peer [14:53:09] PROBLEM - db1 Current Load on db1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.103: Connection reset by peer [14:53:09] PROBLEM - matomo1 conntrack_table_size on matomo1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.119: Connection reset by peer [14:53:15] PROBLEM - db1 Disk Space on db1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.103: Connection reset by peer [14:53:18] PROBLEM - db1 NTP time on db1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.103: Connection reset by peer [14:53:29] PROBLEM - cp2 Puppet on cp2 is WARNING: WARNING: Puppet last ran 1 hour ago [14:53:48] PROBLEM - os21 Current Load on os21 is WARNING: LOAD WARNING - total load average: 1.51, 1.98, 1.99 [14:54:16] PROBLEM - bast1 Disk Space on bast1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.114: Connection reset by peer [14:54:18] PROBLEM - bast1 ferm_active on bast1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.114: Connection reset by peer [14:54:18] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 864 bytes in 0.204 second response time [14:54:25] PROBLEM - ns1 SSH on ns1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:54:30] PROBLEM - ns1 NTP time on ns1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 63.141.240.4: Connection reset by peer [14:54:36] PROBLEM - bast1 PowerDNS Recursor on bast1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.114: Connection reset by peer [14:54:41] PROBLEM - bast1 NTP time on bast1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.114: Connection reset by peer [14:54:41] PROBLEM - ns1 Current Load on ns1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 63.141.240.4: Connection reset by peer [14:54:43] PROBLEM - ns1 Disk Space on ns1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 63.141.240.4: Connection reset by peer [14:54:45] PROBLEM - bast1 Current Load on bast1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.114: Connection reset by peer [14:54:49] PROBLEM - bast1 conntrack_table_size on bast1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.114: Connection reset by peer [14:54:52] PROBLEM - jobchron21 Current Load on jobchron21 is WARNING: LOAD WARNING - total load average: 1.78, 2.00, 2.00 [14:55:10] PROBLEM - ns1 ferm_active on ns1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 63.141.240.4: Connection reset by peer [14:55:11] PROBLEM - ns1 conntrack_table_size on ns1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 63.141.240.4: Connection reset by peer [14:55:52] PROBLEM - cp5 Puppet on cp5 is WARNING: WARNING: Puppet last ran 1 hour ago [14:56:06] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 49% [14:56:14] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.114 second response time [14:56:42] PROBLEM - ldap21 Current Load on ldap21 is WARNING: LOAD WARNING - total load average: 1.70, 1.90, 1.96 [14:56:50] RECOVERY - matomo1 HTTPS on matomo1 is OK: HTTP OK: HTTP/2 200 - 553 bytes in 0.911 second response time [14:57:23] PROBLEM - os1 Disk Space on os1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.116: Connection reset by peer 
[14:57:47] PROBLEM - os21 Current Load on os21 is CRITICAL: LOAD CRITICAL - total load average: 2.56, 2.08, 2.02 [14:58:00] PROBLEM - os1 PowerDNS Recursor on os1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.116: Connection reset by peer [14:58:20] PROBLEM - os1 conntrack_table_size on os1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.116: Connection reset by peer [14:58:28] PROBLEM - os1 ferm_active on os1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.116: Connection reset by peer [14:58:35] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 44% [14:58:42] PROBLEM - ldap21 Current Load on ldap21 is CRITICAL: LOAD CRITICAL - total load average: 2.44, 2.14, 2.04 [14:58:43] PROBLEM - os1 Current Load on os1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.116: Connection reset by peer [14:58:44] PROBLEM - os1 HTTPS on os1 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10004 milliseconds with 0 bytes received [14:58:52] PROBLEM - jobchron21 Current Load on jobchron21 is CRITICAL: LOAD CRITICAL - total load average: 2.39, 2.14, 2.04 [14:59:29] PROBLEM - os1 NTP time on os1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.116: Connection reset by peer [15:00:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. [15:01:24] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:01:58] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 69% [15:02:24] PROBLEM - swiftproxy1 ferm_active on swiftproxy1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.122: Connection reset by peer [15:02:26] PROBLEM - swiftproxy1 PowerDNS Recursor on swiftproxy1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.122: Connection reset by peer [15:02:37] PROBLEM - swiftproxy1 Disk Space on swiftproxy1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.122: Connection reset by peer [15:02:46] PROBLEM - swiftproxy1 conntrack_table_size on swiftproxy1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.122: Connection reset by peer [15:03:07] PROBLEM - swiftproxy1 Current Load on swiftproxy1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.122: Connection reset by peer [15:03:15] PROBLEM - swiftproxy1 NTP time on swiftproxy1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.122: Connection reset by peer [15:03:21] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.111 second response time [15:03:56] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 29% [15:06:35] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 37% [15:10:06] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 45% [15:12:03] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 39% [15:13:35] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 67% [15:15:29] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 40% [15:17:24] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 63% [15:21:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX 
Error Rate is 53% [15:23:06] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 64% [15:30:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.406 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out. [15:30:43] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 32% [15:33:33] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 42% [15:37:49] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 63% [15:39:30] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 39% [15:39:48] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 45% [15:42:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 77% [15:43:44] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 76% [15:44:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 35% [15:46:13] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 53% [15:49:25] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 40% [15:50:06] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 74% [15:50:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 50% [15:52:03] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 46% [15:53:32] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 24% [15:56:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 60% [15:58:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 51% [16:00:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. 
[16:01:46] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 26% [16:03:19] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 36% [16:04:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 21% [16:04:28] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 42% [16:06:24] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 70% [16:07:38] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 40% [16:08:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 57% [16:08:21] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 44% [16:10:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 36% [16:10:19] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 36% [16:11:31] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 28% [16:16:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 75% [16:16:14] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 46% [16:17:24] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 41% [16:18:11] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 25% [16:19:20] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 30% [16:20:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 57% [16:22:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 30% [16:28:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 74% [16:30:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 34% [16:30:32] uh, I think something is broken. [16:30:32] uh, I think something is broken. [16:34:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 81% [16:35:04] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 41% [16:39:02] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 39% [16:40:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 58% [16:43:16] @reception123 [16:43:16] @reception123 [16:43:38] (my god it duplicated the ping. the relay probably needs fixed too) [16:43:38] (my god it duplicated the ping. 
[16:45:48] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 46% [16:47:44] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 29% [16:51:46] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 45% [16:52:55] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 41% [16:53:42] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 60% [16:54:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 36% [16:54:46] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 43% [16:55:40] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 16% [16:56:18] I was told to ping all @Site Reliability Engineers so [16:56:43] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 34% [16:56:53] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 39% [16:59:56] The IRC seems to be high this time [16:59:59] lol [17:05:35] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 55% [17:06:48] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 41% [17:07:41] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 69% [17:08:30] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 50% [17:08:47] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 39% [17:09:32] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 74% [17:09:35] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 43% [17:11:28] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 19% [17:12:23] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 61% [17:13:23] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 34% [17:16:17] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 35% [17:17:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 71% [17:19:07] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 43% [17:20:12] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 53% [17:22:09] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 66% [17:22:55] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 67% [17:24:06] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 36% [17:24:39] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 40% [17:26:38] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 39% [17:26:43] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 55% [17:30:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.404 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1
UDP port 53 answered The DNS operation timed out. [17:32:26] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 33% [17:33:54] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 53% [17:35:50] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 38% [17:36:15] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 54% [17:38:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 60% [17:44:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 34% [17:45:38] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 54% [17:48:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 53% [17:52:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 63% [17:53:25] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 76% [17:54:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 38% [17:55:21] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 48% [18:01:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 37% [18:04:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 50% [18:05:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 74% [18:08:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 30% [18:09:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 52% [18:13:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 38% [18:16:07] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:16:55] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 73% [18:17:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 67% [18:17:27] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 3 backends are down. mw1 mw2 mediawiki [18:18:22] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw2 [18:18:49] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 58% [18:19:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 42% [18:19:27] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy [18:20:18] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 6.928 second response time [18:20:44] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 62% [18:21:12] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10004 milliseconds with 0 bytes received [18:21:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 68% [18:21:19] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 45% [18:21:49] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. 
mw1 [18:22:29] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:23:10] PROBLEM - cp5 HTTPS on cp5 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.685 second response time [18:23:27] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw2 [18:23:50] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 56% [18:24:30] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 2.412 second response time [18:26:33] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10001 milliseconds with 0 bytes received [18:28:20] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:30:14] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [18:30:15] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 48% [18:30:17] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.103 second response time [18:30:26] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 0.562 second response time [18:31:05] RECOVERY - cp5 HTTPS on cp5 is OK: HTTP OK: HTTP/2 200 - 2996 bytes in 7.187 second response time [18:31:05] RECOVERY - cp2 HTTPS on cp2 is OK: HTTP OK: HTTP/2 200 - 2952 bytes in 3.525 second response time [18:31:27] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy [18:31:35] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 16% [18:31:37] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [18:33:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 28% [18:34:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 69% [18:38:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 48% [18:40:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 76% [18:43:10] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 39% [18:45:18] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 43% [18:46:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 50% [18:47:15] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 27% [18:51:05] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 57% [18:51:55] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 64% [18:52:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 37% [18:53:52] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 39% [18:57:00] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 64% [18:58:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 51% [18:58:56] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 27% [19:00:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 74% [19:01:01] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 38% [19:02:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX 
Error Rate is 18% [19:04:58] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 41% [19:08:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 58% [19:10:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 69% [19:10:56] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 37% [19:12:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 54% [19:12:20] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:14:08] PROBLEM - cp3 APT on cp3 is CRITICAL: APT CRITICAL: 3 packages available for upgrade (3 critical updates). [19:14:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 95% [19:14:25] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 67% [19:14:52] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 3 backends are down. mw1 mw2 mediawiki [19:15:00] PROBLEM - cp5 HTTPS on cp5 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.739 second response time [19:15:01] PROBLEM - mw1 HTTPS on mw1 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10004 milliseconds with 0 bytes received [19:15:29] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 50% [19:15:58] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.050 second response time [19:15:59] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.246 second response time [19:16:01] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 3 backends are down. mw1 mw2 mediawiki [19:16:05] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw1 [19:16:11] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:16:56] RECOVERY - mw1 HTTPS on mw1 is OK: HTTP OK: HTTP/2 200 - 362 bytes in 1.271 second response time [19:17:24] PROBLEM - swiftproxy21 APT on swiftproxy21 is CRITICAL: APT CRITICAL: 30 packages available for upgrade (3 critical updates). [19:17:25] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 39% [19:17:27] PROBLEM - cp2 APT on cp2 is CRITICAL: APT CRITICAL: 32 packages available for upgrade (3 critical updates). [19:17:37] PROBLEM - phorge21 APT on phorge21 is CRITICAL: APT CRITICAL: 20 packages available for upgrade (3 critical updates). [19:17:58] PROBLEM - mw2 HTTPS on mw2 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10003 milliseconds with 0 bytes received [19:18:00] PROBLEM - mon21 APT on mon21 is CRITICAL: APT CRITICAL: 64 packages available for upgrade (3 critical updates). [19:18:18] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 58% [19:18:51] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 52% [19:18:55] RECOVERY - cp5 HTTPS on cp5 is OK: HTTP OK: HTTP/2 200 - 2996 bytes in 1.347 second response time [19:19:16] PROBLEM - mail21 APT on mail21 is CRITICAL: APT CRITICAL: 59 packages available for upgrade (3 critical updates). 
[19:19:55] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy [19:19:58] RECOVERY - mw2 HTTPS on mw2 is OK: HTTP OK: HTTP/2 200 - 362 bytes in 0.014 second response time [19:19:59] RECOVERY - cp2 HTTPS on cp2 is OK: HTTP OK: HTTP/2 200 - 2952 bytes in 0.166 second response time [19:20:03] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2952 bytes in 0.571 second response time [19:20:05] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [19:20:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 18% [19:20:15] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.115 second response time [19:20:23] PROBLEM - cp6 APT on cp6 is CRITICAL: APT CRITICAL: 32 packages available for upgrade (3 critical updates). [19:20:34] PROBLEM - matomo21 APT on matomo21 is CRITICAL: APT CRITICAL: 22 packages available for upgrade (3 critical updates). [19:20:42] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.110 second response time [19:20:46] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [19:22:05] PROBLEM - cp5 APT on cp5 is CRITICAL: APT CRITICAL: 31 packages available for upgrade (3 critical updates). [19:22:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 23% [19:22:25] PROBLEM - db21 APT on db21 is CRITICAL: APT CRITICAL: 4 packages available for upgrade (3 critical updates). [19:23:37] PROBLEM - bots21 APT on bots21 is CRITICAL: APT CRITICAL: 30 packages available for upgrade (3 critical updates). [19:25:59] PROBLEM - cp4 APT on cp4 is CRITICAL: APT CRITICAL: 32 packages available for upgrade (3 critical updates). [19:26:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 57% [19:27:20] PROBLEM - bast21 APT on bast21 is CRITICAL: APT CRITICAL: 30 packages available for upgrade (3 critical updates). [19:28:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 28% [19:30:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. 
[19:32:56] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 49% [19:36:43] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 38% [19:36:49] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 28% [19:40:43] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 70% [19:42:38] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 58% [19:46:35] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 56% [19:48:32] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 65% [19:50:14] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 63% [19:50:28] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 38% [19:52:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 54% [19:54:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 34% [20:00:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 47% [20:04:07] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 42% [20:06:04] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 30% [20:06:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 24% [20:08:07] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:10:00] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 53% [20:10:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 58% [20:10:13] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 9.087 second response time [20:10:21] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:11:27] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 3 backends are down. mw1 mw2 mediawiki [20:12:05] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw1 [20:12:20] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:13:38] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10002 milliseconds with 0 bytes received [20:13:44] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 61% [20:13:53] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 83% [20:14:08] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.051 second response time [20:14:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 94% [20:14:27] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 9.959 second response time [20:15:22] PROBLEM - cp5 HTTPS on cp5 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10000 milliseconds with 0 bytes received [20:15:40] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 40% [20:15:43] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. 
mw2 [20:15:49] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 53% [20:17:26] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 45% [20:17:40] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [20:17:46] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 36% [20:18:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 39% [20:18:39] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:19:46] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2952 bytes in 0.549 second response time [20:20:09] RECOVERY - cp2 HTTPS on cp2 is OK: HTTP OK: HTTP/2 200 - 2996 bytes in 3.976 second response time [20:21:16] RECOVERY - cp5 HTTPS on cp5 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 1.380 second response time [20:21:27] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy [20:22:53] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 3.709 second response time [20:22:53] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 9.807 second response time [20:23:26] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 22% [20:25:38] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 50% [20:25:39] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. mw2 [20:25:57] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10003 milliseconds with 0 bytes received [20:27:36] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [20:27:57] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 0.559 second response time [20:28:05] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [20:28:57] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:29:05] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:30:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.401 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out. [20:30:55] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 1.887 second response time [20:31:08] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 5.291 second response time [20:32:05] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw2 [20:33:02] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:33:09] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 40% [20:33:24] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 37% [20:33:33] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. 
mw2 [20:35:06] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 22% [20:35:14] PROBLEM - mw1 HTTPS on mw1 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10003 milliseconds with 0 bytes received [20:37:12] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:37:12] RECOVERY - mw1 HTTPS on mw1 is OK: HTTP OK: HTTP/2 200 - 362 bytes in 4.047 second response time [20:37:12] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 6.084 second response time [20:37:17] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 39% [20:37:20] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 68% [20:38:05] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [20:39:10] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.110 second response time [20:39:17] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 52% [20:39:19] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:41:31] PROBLEM - mw1 HTTPS on mw1 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10004 milliseconds with 0 bytes received [20:43:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 56% [20:43:14] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 43% [20:43:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 76% [20:43:16] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:43:23] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.116 second response time [20:43:25] RECOVERY - mw1 HTTPS on mw1 is OK: HTTP OK: HTTP/2 200 - 362 bytes in 0.015 second response time [20:45:18] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [20:45:23] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 9.349 second response time [20:45:30] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:46:45] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 65% [20:47:01] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 24% [20:47:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 45% [20:47:34] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 7.125 second response time [20:48:05] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw2 [20:49:17] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. 
mw2 [20:50:05] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [20:51:10] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 39% [20:51:15] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [20:51:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 73% [20:52:44] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 52% [20:53:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 47% [20:53:36] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:54:44] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 29% [20:55:39] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 6.607 second response time [20:57:07] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 42% [20:57:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 65% [20:58:45] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 80% [20:59:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 50% [20:59:38] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:00:13] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10003 milliseconds with 0 bytes received [21:00:44] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 47% [21:00:45] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw2 [21:01:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 62% [21:01:57] PROBLEM - mw2 HTTPS on mw2 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10002 milliseconds with 0 bytes received [21:03:01] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw2 [21:03:04] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. mw2 [21:03:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 31% [21:03:44] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.123 second response time [21:04:05] RECOVERY - mw2 HTTPS on mw2 is OK: HTTP OK: HTTP/2 200 - 362 bytes in 8.415 second response time [21:04:39] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy [21:04:45] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 38% [21:05:00] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [21:05:01] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [21:06:12] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 0.569 second response time [21:06:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 68% [21:07:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 56% [21:08:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 28% [21:08:56] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. 
mw1 [21:09:00] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. mw1 [21:09:12] PROBLEM - cp5 HTTPS on cp5 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.692 second response time [21:09:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 63% [21:09:22] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10004 milliseconds with 0 bytes received [21:10:16] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.246 second response time [21:10:28] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw2 [21:11:01] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 38% [21:11:48] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:12:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 74% [21:12:42] PROBLEM - mw2 HTTPS on mw2 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10003 milliseconds with 0 bytes received [21:12:44] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 43% [21:13:12] RECOVERY - cp5 HTTPS on cp5 is OK: HTTP OK: HTTP/2 200 - 2996 bytes in 1.371 second response time [21:13:14] RECOVERY - cp2 HTTPS on cp2 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 0.167 second response time [21:13:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 56% [21:14:09] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 0.555 second response time [21:14:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 57% [21:14:22] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy [21:14:42] RECOVERY - mw2 HTTPS on mw2 is OK: HTTP OK: HTTP/2 200 - 362 bytes in 0.060 second response time [21:14:44] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 63% [21:14:53] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [21:14:53] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [21:15:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 63% [21:16:01] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 6.451 second response time [21:16:45] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 53% [21:18:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 72% [21:18:49] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw2 [21:18:52] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. 
mw2 [21:19:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 56% [21:20:48] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [21:20:49] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [21:21:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 69% [21:22:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 16% [21:22:45] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 32% [21:23:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 35% [21:25:17] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:26:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 49% [21:27:17] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 3.072 second response time [21:28:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 66% [21:29:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 72% [21:29:40] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 3 backends are down. mw1 mw2 mediawiki [21:29:46] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 3 backends are down. mw1 mw2 mediawiki [21:29:50] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.070 second response time [21:29:53] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 3 backends are down. mw1 mw2 mediawiki [21:30:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query. 
[21:32:00] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:32:44] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 54% [21:33:56] RECOVERY - cp2 HTTPS on cp2 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 0.166 second response time [21:34:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 13% [21:34:48] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 42% [21:35:15] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:36:07] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.769 second response time [21:36:44] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 34% [21:38:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 83% [21:38:15] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:39:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 58% [21:40:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 20% [21:40:19] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 5.596 second response time [21:40:44] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 44% [21:41:33] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 7.856 second response time [21:43:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 89% [21:43:29] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [21:43:32] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [21:43:33] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy [21:44:11] PROBLEM - cp5 HTTPS on cp5 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10000 milliseconds with 0 bytes received [21:44:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 52% [21:44:25] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:44:44] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 39% [21:45:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 37% [21:45:46] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:45:59] PROBLEM - mw1 HTTPS on mw1 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10002 milliseconds with 0 bytes received [21:46:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 25% [21:46:26] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 3.155 second response time [21:46:27] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.247 second response time [21:47:28] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. mw1 [21:47:28] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. 
mw1 [21:47:49] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 5.845 second response time [21:47:54] RECOVERY - mw1 HTTPS on mw1 is OK: HTTP OK: HTTP/2 200 - 362 bytes in 0.014 second response time [21:47:58] PROBLEM - mw2 HTTPS on mw2 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10002 milliseconds with 0 bytes received [21:49:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 67% [21:50:40] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 41% [21:51:27] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw2 [21:53:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 55% [21:54:22] RECOVERY - mw2 HTTPS on mw2 is OK: HTTP OK: HTTP/2 200 - 362 bytes in 5.786 second response time [21:54:44] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 38% [21:54:55] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 42% [21:56:27] RECOVERY - cp5 HTTPS on cp5 is OK: HTTP OK: HTTP/2 200 - 2996 bytes in 1.401 second response time [21:57:22] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [21:57:27] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy [21:57:51] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:58:43] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 25% [21:59:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 19% [21:59:48] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.113 second response time [22:00:21] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2952 bytes in 0.564 second response time [22:00:55] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.065 second response time [22:01:18] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw2 [22:01:27] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. 
mw1 [22:01:54] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:02:51] RECOVERY - cp2 HTTPS on cp2 is OK: HTTP OK: HTTP/2 200 - 2952 bytes in 0.160 second response time [22:03:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 46% [22:05:07] PROBLEM - mw1 HTTPS on mw1 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10003 milliseconds with 0 bytes received [22:05:24] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 75% [22:06:04] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 5.825 second response time [22:07:02] RECOVERY - mw1 HTTPS on mw1 is OK: HTTP OK: HTTP/2 200 - 362 bytes in 1.009 second response time [22:07:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 70% [22:07:18] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 32% [22:07:40] PROBLEM - cp5 HTTPS on cp5 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.684 second response time [22:08:11] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:08:40] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10004 milliseconds with 0 bytes received [22:08:42] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:09:13] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [22:09:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 55% [22:10:35] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 39% [22:10:52] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10003 milliseconds with 0 bytes received [22:12:48] RECOVERY - cp2 HTTPS on cp2 is OK: HTTP OK: HTTP/2 200 - 2952 bytes in 0.183 second response time [22:12:50] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 2.836 second response time [22:13:18] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 45% [22:14:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 98% [22:14:29] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 7.662 second response time [22:14:39] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 0.567 second response time [22:14:58] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:15:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 66% [22:16:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 18% [22:17:07] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. 
mw2 [22:17:42] RECOVERY - cp5 HTTPS on cp5 is OK: HTTP OK: HTTP/2 200 - 2996 bytes in 1.384 second response time [22:18:40] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:19:07] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 63% [22:19:13] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 9.085 second response time [22:19:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 39% [22:19:27] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy [22:21:21] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:21:52] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.039 second response time [22:22:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 65% [22:22:41] PROBLEM - mw2 HTTPS on mw2 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10003 milliseconds with 0 bytes received [22:22:59] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 42% [22:23:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 63% [22:23:27] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw2 [22:24:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 48% [22:24:14] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10003 milliseconds with 0 bytes received [22:24:35] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 42% [22:24:41] RECOVERY - mw2 HTTPS on mw2 is OK: HTTP OK: HTTP/2 200 - 362 bytes in 0.014 second response time [22:24:53] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 2.516 second response time [22:24:56] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 39% [22:25:02] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [22:25:29] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 1.616 second response time [22:25:30] PROBLEM - cp5 HTTPS on cp5 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.700 second response time [22:26:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 72% [22:27:00] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:27:25] [WikiTideOrg/puppet] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/WikiTideOrg/puppet/compare/ff511ab36a50...f166cc5756e0 [22:27:27] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy [22:27:27] [WikiTideOrg/puppet] Universal-Omega f166cc5 - Use swiftproxy1 [22:28:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 33% [22:28:46] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [22:28:59] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down.
mw1 [22:29:35] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:30:00] [WikiTideOrg/dns] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/WikiTideOrg/dns/compare/8d23b5a5cb9a...641209bc3c49 [22:30:01] [WikiTideOrg/dns] Universal-Omega 641209b - Update services [22:30:23] [WikiTideOrg/dns] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/WikiTideOrg/dns/compare/641209bc3c49...4cd0df2f5cc4 [22:30:24] [WikiTideOrg/dns] Universal-Omega 4cd0df2 - Update [22:30:34] PROBLEM - newcascadia.net - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.405 seconds: Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out.; Server 1.1.1.1 UDP port 53 answered The DNS operation timed out. [22:30:52] PROBLEM - jobchron21 Current Load on jobchron21 is WARNING: LOAD WARNING - total load average: 1.60, 1.87, 2.00 [22:30:57] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy [22:31:04] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.112 second response time [22:31:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 51% [22:31:27] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 1 backends are down. mw2 [22:31:36] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 3.665 second response time [22:32:19] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2952 bytes in 0.566 second response time [22:32:42] PROBLEM - ldap21 Current Load on ldap21 is WARNING: LOAD WARNING - total load average: 1.94, 1.93, 2.00 [22:32:44] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 41% [22:32:47] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. mw2 [22:32:54] [WikiTideOrg/mw-config] Universal-Omega pushed 1 commit to main [+0/-0/±1] https://github.com/WikiTideOrg/mw-config/compare/a9e9deb06f2a...182f6901c0d0 [22:32:56] [WikiTideOrg/mw-config] Universal-Omega 182f690 - Update jobchron IP [22:33:44] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:33:47] PROBLEM - os21 Current Load on os21 is WARNING: LOAD WARNING - total load average: 1.40, 1.80, 1.96 [22:33:48] WikiTideOrg/mw-config - Universal-Omega the build passed. [22:34:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 76% [22:34:42] PROBLEM - ldap21 Current Load on ldap21 is CRITICAL: LOAD CRITICAL - total load average: 2.04, 1.94, 2.00 [22:34:54] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down.
mw1 [22:35:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 77% [22:35:47] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 5.047 second response time [22:36:19] RECOVERY - cp2 HTTPS on cp2 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 3.657 second response time [22:36:42] PROBLEM - ldap21 Current Load on ldap21 is WARNING: LOAD WARNING - total load average: 1.92, 1.92, 1.98 [22:36:46] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [22:36:52] PROBLEM - jobchron21 Current Load on jobchron21 is CRITICAL: LOAD CRITICAL - total load average: 2.18, 1.98, 2.00 [22:37:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 46% [22:37:48] PROBLEM - os21 Current Load on os21 is CRITICAL: LOAD CRITICAL - total load average: 2.24, 1.99, 2.00 [22:38:39] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10003 milliseconds with 0 bytes received [22:38:42] PROBLEM - ldap21 Current Load on ldap21 is CRITICAL: LOAD CRITICAL - total load average: 2.49, 2.11, 2.04 [22:39:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 31% [22:40:47] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. mw1 [22:41:07] [WikiTideOrg/puppet] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/WikiTideOrg/puppet/compare/f166cc5756e0...ae7167b68938 [22:41:08] [WikiTideOrg/puppet] Universal-Omega ae7167b - Fix IP [22:41:58] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:42:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 28% [22:42:44] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 37% [22:42:46] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy [22:43:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 58% [22:43:55] RECOVERY - cp5 HTTPS on cp5 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 7.973 second response time [22:44:04] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 7.887 second response time [22:44:38] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2952 bytes in 0.546 second response time [22:45:11] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:45:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 76% [22:46:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 97% [22:46:44] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 77% [22:47:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 51% [22:47:15] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 6.943 second response time [22:47:41] PROBLEM - mw2 HTTPS on mw2 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 28 - Operation timed out after 10004 milliseconds with 0 bytes received [22:47:55] PROBLEM - cp5 HTTPS on cp5 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.733 second response time [22:48:15] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL -
Socket timeout after 10 seconds
[22:48:52] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.249 second response time
[22:49:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 65%
[22:49:50] RECOVERY - mw2 HTTPS on mw2 is OK: HTTP OK: HTTP/2 200 - 362 bytes in 9.490 second response time
[22:50:13] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.157 second response time
[22:51:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 55%
[22:51:19] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:52:45] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 49%
[22:52:55] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 0.703 second response time
[22:53:46] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 3 backends are down. mw1 mw2 mediawiki
[22:54:35] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 39%
[22:55:27] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy
[22:56:44] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 64%
[22:57:37] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 7.422 second response time
[22:57:48] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.038 second response time
[22:58:32] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[22:59:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 68%
[22:59:27] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 3 backends are down. mw1 mw2 mediawiki
[22:59:43] RECOVERY - cp5 HTTPS on cp5 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 2.075 second response time
[22:59:44] RECOVERY - cp2 HTTPS on cp2 is OK: HTTP OK: HTTP/2 200 - 2996 bytes in 0.167 second response time
[23:00:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 44%
[23:00:35] PROBLEM - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is WARNING: WARNING - NGINX Error Rate is 40%
[23:00:37] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 6.822 second response time
[23:00:45] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 45%
[23:02:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 100%
[23:02:31] PROBLEM - mw1 Current Load on mw1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.101: Connection reset by peer
[23:02:31] PROBLEM - mw1 conntrack_table_size on mw1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.101: Connection reset by peer
[23:02:42] PROBLEM - mw1 Disk Space on mw1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.101: Connection reset by peer
[23:02:42] PROBLEM - mw1 php-fpm on mw1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.101: Connection reset by peer
[23:02:46] PROBLEM - mw1 nutcracker process on mw1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.101: Connection reset by peer
[23:03:05] PROBLEM - mw1 PowerDNS Recursor on mw1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.101: Connection reset by peer
[23:03:21] PROBLEM - mw1 NTP time on mw1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.101: Connection reset by peer
[23:03:43] PROBLEM - cp5 HTTPS on cp5 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.694 second response time
[23:03:50] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.105 second response time
[23:04:01] PROBLEM - cp3 HTTPS on cp3 is CRITICAL: HTTP CRITICAL: HTTP/2 503 - 2628 bytes in 0.253 second response time
[23:04:02] PROBLEM - mw1 ferm_active on mw1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.101: Connection reset by peer
[23:04:03] PROBLEM - mw1 nutcracker port on mw1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.101: Connection reset by peer
[23:04:27] PROBLEM - mw2 conntrack_table_size on mw2 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.102: Connection reset by peer
[23:04:35] PROBLEM - mw2 nutcracker port on mw2 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.102: Connection reset by peer
[23:04:35] RECOVERY - cp4 HTTP 4xx/5xx ERROR Rate on cp4 is OK: OK - NGINX Error Rate is 37%
[23:04:45] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is CRITICAL: CRITICAL - NGINX Error Rate is 61%
[23:04:49] PROBLEM - mw2 NTP time on mw2 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.102: Connection reset by peer
[23:05:10] PROBLEM - mw2 ferm_active on mw2 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.102: Connection reset by peer
[23:05:14] PROBLEM - mw2 nutcracker process on mw2 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.102: Connection reset by peer
[23:05:16] PROBLEM - mw2 php-fpm on mw2 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.102: Connection reset by peer
[23:05:26] PROBLEM - mw2 Disk Space on mw2 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.102: Connection reset by peer
[23:05:55] PROBLEM - mw2 PowerDNS Recursor on mw2 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.102: Connection reset by peer
[23:06:12] PROBLEM - mw2 Current Load on mw2 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.102: Connection reset by peer
[23:10:51] PROBLEM - wiki.chevrine.com - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['ns3.digitalocean.com.', 'ns1.digitalocean.com.', 'ns2.digitalocean.com.'], 'CNAME': 'mw-lb.wikitide.org.'}
[23:10:55] PROBLEM - hsck.lophocmatngu.wiki - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['lia.ns.cloudflare.com.', 'gerald.ns.cloudflare.com.'], 'CNAME': 'cf-lb.wikitide.org.'}
[23:11:03] PROBLEM - meta.sagan4.org - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['ns59.domaincontrol.com.', 'ns60.domaincontrol.com.'], 'CNAME': 'cf-lb.wikitide.org.'}
[23:11:04] PROBLEM - nexttide.org - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - nexttide.org All nameservers failed to answer the query.
[23:11:05] PROBLEM - newcascadia.net - LetsEncrypt on sslhost is CRITICAL: CRITICAL - Certificate 'newcascadia.net' expired on Tue 14 Nov 2023 11:52:59 PM GMT +0000.
[23:11:06] PROBLEM - www.lgbtqia.wiki - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['grannbo.ns.cloudflare.com.', 'rodrigo.ns.cloudflare.com.'], 'CNAME': 'cf-lb.wikitide.org.'}
[23:11:09] PROBLEM - wiki.myehs.eu - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['fred.ns.cloudflare.com.', 'ivy.ns.cloudflare.com.'], 'CNAME': 'cf-lb.wikitide.org.'}
[23:11:17] PROBLEM - wikitide.com - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for wikitide.com could not be found
[23:11:19] PROBLEM - data.lophocmatngu.wiki - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['lia.ns.cloudflare.com.', 'gerald.ns.cloudflare.com.'], 'CNAME': 'cf-lb.wikitide.org.'}
[23:11:26] PROBLEM - beta.sagan4.org - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['ns59.domaincontrol.com.', 'ns60.domaincontrol.com.'], 'CNAME': 'cf-lb.wikitide.org.'}
[23:11:27] PROBLEM - wiki.colleimadcat.com - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['dns30.hichina.com.', 'dns29.hichina.com.'], 'CNAME': 'cf-lb.wikitide.org.'}
[23:11:30] PROBLEM - www.polandballwiki.com - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['john.ns.cloudflare.com.', 'vida.ns.cloudflare.com.'], 'CNAME': 'mw-lb.wikitide.org.'}
[23:11:32] PROBLEM - mason.sagan4.org - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['ns59.domaincontrol.com.', 'ns60.domaincontrol.com.'], 'CNAME': 'cf-lb.wikitide.org.'}
[23:11:37] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query.
[23:11:39] PROBLEM - www.greatamerica.wiki - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['john.ns.cloudflare.com.', 'vida.ns.cloudflare.com.'], 'CNAME': 'mw-lb.wikitide.org.'}
[23:11:41] PROBLEM - alpha.sagan4.org - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['ns59.domaincontrol.com.', 'ns60.domaincontrol.com.'], 'CNAME': 'cf-lb.wikitide.org.'}
[23:11:43] PROBLEM - projects.dmvpetridish.com - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['ns1.wixdns.net.', 'ns0.wixdns.net.'], 'CNAME': 'mw-lb.wikitide.org.'}
[23:11:46] PROBLEM - mlrpgspeedruns.com - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - mlrpgspeedruns.com All nameservers failed to answer the query.
[23:13:39] PROBLEM - test1 php-fpm on test1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.107: Connection reset by peer
[23:13:48] PROBLEM - test1 nutcracker process on test1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.107: Connection reset by peer
[23:13:51] PROBLEM - test1 conntrack_table_size on test1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.107: Connection reset by peer
[23:13:52] PROBLEM - test1 Disk Space on test1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.107: Connection reset by peer
[23:14:08] PROBLEM - test1 JobChron Service on test1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.107: Connection reset by peer
[23:14:43] PROBLEM - test1 nutcracker port on test1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.107: Connection reset by peer
[23:14:50] PROBLEM - test1 PowerDNS Recursor on test1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.107: Connection reset by peer
[23:14:51] PROBLEM - test1 NTP time on test1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.107: Connection reset by peer
[23:15:15] PROBLEM - test1 Current Load on test1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.107: Connection reset by peer
[23:15:16] PROBLEM - test1 ferm_active on test1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.107: Connection reset by peer
[23:15:18] PROBLEM - test1 poolcounter process on test1 is CRITICAL: CHECK_NRPE: Error - Could not connect to 10.0.0.107: Connection reset by peer
[23:21:52] RECOVERY - cp2 HTTPS on cp2 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 1.717 second response time
[23:21:57] RECOVERY - cp3 HTTPS on cp3 is OK: HTTP OK: HTTP/2 200 - 2974 bytes in 1.319 second response time
[23:22:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 52%
[23:22:45] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 41%
[23:22:56] RECOVERY - cp5 HTTPS on cp5 is OK: HTTP OK: HTTP/2 200 - 2996 bytes in 2.303 second response time
[23:23:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 42%
[23:24:45] PROBLEM - mw2 MediaWiki Rendering on mw2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:25:15] RECOVERY - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is OK: OK - NGINX Error Rate is 35%
[23:25:26] PROBLEM - mw1 MediaWiki Rendering on mw1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[23:25:27] RECOVERY - cp2 Varnish Backends on cp2 is OK: All 7 backends are healthy
[23:26:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 65%
[23:26:22] RECOVERY - cp3 Varnish Backends on cp3 is OK: All 7 backends are healthy
[23:26:43] PROBLEM - mw2 MediaWiki Rendering on mw2 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.181 second response time
[23:26:44] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 25%
[23:27:07] RECOVERY - cp5 Varnish Backends on cp5 is OK: All 7 backends are healthy
[23:27:25] PROBLEM - mw1 MediaWiki Rendering on mw1 is WARNING: HTTP WARNING: HTTP/1.1 404 Not Found - 8191 bytes in 0.126 second response time
[23:30:33] PROBLEM - newcascadia.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - newcascadia.net All nameservers failed to answer the query.
[23:31:40] [WikiTideOrg/mw-config] Universal-Omega pushed 1 commit to main [+0/-0/±1] https://github.com/WikiTideOrg/mw-config/compare/182f6901c0d0...3cedae96a65f
[23:31:42] [WikiTideOrg/mw-config] Universal-Omega 3cedae9 - Use db1
[23:32:35] WikiTideOrg/mw-config - Universal-Omega the build passed.
[23:33:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 59%
[23:33:30] [WikiTideOrg/mw-config] Universal-Omega pushed 1 commit to main [+0/-0/±1] https://github.com/WikiTideOrg/mw-config/compare/3cedae96a65f...0fefbfc99bb2
[23:33:32] [WikiTideOrg/mw-config] Universal-Omega 0fefbfc - Update servers
[23:34:12] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 59%
[23:34:26] WikiTideOrg/mw-config - Universal-Omega the build passed.
[23:34:54] [WikiTideOrg/mw-config] Universal-Omega pushed 1 commit to main [+0/-0/±1] https://github.com/WikiTideOrg/mw-config/compare/0fefbfc99bb2...1688b49ce607
[23:34:57] [WikiTideOrg/mw-config] Universal-Omega 1688b49 - Update servers
[23:35:45] WikiTideOrg/mw-config - Universal-Omega the build passed.
[23:36:27] [WikiTideOrg/mw-config] Universal-Omega pushed 1 commit to main [+0/-0/±1] https://github.com/WikiTideOrg/mw-config/compare/1688b49ce607...96768d80a999
[23:36:30] [WikiTideOrg/mw-config] Universal-Omega 96768d8 - Update mem server
[23:37:22] WikiTideOrg/mw-config - Universal-Omega the build passed.
[23:38:45] [WikiTideOrg/puppet] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/WikiTideOrg/puppet/compare/ae7167b68938...5fd423a3f916
[23:38:46] [WikiTideOrg/puppet] Universal-Omega 5fd423a - Fix mem and add bast1
[23:40:12] RECOVERY - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is OK: OK - NGINX Error Rate is 23%
[23:40:44] [WikiTideOrg/puppet] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/WikiTideOrg/puppet/compare/5fd423a3f916...e7e03bf59723
[23:40:47] [WikiTideOrg/puppet] Universal-Omega e7e03bf - Use swiftproxy1
[23:41:47] [WikiTideOrg/puppet] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/WikiTideOrg/puppet/compare/e7e03bf59723...e616bc1234e2
[23:41:49] [WikiTideOrg/puppet] Universal-Omega e616bc1 - Fix IP
[23:42:50] [WikiTideOrg/puppet] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/WikiTideOrg/puppet/compare/e616bc1234e2...cd6dc271d99d
[23:42:53] [WikiTideOrg/puppet] Universal-Omega cd6dc27 - Fix IP
[23:44:18] [WikiTideOrg/puppet] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/WikiTideOrg/puppet/compare/cd6dc271d99d...c98ac52f0e17
[23:44:19] [WikiTideOrg/puppet] Universal-Omega c98ac52 - Fix IP
[23:45:11] [WikiTideOrg/puppet] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/WikiTideOrg/puppet/compare/c98ac52f0e17...4ed62e68c939
[23:45:13] [WikiTideOrg/puppet] Universal-Omega 4ed62e6 - Fix IP
[23:47:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 63%
[23:49:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is WARNING: WARNING - NGINX Error Rate is 49%
[23:50:47] PROBLEM - cp5 Varnish Backends on cp5 is CRITICAL: 1 backends are down. mw1
[23:51:01] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 42%
[23:51:15] PROBLEM - cp2 HTTP 4xx/5xx ERROR Rate on cp2 is CRITICAL: CRITICAL - NGINX Error Rate is 80%
[23:51:45] PROBLEM - cp2 Varnish Backends on cp2 is CRITICAL: 3 backends are down. mw1 mw2 mediawiki
[23:52:05] PROBLEM - cp3 Varnish Backends on cp3 is CRITICAL: 1 backends are down. mw1
[23:53:33] RECOVERY - ns1 SSH on ns1 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u2 (protocol 2.0)
[23:53:55] PROBLEM - cp2 HTTPS on cp2 is CRITICAL: HTTP CRITICAL - Invalid HTTP response received from host on port 443: cURL returned 7 - Failed to connect to cp2.wikitide.net port 443 after 18 ms: Couldn't connect to server
[23:54:49] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is CRITICAL: CRITICAL - NGINX Error Rate is 60%
[23:55:43] PROBLEM - cp2 Puppet on cp2 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 1 minute ago with 1 failures. Failed resources (up to 3 shown): Service[nginx]
[23:56:22] PROBLEM - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is WARNING: WARNING - NGINX Error Rate is 59%
[23:56:44] PROBLEM - cp3 HTTP 4xx/5xx ERROR Rate on cp3 is WARNING: WARNING - NGINX Error Rate is 43%
[23:58:18] RECOVERY - cp5 HTTP 4xx/5xx ERROR Rate on cp5 is OK: OK - NGINX Error Rate is 22%