[00:00:02] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS7mC [00:00:04] [miraheze/puppet] paladox b05b2ee - Update mw122.yaml [00:00:05] [puppet] paladox synchronize pull request #2272: base::syslog: force hostname to resolve to ipv6 - https://git.io/JS7th [00:00:12] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS7mW [00:00:13] PROBLEM - db12 Disk Space on db12 is WARNING: DISK WARNING - free space: / 27584 MB (6% inode=97%); [00:00:13] [miraheze/puppet] paladox 5301d05 - Update mwtask111.yaml [00:00:15] [puppet] paladox synchronize pull request #2272: base::syslog: force hostname to resolve to ipv6 - https://git.io/JS7th [00:00:25] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS7m8 [00:00:26] [miraheze/puppet] paladox 5f3937c - Update test101.yaml [00:00:28] [puppet] paladox synchronize pull request #2272: base::syslog: force hostname to resolve to ipv6 - https://git.io/JS7th [00:00:33] [puppet] paladox closed pull request #2272: base::syslog: force hostname to resolve to ipv6 - https://git.io/JS7th [00:00:34] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±9] https://git.io/JS7m4 [00:00:36] [miraheze/puppet] paladox b716031 - base::syslog: force hostname to resolve to ipv6 (#2272) [00:00:37] [puppet] paladox deleted branch paladox-patch-5 - https://git.io/vbiAS [00:00:39] [miraheze/puppet] paladox deleted branch paladox-patch-5 [00:01:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.51, 7.12, 7.23 [00:01:40] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JS7m6 [00:01:42] [miraheze/puppet] paladox 5e97158 - Fix [00:02:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.68, 21.70, 19.67 [00:02:58] RECOVERY - mw12
Current Load on mw12 is OK: OK - load average: 6.76, 6.79, 6.72 [00:03:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 10.03, 7.65, 7.41 [00:03:23] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.72, 7.18, 6.57 [00:03:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::5ebc/cpweb [00:03:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [00:03:50] PROBLEM - mw121 Puppet on mw121 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:03:53] PROBLEM - mwtask111 Puppet on mwtask111 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:03:54] PROBLEM - mw101 Puppet on mw101 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:04:04] PROBLEM - mw122 Puppet on mw122 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:04:17] PROBLEM - graylog2 Puppet on graylog2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:04:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.14, 21.02, 19.67 [00:04:22] PROBLEM - mem121 Puppet on mem121 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[00:04:28] PROBLEM - mw102 Puppet on mw102 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:04:45] PROBLEM - mw112 Puppet on mw112 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:05:16] PROBLEM - bast101 Puppet on bast101 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:05:17] PROBLEM - mw111 Puppet on mw111 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:05:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [00:05:42] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.65, 3.53, 3.16 [00:07:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.28, 6.90, 7.19 [00:07:20] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 4.70, 6.38, 6.42 [00:07:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [00:07:37] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 3.28, 3.39, 3.14 [00:07:40] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 4.04, 5.85, 6.71 [00:09:54] RECOVERY - mw101 Puppet on mw101 is OK: OK: Puppet is currently enabled, last run 6 seconds ago with 0 failures [00:11:02] PROBLEM - db111 Puppet on db111 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[00:11:14] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.23, 6.01, 6.75 [00:11:32] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JS7Y2 [00:11:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.12, 3.59, 3.30 [00:11:33] [miraheze/puppet] paladox 59819f0 - base::syslog: Fix defining ip-protocol [00:11:42] PROBLEM - cloud12 Puppet on cloud12 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:11:51] RECOVERY - mw121 Puppet on mw121 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [00:12:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.29, 22.39, 20.64 [00:12:19] PROBLEM - cloud10 Puppet on cloud10 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:12:40] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [00:12:59] PROBLEM - mw11 Puppet on mw11 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:13:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.55, 6.88, 6.62 [00:13:08] PROBLEM - mwtask1 Puppet on mwtask1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[00:13:12] [mw-config] paladox closed pull request #4345: Log to graylog on new mw cluster - https://git.io/JSQha [00:13:14] [mw-config] paladox deleted branch paladox-patch-1 - https://git.io/vbvb3 [00:13:15] [miraheze/mw-config] paladox deleted branch paladox-patch-1 [00:13:17] [miraheze/mw-config] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JS7Y1 [00:13:18] [miraheze/mw-config] paladox bab055d - Log to graylog on new mw cluster (#4345) [00:13:29] PROBLEM - mw8 Puppet on mw8 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:13:39] !log [paladox@mwtask111] starting deploy of {'config': True} to scsvg [00:13:46] !log [paladox@mwtask111] finished deploy of {'config': True} to scsvg - SUCCESS in 7s [00:14:00] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:14:12] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:14:13] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:14:14] !log [paladox@test101] starting deploy of {'config': True} to skip [00:14:14] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [00:14:15] !log [paladox@test101] finished deploy of {'config': True} to skip - SUCCESS in 0s [00:14:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 23.11, 22.16, 20.75 [00:14:27] miraheze/mw-config - paladox the build passed. [00:14:28] RECOVERY - mw102 Puppet on mw102 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [00:14:36] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:14:40] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:14:45] RECOVERY - mw112 Puppet on mw112 is OK: OK: Puppet is currently enabled, last run 29 seconds ago with 0 failures [00:14:47] ...
[00:14:54] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:14:56] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:15:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.53, 5.89, 6.65 [00:15:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.07, 6.99, 6.67 [00:15:08] PROBLEM - cloud4 Puppet on cloud4 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:15:15] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:15:17] RECOVERY - mw111 Puppet on mw111 is OK: OK: Puppet is currently enabled, last run 59 seconds ago with 0 failures [00:15:48] PROBLEM - graylog121 Current Load on graylog121 is CRITICAL: CRITICAL - load average: 2.77, 2.60, 1.37 [00:15:53] RECOVERY - mwtask111 Puppet on mwtask111 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:16:04] RECOVERY - mw122 Puppet on mw122 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:16:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.84, 24.45, 21.81 [00:16:30] PROBLEM - jobchron1 Puppet on jobchron1 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[00:17:19] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.57, 3.24, 3.28 [00:17:43] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.40, 6.93, 6.66 [00:17:46] PROBLEM - graylog121 Current Load on graylog121 is WARNING: WARNING - load average: 0.88, 1.97, 1.29 [00:17:54] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS7Ot [00:17:56] [miraheze/puppet] paladox b8e4812 - Make new dbs use syslog-ng [00:17:57] [puppet] paladox created branch paladox-patch-5 - https://git.io/vbiAS [00:17:59] [puppet] paladox opened pull request #2273: Make new dbs use syslog-ng - https://git.io/JS7Oq [00:18:16] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS7OY [00:18:17] [miraheze/puppet] paladox fa3e8ae - Update db111.yaml [00:18:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.59, 23.61, 21.85 [00:18:19] [puppet] paladox synchronize pull request #2273: Make new dbs use syslog-ng - https://git.io/JS7Oq [00:18:23] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [00:18:29] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS7OO [00:18:30] [miraheze/puppet] paladox 6af13c2 - Update db121.yaml [00:18:32] [puppet] paladox synchronize pull request #2273: Make new dbs use syslog-ng - https://git.io/JS7Oq [00:18:54] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:18:59] [puppet] paladox closed pull request #2273: Make new dbs use syslog-ng - https://git.io/JS7Oq [00:19:00] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±3] https://git.io/JS7Oc [00:19:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.37, 6.59, 6.75 [00:19:02] [miraheze/puppet] paladox cff5651 - Make new dbs use syslog-ng (#2273) [00:19:03] [puppet] paladox deleted branch paladox-patch-5 - https://git.io/vbiAS [00:19:05] [miraheze/puppet] paladox deleted branch paladox-patch-5 [00:19:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.35, 7.52, 7.00 [00:19:08] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [00:19:19] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:19:22] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.246 second response time [00:19:26] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:19:29] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:19:34] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS7O8 [00:19:35] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[00:19:36] [miraheze/puppet] paladox 00d1c98 - Make puppet server use syslog-ng [00:19:37] [puppet] paladox created branch paladox-patch-5 - https://git.io/vbiAS [00:19:39] [puppet] paladox opened pull request #2274: Make puppet server use syslog-ng - https://git.io/JS7O4 [00:19:41] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.37, 6.34, 6.48 [00:19:45] RECOVERY - graylog121 Current Load on graylog121 is OK: OK - load average: 0.56, 1.50, 1.20 [00:19:46] [puppet] paladox closed pull request #2274: Make puppet server use syslog-ng - https://git.io/JS7O4 [00:19:48] [miraheze/puppet] paladox deleted branch paladox-patch-5 [00:19:49] [puppet] paladox deleted branch paladox-patch-5 - https://git.io/vbiAS [00:19:51] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JS7OB [00:19:52] [miraheze/puppet] paladox e1add0f - Make puppet server use syslog-ng (#2274) [00:20:16] PROBLEM - bacula2 Puppet on bacula2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:20:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 33.52, 27.28, 23.37 [00:20:51] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.369 second response time [00:20:59] [miraheze/mediawiki] Universal-Omega pushed 1 commit to REL1_37 [+1/-0/±1] https://git.io/JS7Oo [00:21:01] [miraheze/mediawiki] Universal-Omega 1098044 - Install StandardDialogs [00:21:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 6.29, 6.57, 6.74 [00:21:02] RECOVERY - db111 Puppet on db111 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:21:16] PROBLEM - cloud3 Puppet on cloud3 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle.
[00:21:18] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.582 second response time [00:21:18] PROBLEM - db12 Puppet on db12 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:21:21] PROBLEM - mw13 Puppet on mw13 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:21:28] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 9.337 second response time [00:21:30] PROBLEM - db11 Puppet on db11 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:21:44] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JS7O1 [00:21:45] [miraheze/puppet] paladox 613046d - make phab121 use syslog-ng [00:21:47] PROBLEM - cp31 Puppet on cp31 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:21:50] PROBLEM - mw9 Puppet on mw9 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [00:22:29] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS7O5 [00:22:29] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.45, 6.96, 6.66 [00:22:30] [miraheze/puppet] paladox f2ce0cd - Make new mem servers use syslog-ng [00:22:31] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:22:31] [puppet] paladox created branch paladox-patch-5 - https://git.io/vbiAS [00:22:33] [puppet] paladox opened pull request #2275: Make new mem servers use syslog-ng - https://git.io/JS7Od [00:22:34] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:22:40] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:22:49] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.46, 5.75, 4.69 [00:22:49] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS7OA [00:22:50] [miraheze/puppet] paladox acb69fd - Update mem121.yaml [00:22:52] [puppet] paladox synchronize pull request #2275: Make new mem servers use syslog-ng - https://git.io/JS7Od [00:22:57] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:22:59] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±2] https://git.io/JS7Ox [00:23:01] [miraheze/puppet] paladox bab5fbf - Make new mem servers use syslog-ng (#2275) [00:23:02] [puppet] paladox closed pull request #2275: Make new mem servers use syslog-ng - https://git.io/JS7Od [00:23:04] [puppet] paladox deleted branch paladox-patch-5 - https://git.io/vbiAS [00:23:05] [miraheze/puppet] paladox deleted branch paladox-patch-5 [00:23:06] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.06, 7.36, 7.01 [00:23:10] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:23:33] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JS73e [00:23:33] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.67, 7.46, 6.93 [00:23:35] [miraheze/puppet] paladox 1a7de80 - Make ldap111 use syslog-ng [00:23:36] [miraheze/mediawiki] Universal-Omega pushed 1 commit to REL1_37 [+1/-0/±1] https://git.io/JS73v [00:23:38] [miraheze/mediawiki] Universal-Omega bf7f75c - Install Patroller [00:23:58] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JS73T [00:24:00] [miraheze/puppet] paladox 9dcf82c - Make jobchron121 use syslog-ng [00:24:08] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.29, 3.71, 3.51 [00:24:12] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [00:24:27] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 4.50, 6.16, 6.41 [00:24:29] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 1.202 second response time [00:24:37] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 6.737 second response time [00:24:45] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 6.602 second response time [00:24:50] !log [@test3] finished deploy of {'config': True} to skip - SUCCESS in 0s [00:25:02] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.14, 7.47, 7.11 [00:25:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.71, 7.20, 6.98 [00:25:14] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JS733 [00:25:16] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:25:16] [miraheze/puppet] paladox 0582cf3 - Make mail121 use syslog-ng [00:25:31] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.45, 8.13, 7.24 [00:25:40] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.011 second response time [00:25:46] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.253 second response time [00:25:52] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS73C [00:25:54] [miraheze/puppet] paladox 829f305 - Make new gluster servers use syslog-ng [00:25:55] [puppet] paladox created branch paladox-patch-5 - https://git.io/vbiAS [00:25:57] [puppet] paladox opened pull request #2276: Make new gluster servers use syslog-ng - https://git.io/JS73W [00:26:15] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS738 [00:26:17] [miraheze/puppet] paladox a6ebf28 - Update gluster111.yaml [00:26:18] [puppet] paladox synchronize pull request #2276: Make new gluster servers use syslog-ng - https://git.io/JS73W [00:26:22] RECOVERY - mem121 Puppet on mem121 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:26:26] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JS73B [00:26:28] [miraheze/puppet] paladox 6bacc45 - Update gluster121.yaml [00:26:29] [puppet] paladox synchronize pull request #2276: Make new gluster servers use syslog-ng - https://git.io/JS73W [00:26:32] [puppet] paladox closed pull request #2276: Make new gluster servers use syslog-ng - https://git.io/JS73W [00:26:34] [puppet] paladox deleted branch paladox-patch-5 - https://git.io/vbiAS [00:26:36] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±3] https://git.io/JS73R [00:26:37] [miraheze/puppet] paladox
f2f206e - Make new gluster servers use syslog-ng (#2276) [00:26:38] [miraheze/puppet] paladox deleted branch paladox-patch-5 [00:26:39] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:26:43] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [00:26:44] [puppet] paladox closed pull request #2271: dns: Support listening on ipv6 and update some other dns related stuff - https://git.io/JS7T7 [00:26:46] [puppet] paladox deleted branch paladox-patch-4 - https://git.io/vbiAS [00:26:47] [miraheze/puppet] paladox deleted branch paladox-patch-4 [00:27:00] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.339 second response time [00:27:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.89, 8.03, 7.35 [00:27:05] PROBLEM - cp31 Varnish Backends on cp31 is CRITICAL: 1 backends are down. mw9 [00:27:05] [miraheze/mediawiki] Universal-Omega pushed 1 commit to REL1_37 [+1/-0/±1] https://git.io/JS732 [00:27:06] [miraheze/mediawiki] Universal-Omega 8c61ff9 - Install StructuredNavigation [00:27:09] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:27:11] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:27:12] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.327 second response time [00:27:13] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.348 second response time [00:27:20] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:27:26] PROBLEM - cp20 Varnish Backends on cp20 is CRITICAL: 1 backends are down.
mw9 [00:27:28] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.12, 7.14, 6.97 [00:27:53] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:28:00] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.38, 6.85, 6.56 [00:28:00] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:28:08] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::5ebc/cpweb [00:28:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 18.57, 22.87, 23.05 [00:28:21] PROBLEM - cp21 Stunnel Http for mw11 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:28:21] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 9.10, 7.38, 6.83 [00:28:22] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[00:28:30] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:28:37] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JS73D [00:28:39] [miraheze/puppet] paladox 34bf0c9 - Make mon111 use syslog-ng [00:28:44] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.03, 4.50, 4.50 [00:29:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 5.87, 6.85, 6.98 [00:29:02] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.96, 6.86, 6.14 [00:29:03] RECOVERY - cp31 Varnish Backends on cp31 is OK: All 18 backends are healthy [00:29:04] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.017 second response time [00:29:06] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.433 second response time [00:29:18] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.714 second response time [00:29:26] RECOVERY - cp20 Varnish Backends on cp20 is OK: All 18 backends are healthy [00:29:27] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JS737 [00:29:29] [miraheze/puppet] paladox 08e6985 - Make bastions use syslog-ng [00:29:30] [puppet] paladox created branch paladox-patch-4 - https://git.io/vbiAS [00:29:32] [puppet] paladox opened pull request #2277: Make bastions use syslog-ng - https://git.io/JS735 [00:29:44] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JS73d [00:29:45] [miraheze/puppet] paladox 48643f1 - Update bast121.yaml [00:29:47] [puppet] paladox synchronize pull request #2277: Make bastions use syslog-ng - https://git.io/JS735 [00:29:51] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.399 second response time
[00:29:53] [puppet] paladox closed pull request #2277: Make bastions use syslog-ng - https://git.io/JS735 [00:29:54] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±2] https://git.io/JS73F [00:29:56] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.28, 3.73, 3.55 [00:29:56] [miraheze/puppet] paladox 2e376c2 - Make bastions use syslog-ng (#2277) [00:29:57] [puppet] paladox deleted branch paladox-patch-4 - https://git.io/vbiAS [00:29:59] [miraheze/puppet] paladox deleted branch paladox-patch-4 [00:30:00] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.143 second response time [00:30:16] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JS73h [00:30:17] [miraheze/puppet] paladox 6025b84 - Make es cluster use syslog-ng [00:30:17] RECOVERY - cp21 Stunnel Http for mw11 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.008 second response time [00:30:19] [puppet] paladox created branch paladox-patch-4 - https://git.io/vbiAS [00:30:19] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.27, 7.01, 6.77 [00:30:20] [puppet] paladox opened pull request #2278: Make es cluster use syslog-ng - https://git.io/JS7se [00:30:20] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.357 second response time [00:30:25] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.321 second response time [00:30:32] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [00:30:33] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JS7sT [00:30:35] [miraheze/puppet] paladox ca3a72f - Update es111.yaml [00:30:36] [puppet] paladox synchronize pull request #2278: Make es cluster use syslog-ng -
https://git.io/JS7se [00:30:44] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JS7sL [00:30:46] [miraheze/puppet] paladox d23f270 - Update es121.yaml [00:30:47] [puppet] paladox synchronize pull request #2278: Make es cluster use syslog-ng - https://git.io/JS7se [00:31:00] [puppet] paladox closed pull request #2278: Make es cluster use syslog-ng - https://git.io/JS7se [00:31:01] [miraheze/puppet] paladox deleted branch paladox-patch-4 [00:31:03] [puppet] paladox deleted branch paladox-patch-4 - https://git.io/vbiAS [00:31:04] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±3] https://git.io/JS7sY [00:31:06] [miraheze/puppet] paladox 7019d20 - Make es cluster use syslog-ng (#2278) [00:31:20] [miraheze/mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://git.io/JS7ss [00:31:21] [miraheze/mw-config] Universal-Omega eb63bb8 - add three extensions to extension-list [00:31:23] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.40, 6.49, 6.75 [00:31:47] !log [universalomega@test3] starting deploy of {'world': True} to skip [00:31:51] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.62, 3.74, 3.58 [00:31:52] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.54, 6.34, 6.44 [00:32:15] RECOVERY - graylog2 Puppet on graylog2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:32:15] !log [universalomega@mw11] starting deploy of {'world': True} to all [00:32:28] miraheze/mw-config - Universal-Omega the build passed.
[00:32:31] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:32:52] !log [universalomega@test101] starting deploy of {'world': True, 'proxy': True} to skip [00:32:53] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:32:55] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [00:32:58] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.86, 7.59, 6.57 [00:33:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.78, 6.14, 6.63 [00:33:16] RECOVERY - bast101 Puppet on bast101 is OK: OK: Puppet is currently enabled, last run 22 seconds ago with 0 failures [00:33:44] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:33:44] !log [universalomega@mwtask111] starting deploy of {'world': True, 'proxy': True} to scsvg [00:34:05] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:34:06] ... [00:34:15] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.11, 7.47, 6.97 [00:34:15] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:34:20] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2607:5300:201:3100::929a/cpweb [00:34:26] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:34:45] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:34:50] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:34:55] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[00:34:57] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.91, 7.80, 6.78 [00:35:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 4.18, 6.14, 6.74 [00:35:06] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:35:12] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:35:13] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:35:16] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:35:28] PROBLEM - mw13 MediaWiki Rendering on mw13 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:35:33] PROBLEM - cp21 Stunnel Http for mw13 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:35:50] !log [universalomega@test3] finished deploy of {'world': True} to skip - SUCCESS in 243s [00:35:53] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [00:36:06] !log [universalomega@test101] finished deploy of {'world': True, 'proxy': True} to skip - SUCCESS in 193s [00:36:09] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:36:13] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.62, 6.58, 6.69 [00:36:13] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [00:36:14] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.678 second response time [00:36:18] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 10.86, 15.61, 19.68 [00:36:25] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.350 second response time [00:36:39] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:36:40] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [00:36:47] Logged the message at 
https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:36:49] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.080 second response time [00:36:52] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.663 second response time [00:36:56] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:37:06] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.010 second response time [00:37:26] RECOVERY - mw13 MediaWiki Rendering on mw13 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.254 second response time [00:37:32] RECOVERY - cp21 Stunnel Http for mw13 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14571 bytes in 0.335 second response time [00:37:39] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.22, 4.18, 3.78 [00:38:08] !log [universalomega@mwtask111] finished deploy of {'world': True, 'proxy': True} to scsvg - SUCCESS in 263s [00:38:13] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:38:14] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:38:54] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 4.93, 6.53, 6.51 [00:39:35] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.77, 3.99, 3.76 [00:39:42] RECOVERY - cloud12 Puppet on cloud12 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:39:57] !log [universalomega@mw11] finished deploy of {'world': True} to all - SUCCESS in 462s [00:40:19] RECOVERY - cloud10 Puppet on cloud10 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:40:30] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:40:33] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:40:34] [url] Tech:Server admin log - Miraheze Meta 
| meta.miraheze.org [00:41:08] RECOVERY - mwtask1 Puppet on mwtask1 is OK: OK: Puppet is currently enabled, last run 14 seconds ago with 0 failures [00:42:59] RECOVERY - mw11 Puppet on mw11 is OK: OK: Puppet is currently enabled, last run 34 seconds ago with 0 failures [00:43:08] RECOVERY - cloud4 Puppet on cloud4 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [00:43:29] RECOVERY - mw8 Puppet on mw8 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:44:30] RECOVERY - jobchron1 Puppet on jobchron1 is OK: OK: Puppet is currently enabled, last run 12 seconds ago with 0 failures [00:47:34] !log [@test101] starting deploy of {'config': True} to skip [00:47:35] !log [@test101] finished deploy of {'config': True} to skip - SUCCESS in 0s [00:47:57] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:48:02] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:48:04] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [00:48:16] RECOVERY - bacula2 Puppet on bacula2 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [00:48:27] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.49, 6.87, 6.67 [00:49:15] RECOVERY - cloud3 Puppet on cloud3 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:49:18] RECOVERY - db12 Puppet on db12 is OK: OK: Puppet is currently enabled, last run 39 seconds ago with 0 failures [00:49:22] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:49:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:49:28] ... 
[00:49:30] RECOVERY - db11 Puppet on db11 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [00:49:35] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:50:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 23.78, 20.30, 19.51 [00:50:24] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.24, 7.44, 6.89 [00:50:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 6 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [00:51:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.99, 7.22, 6.53 [00:51:15] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:51:20] RECOVERY - mw13 Puppet on mw13 is OK: OK: Puppet is currently enabled, last run 50 seconds ago with 0 failures [00:51:31] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:51:33] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb [00:51:39] !log [@mwtask111] starting deploy of {'config': True} to scsvg [00:51:45] !log [@mwtask111] finished deploy of {'config': True} to scsvg - SUCCESS in 6s [00:51:46] RECOVERY - cp31 Puppet on cp31 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:51:50] RECOVERY - mw9 Puppet on mw9 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:52:06] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[00:52:09] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:52:16] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:52:17] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [00:52:20] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.83, 6.81, 6.73 [00:52:26] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:52:34] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:52:35] ... [00:52:42] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:52:51] !log [universalomega@test101] starting deploy of {'l10nupdate': True} to skip [00:53:06] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.20, 6.50, 6.36 [00:53:11] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 1.034 second response time [00:53:21] !log [universalomega@mwtask111] starting deploy of {'l10nupdate': True} to scsvg [00:53:29] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.007 second response time [00:54:01] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:54:04] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.006 second response time [00:54:06] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.317 second response time [00:54:10] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:54:19] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 17.69, 19.13, 19.24 [00:54:53] !log [universalomega@mw11] starting deploy of {'l10n': True} to all [00:55:04] !log [@test3] starting deploy of {'config': True} to skip [00:55:05] !log [@test3] finished deploy of {'config': True} to skip - SUCCESS 
in 0s [00:55:23] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:55:29] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [00:55:50] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:56:08] [miraheze/mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://git.io/JS7Zj [00:56:10] [miraheze/mw-config] Universal-Omega ebbd889 - remove StructuredNavigation from extension-list [00:56:24] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:56:33] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:56:43] !log [universalomega@mw11] starting deploy of {'pull': 'config', 'config': True} to all [00:56:43] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:56:56] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [00:56:57] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.90, 4.16, 3.85 [00:57:09] miraheze/mw-config - Universal-Omega the build passed. [00:57:19] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:57:26] !log [universalomega@mw11] finished deploy of {'pull': 'config', 'config': True} to all - SUCCESS in 43s [00:57:28] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:57:33] !log [universalomega@mw11] starting deploy of {'l10n': True} to all [00:57:38] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [00:57:44] PROBLEM - mw13 MediaWiki Rendering on mw13 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:58:01] PROBLEM - cp21 Stunnel Http for mw13 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[00:58:07] !log [universalomega@test101] starting deploy of {'l10n': True} to skip [00:58:19] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:58:23] !log [universalomega@test3] starting deploy of {'l10n': True} to skip [00:58:35] !log [universalomega@mwtask111] starting deploy of {'l10n': True} to scsvg [00:58:48] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:58:53] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.40, 3.58, 3.68 [00:59:15] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:00:06] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.13, 6.66, 6.74 [01:00:46] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.65, 7.26, 6.50 [01:00:49] PROBLEM - cp31 Stunnel Http for mw8 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:00:56] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:01:06] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [01:01:15] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:02:29] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:02:46] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.556 second response time [01:02:48] ... 
[01:02:48] RECOVERY - cp31 Stunnel Http for mw8 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.312 second response time [01:02:56] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:02:56] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 6.835 second response time [01:03:09] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 7.782 second response time [01:03:41] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:03:44] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:04:04] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 11.91, 8.44, 7.36 [01:04:15] !log [universalomega@test3] finished deploy of {'l10n': True} to skip - SUCCESS in 352s [01:04:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 9.26, 7.50, 6.59 [01:04:36] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:04:46] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:04:59] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:05:00] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [01:05:04] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:05:04] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 10.01, 7.67, 6.78 [01:05:08] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[01:05:10] !log [universalomega@test101] finished deploy of {'l10n': True} to skip - SUCCESS in 422s [01:05:30] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:05:30] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:05:47] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:05:48] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 7.972 second response time [01:05:49] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 7.935 second response time [01:06:06] RECOVERY - mw13 MediaWiki Rendering on mw13 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.430 second response time [01:06:11] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:06:15] RECOVERY - cp21 Stunnel Http for mw13 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.715 second response time [01:06:41] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.76, 7.35, 6.93 [01:06:44] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:07:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 5.93, 7.11, 6.70 [01:07:43] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:07:44] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [01:07:56] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:08:37] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:08:38] ... 
[01:08:39] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 3.79, 6.07, 6.51 [01:08:52] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:09:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.18, 6.18, 6.39 [01:09:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [01:09:34] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 5.074 second response time [01:09:49] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:09:53] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 4.73, 7.08, 7.22 [01:09:58] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 3.524 second response time [01:10:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:10:33] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.58, 7.90, 7.20 [01:10:34] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.10, 3.98, 3.82 [01:11:02] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 5.745 second response time [01:11:17] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.24, 4.24, 3.61 [01:12:25] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:12:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.55, 3.58, 3.70 [01:12:50] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:13:02] PROBLEM - gluster4 Current Load on gluster4 is WARNING: WARNING - load average: 5.51, 4.30, 2.82 [01:13:02] PROBLEM - cp21 Stunnel Http for mw11 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[01:13:16] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 12.10, 6.84, 4.62 [01:13:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [01:13:38] PROBLEM - cloud5 Current Load on cloud5 is WARNING: WARNING - load average: 19.78, 21.64, 19.48 [01:13:46] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.44, 6.01, 6.76 [01:13:52] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:13:54] !log [universalomega@mwtask111] finished deploy of {'l10n': True} to scsvg - SUCCESS in 919s [01:14:17] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:14:21] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:14:30] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 3.70, 5.73, 6.47 [01:14:53] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.758 second response time [01:15:01] PROBLEM - cp30 Stunnel Http for mw11 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:15:02] RECOVERY - gluster4 Current Load on gluster4 is OK: OK - load average: 4.45, 3.96, 2.85 [01:15:27] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[01:15:27] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:15:28] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:15:31] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 31.43, 25.79, 21.58 [01:15:45] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:15:47] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.179 second response time [01:16:01] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [01:16:10] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 7.117 second response time [01:16:19] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.017 second response time [01:16:59] RECOVERY - cp30 Stunnel Http for mw11 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 1.104 second response time [01:17:07] !log [universalomega@mw11] finished deploy of {'l10n': True} to all - SUCCESS in 1174s [01:17:09] RECOVERY - cp21 Stunnel Http for mw11 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.005 second response time [01:17:23] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.310 second response time [01:17:24] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.114 second response time [01:17:33] RECOVERY - cloud5 Current Load on cloud5 is OK: OK - load average: 17.71, 19.98, 19.30 [01:17:35] !log [@test101] starting deploy of {'config': True} to skip [01:17:35] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.650 second response time [01:17:36] !log [@test101] finished deploy of {'config': True} to skip - SUCCESS in 0s [01:17:36] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK 
- 14557 bytes in 2.281 second response time [01:17:37] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:17:40] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.937 second response time [01:17:43] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.022 second response time [01:18:05] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.796 second response time [01:18:20] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:18:24] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [01:18:25] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:18:25] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.327 second response time [01:18:26] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:18:40] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:19:14] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[01:19:20] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.99, 23.58, 21.66 [01:19:30] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.98, 7.18, 6.59 [01:20:02] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:20:21] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.28, 6.33, 6.19 [01:20:25] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.261 second response time [01:20:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.32, 2.96, 3.38 [01:21:14] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.444 second response time [01:21:16] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 2.98, 5.18, 4.97 [01:21:24] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.11, 7.80, 6.89 [01:21:45] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.021 second response time [01:21:54] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:22:02] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 1.161 second response time [01:22:12] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [01:22:15] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[01:22:20] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.347 second response time [01:22:22] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.67, 6.57, 6.42 [01:22:29] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.434 second response time [01:22:30] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.006 second response time [01:22:48] PROBLEM - cp20 Varnish Backends on cp20 is CRITICAL: 1 backends are down. mw13 [01:22:50] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:23:16] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 4.05, 4.74, 4.83 [01:23:20] PROBLEM - cp30 Stunnel Http for mw11 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:23:20] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.96, 7.98, 7.07 [01:23:21] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.41, 6.58, 6.18 [01:23:30] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:23:33] ... [01:23:43] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:23:55] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:23:56] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:23:57] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:24:00] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:24:10] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[01:24:11] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.205 second response time [01:24:16] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.53, 6.08, 6.15 [01:24:21] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 9.97, 7.70, 6.84 [01:24:32] !log [@test3] starting deploy of {'config': True} to skip [01:24:33] !log [@test3] finished deploy of {'config': True} to skip - SUCCESS in 0s [01:24:43] RECOVERY - cp20 Varnish Backends on cp20 is OK: All 18 backends are healthy [01:24:48] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:24:50] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.376 second response time [01:24:53] !log [@mwtask111] starting deploy of {'config': True} to scsvg [01:24:59] !log [@mwtask111] finished deploy of {'config': True} to scsvg - SUCCESS in 6s [01:25:03] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 26.68, 22.71, 21.65 [01:25:21] RECOVERY - cp30 Stunnel Http for mw11 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.206 second response time [01:25:46] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 4.577 second response time [01:25:48] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:25:55] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [01:25:56] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.594 second response time [01:25:58] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.790 second response time [01:25:59] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 7.260 second response time [01:26:01] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: 
HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 6.402 second response time [01:26:15] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 6.351 second response time [01:26:19] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.70, 7.10, 6.72 [01:26:24] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:26:50] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:26:52] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 5.536 second response time [01:26:52] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:26:58] ... [01:26:58] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 18.09, 20.85, 21.10 [01:27:27] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.11, 7.23, 6.59 [01:27:46] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:27:54] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:28:01] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:28:04] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:28:08] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 10.03, 7.91, 6.84 [01:28:18] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.14, 6.16, 6.41 [01:28:20] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:28:39] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:28:41] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[01:29:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 11.43, 9.21, 7.80 [01:30:11] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:30:29] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [01:30:34] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [01:30:36] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14570 bytes in 0.315 second response time [01:30:38] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 0.491 second response time [01:31:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [01:31:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [01:31:57] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.494 second response time [01:32:03] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 0.012 second response time [01:32:07] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.012 second response time [01:32:13] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 5.64, 6.83, 6.72 [01:33:11] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 6.25, 6.54, 6.49 [01:33:12] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.53, 7.62, 7.07 [01:33:17] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.84, 3.58, 3.36 [01:34:11] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.55, 7.62, 7.03 [01:35:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.78, 7.53, 7.31 [01:36:10] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.67, 6.64, 6.72 [01:37:03] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 10.51, 8.58, 7.54 [01:37:07] PROBLEM 
- mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.27, 7.56, 7.69 [01:38:10] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.10, 7.52, 6.91 [01:38:24] PROBLEM - mw13 MediaWiki Rendering on mw13 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:38:39] PROBLEM - cp21 Stunnel Http for mw13 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:39:09] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:39:09] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.68, 8.55, 8.04 [01:39:18] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:39:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [01:39:33] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [01:40:10] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:40:19] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:40:35] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:40:46] PROBLEM - cp30 Stunnel Http for mw11 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[01:41:01] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.53, 8.15, 7.19 [01:41:01] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.75, 3.33, 3.36 [01:41:05] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:41:09] PROBLEM - cp21 Stunnel Http for mw11 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:41:31] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:41:32] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:41:59] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:42:04] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:42:06] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[01:42:13] PROBLEM - db12 Disk Space on db12 is CRITICAL: DISK CRITICAL - free space: / 26699 MB (5% inode=97%); [01:42:19] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 16.83, 18.97, 20.13 [01:42:20] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.907 second response time [01:42:31] RECOVERY - mw13 MediaWiki Rendering on mw13 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 1.734 second response time [01:42:40] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 6.791 second response time [01:42:43] RECOVERY - cp30 Stunnel Http for mw11 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.326 second response time [01:42:45] RECOVERY - cp21 Stunnel Http for mw13 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.017 second response time [01:43:00] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.210 second response time [01:43:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.39, 7.88, 7.61 [01:43:09] RECOVERY - cp21 Stunnel Http for mw11 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.011 second response time [01:43:16] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.340 second response time [01:43:17] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.054 second response time [01:43:30] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.321 second response time [01:43:30] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.026 second response time [01:43:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [01:43:52] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.005 second response time 
[01:43:59] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.131 second response time [01:44:01] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.344 second response time [01:44:14] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.363 second response time [01:44:56] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.19, 7.07, 7.02 [01:45:56] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 4.61, 7.01, 7.14 [01:46:07] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:46:55] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 4.59, 6.27, 6.73 [01:47:16] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.14, 5.17, 4.35 [01:47:44] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 6.65, 6.10, 6.71 [01:48:01] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.80, 7.28, 6.82 [01:48:43] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:49:16] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 4.50, 5.08, 4.43 [01:49:48] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.37, 5.97, 6.67 [01:50:00] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.17, 7.28, 6.88 [01:50:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.73, 20.59, 20.41 [01:50:59] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:51:08] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[01:51:16] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:51:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [01:51:45] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:51:50] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:51:59] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:52:16] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:52:19] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 19.02, 19.93, 20.19 [01:52:47] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.76, 8.00, 7.33 [01:53:45] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.634 second response time [01:53:57] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 7.523 second response time [01:53:57] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 3.76, 5.99, 6.49 [01:54:04] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 7.883 second response time [01:55:07] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 7.358 second response time [01:55:08] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 5.732 second response time [01:55:19] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 8.380 
second response time [01:55:27] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 7.858 second response time [01:55:29] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.90, 7.24, 6.96 [01:56:21] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 1.026 second response time [01:57:27] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.64, 7.92, 7.23 [01:57:31] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:57:52] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.30, 6.80, 6.72 [01:58:01] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [01:58:20] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.042 second response time [01:58:41] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.68, 7.95, 7.60 [01:59:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.00, 7.88, 7.56 [01:59:24] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.49, 7.52, 7.19 [01:59:33] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 5.211 second response time [01:59:51] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 4.10, 5.82, 6.37 [02:00:06] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 8.112 second response time [02:00:32] PROBLEM - cp31 Stunnel Http for mw8 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[02:01:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.80, 7.78, 7.56 [02:02:30] RECOVERY - cp31 Stunnel Http for mw8 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 1.177 second response time [02:03:20] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.40, 8.22, 7.56 [02:04:13] PROBLEM - db12 Disk Space on db12 is WARNING: DISK WARNING - free space: / 27536 MB (6% inode=97%); [02:04:47] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:04:51] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:04:57] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:05:13] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.07, 3.51, 3.25 [02:05:18] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.61, 7.90, 7.54 [02:06:20] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:06:39] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:06:44] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.338 second response time [02:06:51] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.221 second response time [02:06:52] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.321 second response time [02:07:09] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.97, 3.74, 3.36 [02:07:18] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[02:07:19] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:07:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [02:08:23] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:08:46] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 9.766 second response time [02:09:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.35, 6.00, 6.79 [02:09:05] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.11, 3.97, 3.50 [02:09:12] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.614 second response time [02:09:17] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.518 second response time [02:09:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [02:11:01] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.20, 3.77, 3.48 [02:11:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 4.84, 6.47, 7.91 [02:11:10] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 4.44, 5.68, 6.64 [02:12:28] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.412 second response time [02:12:29] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.576 second response time [02:12:43] PROBLEM - mw13 MediaWiki Rendering on mw13 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:12:57] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.34, 3.89, 3.55 [02:13:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.22, 6.97, 6.49 [02:13:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 
2607:5300:201:3100::929a/cpweb [02:13:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2607:5300:201:3100::5ebc/cpweb [02:13:33] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:13:44] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:13:49] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:14:41] RECOVERY - mw13 MediaWiki Rendering on mw13 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.668 second response time [02:15:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.05, 6.65, 7.59 [02:15:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.15, 7.31, 6.68 [02:15:29] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.632 second response time [02:15:45] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 1.867 second response time [02:15:46] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.843 second response time [02:16:48] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.98, 3.67, 3.54 [02:17:01] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.63, 6.87, 6.85 [02:17:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.37, 6.64, 7.45 [02:17:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.61, 6.99, 6.66 [02:17:16] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:17:22] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[02:17:30] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:17:51] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:18:09] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:18:30] [miraheze/mw-config] Universal-Omega pushed 1 commit to Universal-Omega-patch-3 [+0/-0/±1] https://git.io/JS74z [02:18:31] [miraheze/mw-config] Universal-Omega b4a0760 - Add Bastion and ElasticSearch services to `$wgIncidentReportingServices` [02:18:33] [mw-config] Universal-Omega created branch Universal-Omega-patch-3 - https://git.io/vbvb3 [02:18:34] [mw-config] Universal-Omega opened pull request #4348: Add Bastion and ElasticSearch services to `$wgIncidentReportingServices` - https://git.io/JS74g [02:18:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.80, 7.51, 7.08 [02:19:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.74, 7.36, 7.62 [02:19:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [02:20:17] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.04, 6.21, 6.67 [02:20:21] miraheze/mw-config - Universal-Omega the build passed. 
[02:20:40] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.19, 4.06, 3.72 [02:21:05] [miraheze/mw-config] Universal-Omega pushed 1 commit to Universal-Omega-patch-3 [+0/-0/±1] https://git.io/JS74X [02:21:06] [miraheze/mw-config] Universal-Omega 9b0ae14 - Update LocalSettings.php [02:21:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.28, 6.62, 7.32 [02:21:08] [mw-config] Universal-Omega synchronize pull request #4348: Add Bastion and ElasticSearch services to `$wgIncidentReportingServices` - https://git.io/JS74g [02:21:15] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 4.88, 6.33, 6.52 [02:21:25] [mw-config] Universal-Omega edited pull request #4348: Add Bastion, ElasticSearch, and Graylog to `$wgIncidentReportingServices` - https://git.io/JS74g [02:21:27] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.025 second response time [02:21:31] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 7.431 second response time [02:21:38] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 5.493 second response time [02:22:02] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 5.027 second response time [02:22:05] [mw-config] Universal-Omega edited pull request #4348: Add Bastion, ElasticSearch, and Graylog to `$wgIncidentReportingServices` - https://git.io/JS74g [02:22:16] miraheze/mw-config - Universal-Omega the build passed. 
[02:22:17] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.354 second response time [02:22:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.54, 7.24, 7.09 [02:24:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.00, 3.96, 3.80 [02:24:39] PROBLEM - cp21 Stunnel Http for mw13 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:24:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.05, 7.67, 7.24 [02:25:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.75, 6.91, 6.78 [02:26:37] RECOVERY - cp21 Stunnel Http for mw13 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.128 second response time [02:26:37] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.70, 6.89, 6.70 [02:26:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.53, 7.22, 7.13 [02:27:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [02:28:32] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.70, 7.80, 7.04 [02:29:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.49, 7.18, 6.86 [02:30:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.98, 4.09, 3.84 [02:32:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.14, 3.77, 3.75 [02:32:50] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [02:32:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.61, 7.85, 7.39 [02:33:07] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:33:10] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[02:33:30] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:34:22] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.84, 6.81, 6.29 [02:34:27] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::5ebc/cpweb [02:34:44] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:34:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.11, 7.22, 7.22 [02:35:09] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 13.35, 9.41, 8.01 [02:35:30] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 4.361 second response time [02:36:20] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 6.23, 6.72, 6.34 [02:36:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.23, 7.46, 7.29 [02:37:38] PROBLEM - cp31 Stunnel Http for mw8 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:37:39] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 6 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::5ebc/cpweb [02:38:05] PROBLEM - cp30 Stunnel Http for mw8 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:38:05] PROBLEM - cp20 Stunnel Http for mw8 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:38:13] PROBLEM - cp21 Stunnel Http for mw8 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[02:39:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.37, 7.71, 7.67 [02:39:35] RECOVERY - cp31 Stunnel Http for mw8 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.316 second response time [02:39:45] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [02:40:01] RECOVERY - cp20 Stunnel Http for mw8 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.742 second response time [02:40:01] RECOVERY - cp30 Stunnel Http for mw8 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14554 bytes in 0.853 second response time [02:40:08] RECOVERY - cp21 Stunnel Http for mw8 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.249 second response time [02:40:51] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 2.722 second response time [02:41:17] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 0.422 second response time [02:41:24] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.333 second response time [02:41:25] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.016 second response time [02:41:31] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [02:41:43] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.472 second response time [02:41:59] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.27, 7.79, 7.90 [02:42:12] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [02:42:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.96, 7.94, 7.58 [02:46:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.15, 3.41, 3.47 [02:48:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.77, 3.08, 3.34 [02:48:58] 
PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.06, 7.29, 7.27 [02:49:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.74, 7.38, 7.80 [02:49:08] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.97, 6.85, 6.39 [02:49:40] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.67, 7.99, 7.85 [02:49:59] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 51.195.220.68/cpweb [02:50:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.18, 6.71, 7.08 [02:51:06] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.03, 6.22, 6.21 [02:51:35] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 5.30, 7.18, 7.59 [02:52:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.37, 3.76, 3.57 [02:53:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.68, 7.41, 7.66 [02:53:15] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.70, 6.36, 6.80 [02:53:52] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [02:54:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.51, 3.26, 3.41 [02:54:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.35, 7.13, 7.12 [02:55:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.57, 7.17, 7.56 [02:55:25] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 10.89, 8.32, 7.88 [02:56:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.68, 6.56, 6.91 [02:57:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.25, 7.58, 7.65 [02:57:20] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.77, 7.77, 7.71 [02:58:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.02, 3.92, 3.63 [02:58:58] RECOVERY - 
mw12 Current Load on mw12 is OK: OK - load average: 5.63, 6.35, 6.80 [02:59:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.91, 6.92, 7.40 [03:00:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.79, 3.69, 3.57 [03:02:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.72, 4.10, 3.73 [03:02:40] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [03:02:56] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.87, 7.07, 6.40 [03:03:06] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.58, 5.59, 6.73 [03:03:06] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.41, 5.60, 6.74 [03:03:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.97, 7.14, 6.38 [03:04:06] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.63, 7.80, 7.16 [03:04:54] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.93, 6.29, 6.18 [03:05:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.01, 6.90, 6.38 [03:06:03] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.36, 7.76, 7.22 [03:06:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.65, 3.74, 3.70 [03:06:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [03:07:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.82, 6.65, 6.89 [03:07:40] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 4.42, 6.04, 6.13 [03:09:06] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.27, 7.54, 7.19 [03:09:55] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 3.40, 5.78, 6.56 [03:11:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.73, 7.02, 7.03 [03:18:32] PROBLEM - mon2 Current Load 
on mon2 is CRITICAL: CRITICAL - load average: 4.30, 3.95, 3.76 [03:18:43] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.97, 7.15, 6.57 [03:19:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.53, 7.68, 7.22 [03:20:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.45, 3.79, 3.72 [03:20:41] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 11.30, 8.60, 7.17 [03:21:24] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.62, 7.72, 7.12 [03:21:42] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [03:22:05] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::5ebc/cpweb [03:22:33] PROBLEM - cp30 Stunnel Http for mw8 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[03:22:38] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.94, 7.91, 7.07 [03:23:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.76, 7.58, 7.29 [03:23:36] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [03:24:01] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [03:24:29] RECOVERY - cp30 Stunnel Http for mw8 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.315 second response time [03:24:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.46, 3.95, 3.79 [03:26:33] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.27, 7.71, 7.11 [03:27:10] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.87, 6.46, 6.74 [03:28:31] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.23, 7.33, 7.06 [03:30:28] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 4.46, 6.43, 6.76 [03:30:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.51, 3.86, 3.84 [03:31:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.25, 6.29, 6.77 [03:32:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.89, 6.92, 6.12 [03:34:33] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.68, 6.71, 6.16 [03:35:06] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.24, 7.01, 6.90 [03:37:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.28, 6.27, 6.64 [03:42:10] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.96, 6.92, 6.64 [03:42:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.85, 4.13, 3.93 [03:42:33] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 
2607:5300:201:3100::5ebc/cpweb [03:43:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.93, 7.30, 6.95 [03:43:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.21, 6.71, 6.19 [03:43:24] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 9.32, 7.54, 6.67 [03:44:07] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.79, 7.34, 6.76 [03:44:08] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.39, 7.65, 6.93 [03:44:29] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:44:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.24, 3.99, 3.92 [03:45:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.02, 7.85, 7.19 [03:45:15] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 6.11, 6.21, 6.06 [03:46:05] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.45, 6.95, 6.78 [03:46:13] PROBLEM - db12 Disk Space on db12 is CRITICAL: DISK CRITICAL - free space: / 26708 MB (5% inode=97%); [03:46:29] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 0.642 second response time [03:46:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.96, 4.31, 4.04 [03:47:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 4.94, 7.26, 7.09 [03:48:03] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 4.88, 6.07, 6.46 [03:48:46] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:49:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.43, 7.72, 6.70 [03:49:21] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.49, 7.86, 7.17 [03:49:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 
198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [03:50:02] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 9.10, 8.14, 7.30 [03:50:05] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [03:50:05] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [03:50:17] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [03:50:41] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [03:51:09] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 10.42, 7.97, 7.32 [03:51:19] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 7.43, 8.03, 7.33 [03:51:49] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:51:59] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.006 second response time [03:52:00] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.307 second response time [03:52:15] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.550 second response time [03:52:37] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.220 second response time [03:52:39] !log [universalomega@test101] finished deploy of {'l10nupdate': True} to skip - SUCCESS in 10787s [03:52:49] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [03:52:52] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.613 second response time [03:53:06] PROBLEM 
- mw8 Current Load on mw8 is WARNING: WARNING - load average: 4.52, 6.81, 6.99 [03:53:17] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.04, 7.56, 7.26 [03:53:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [03:53:49] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.536 second response time [03:55:06] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 3.74, 6.07, 6.71 [03:57:52] PROBLEM - inourownwords.online - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'inourownwords.online' expires in 15 day(s) (Sun 23 Jan 2022 03:54:50 GMT +0000). [03:57:53] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.92, 7.12, 7.24 [03:58:02] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [03:58:13] [miraheze/ssl] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JS70z [03:58:14] [miraheze/ssl] MirahezeSSLBot ff8e62b - Bot: Update SSL cert for inourownwords.online [03:58:19] PROBLEM - inourownwords.online - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'inourownwords.online' expires in 15 day(s) (Sun 23 Jan 2022 03:54:50 GMT +0000). 
[03:58:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.57, 3.49, 3.88 [03:59:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 4.13, 6.76, 7.02 [03:59:42] !log [universalomega@mwtask111] finished deploy of {'l10nupdate': True} to scsvg - SUCCESS in 11180s [03:59:51] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.80, 7.75, 7.45 [04:00:06] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [04:00:09] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [04:00:14] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [04:00:21] PROBLEM - wiki.elgeis.com - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.elgeis.com' expires in 15 day(s) (Sun 23 Jan 2022 03:57:20 GMT +0000). [04:00:58] PROBLEM - wiki.astralprojections.org - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.astralprojections.org' expires in 15 day(s) (Sun 23 Jan 2022 03:55:51 GMT +0000). [04:01:01] PROBLEM - wiki.astralprojections.org - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.astralprojections.org' expires in 15 day(s) (Sun 23 Jan 2022 03:55:51 GMT +0000). [04:01:29] [miraheze/ssl] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JS70K [04:01:31] [miraheze/ssl] MirahezeSSLBot 426ad53 - Bot: Update SSL cert for wiki.astralprojections.org [04:02:08] PROBLEM - wiki.elgeis.com - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.elgeis.com' expires in 15 day(s) (Sun 23 Jan 2022 03:57:20 GMT +0000). [04:03:10] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 4.84, 5.91, 6.65 [04:03:14] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 4.61, 5.95, 6.65 [04:03:27] PROBLEM - wiki.thedev.gq - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.thedev.gq' expires in 15 day(s) (Sun 23 Jan 2022 03:58:23 GMT +0000). 
[04:03:47] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.80, 6.97, 7.24 [04:05:58] PROBLEM - wiki.thedev.gq - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.thedev.gq' expires in 15 day(s) (Sun 23 Jan 2022 03:58:23 GMT +0000). [04:06:37] [miraheze/ssl] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JS70y [04:06:39] [miraheze/ssl] MirahezeSSLBot 83b3702 - Bot: Update SSL cert for wiki.elgeis.com [04:09:40] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.06, 5.85, 6.64 [04:10:47] [miraheze/ssl] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://git.io/JS70h [04:10:49] [miraheze/ssl] MirahezeSSLBot b4d9257 - Bot: Update SSL cert for wiki.thedev.gq [04:16:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [04:16:31] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [04:16:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.19, 3.87, 3.70 [04:17:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.04, 7.51, 6.76 [04:18:24] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [04:18:27] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [04:18:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.36, 3.64, 3.64 [04:19:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.98, 7.99, 7.02 [04:19:07] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE 
CRITICAL: Socket timeout after 10 seconds. [04:19:12] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.29, 7.60, 6.83 [04:19:18] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.023 second response time [04:19:23] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [04:19:26] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [04:19:52] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [04:21:09] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.50, 7.60, 6.92 [04:21:10] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 8.444 second response time [04:21:22] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 4.838 second response time [04:21:26] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.496 second response time [04:21:28] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 5.277 second response time [04:21:50] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.476 second response time [04:22:21] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 51.195.220.68/cpweb, 2607:5300:201:3100::5ebc/cpweb [04:22:29] RECOVERY - wiki.elgeis.com - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.elgeis.com' will expire on Thu 07 Apr 2022 03:06:32 GMT +0000. 
[04:22:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.76, 4.26, 3.85 [04:23:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.74, 7.81, 7.16 [04:23:31] RECOVERY - wiki.thedev.gq - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.thedev.gq' will expire on Thu 07 Apr 2022 03:10:42 GMT +0000. [04:24:41] RECOVERY - inourownwords.online - LetsEncrypt on sslhost is OK: OK - Certificate 'inourownwords.online' will expire on Thu 07 Apr 2022 02:58:08 GMT +0000. [04:25:05] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 6.26, 6.75, 6.73 [04:25:36] RECOVERY - inourownwords.online - LetsEncrypt on sslhost is OK: OK - Certificate 'inourownwords.online' will expire on Thu 07 Apr 2022 02:58:08 GMT +0000. [04:26:14] RECOVERY - wiki.thedev.gq - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.thedev.gq' will expire on Thu 07 Apr 2022 03:10:42 GMT +0000. [04:27:19] RECOVERY - wiki.elgeis.com - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.elgeis.com' will expire on Thu 07 Apr 2022 03:06:32 GMT +0000. [04:27:37] RECOVERY - wiki.astralprojections.org - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.astralprojections.org' will expire on Thu 07 Apr 2022 03:01:23 GMT +0000. [04:28:10] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [04:28:13] PROBLEM - db12 Disk Space on db12 is WARNING: DISK WARNING - free space: / 27454 MB (6% inode=97%); [04:28:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.22, 3.68, 3.73 [04:28:54] RECOVERY - wiki.astralprojections.org - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.astralprojections.org' will expire on Thu 07 Apr 2022 03:01:23 GMT +0000. 
[04:28:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.04, 7.12, 6.90 [04:30:58] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 4.56, 6.08, 6.54 [04:31:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.85, 6.39, 6.74 [04:34:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.02, 3.49, 3.61 [04:36:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.70, 3.56, 3.62 [04:44:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 1.93, 2.93, 3.31 [04:52:42] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.68, 7.37, 6.51 [04:53:28] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.40, 6.78, 6.50 [04:54:40] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.97, 7.65, 6.72 [04:55:22] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 6.09, 6.68, 6.50 [04:56:02] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.69, 6.60, 5.90 [04:58:00] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.44, 6.75, 6.04 [04:59:57] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.96, 6.56, 6.06 [05:00:32] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 4.91, 6.33, 6.47 [05:01:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [05:02:20] PROBLEM - cp31 Varnish Backends on cp31 is CRITICAL: 1 backends are down. 
mw121 [05:02:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.69, 3.53, 3.34 [05:03:51] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.40, 7.26, 6.48 [05:04:18] RECOVERY - cp31 Varnish Backends on cp31 is OK: All 18 backends are healthy [05:04:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.60, 3.85, 3.48 [05:05:49] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.45, 7.41, 6.61 [05:08:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [05:09:44] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.88, 7.62, 6.94 [05:11:47] 8:59 PM  !log [universalomega@mwtask111] finished deploy of {'l10nupdate': True} to scsvg - SUCCESS in 11180s [05:11:53] 3 hours.... [05:12:12] That is 2.5 hours longer than the current servers.... [05:12:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.62, 3.79, 3.69 [05:13:01] paladox: ^ do you have any ideas why it would take so long on SCSVG but not ovlon? 
[05:13:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [05:13:40] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.02, 6.57, 6.67 [05:13:49] It wasn't working at all but I fixed it by setting the http proxy, but now it is taking forever [05:13:58] https://github.com/miraheze/mw-config/pull/4347 [05:13:59] [url] Set LocalisationUpdate proxy for SCSVG by Universal-Omega · Pull Request #4347 · miraheze/mw-config · GitHub | github.com [05:16:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.32, 4.28, 3.88 [05:17:05] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.68, 7.03, 6.54 [05:18:10] CC Reception123: ^ [05:18:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.86, 3.84, 3.77 [05:19:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [05:20:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.75, 4.34, 3.95 [05:26:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.58, 3.69, 3.85 [05:30:14] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.96, 6.93, 6.50 [05:32:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.16, 3.52, 3.66 [05:34:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.38, 3.23, 3.55 [05:35:58] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.62, 6.71, 6.59 [05:36:58] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 6.38, 6.45, 6.74 [05:42:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.44, 7.06, 6.83 [05:43:35] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.19, 6.85, 6.64 [05:44:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.49, 7.33, 6.97 [05:45:30] RECOVERY - mw8 Current Load on mw8 is 
OK: OK - load average: 5.69, 6.52, 6.55 [05:46:58] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.19, 6.52, 6.71 [05:50:39] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.87, 6.84, 6.29 [05:51:14] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.71, 7.15, 6.76 [05:51:39] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [05:51:43] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb [05:52:34] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.79, 6.57, 6.26 [05:52:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.52, 7.11, 6.92 [05:53:34] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.009 second response time [05:53:39] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [05:54:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.82, 3.22, 3.37 [05:55:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.83, 6.78, 6.73 [05:58:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.49, 4.11, 3.71 [05:59:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.69, 7.59, 7.08 [06:00:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.29, 3.87, 3.68 [06:00:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.99, 7.49, 7.09 [06:02:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.38, 6.67, 6.84 [06:03:06] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.32, 6.10, 6.62 [06:04:58] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.35, 6.07, 6.59 [06:10:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.24, 3.09, 3.36 [06:19:51] PROBLEM - mw8 
Current Load on mw8 is WARNING: WARNING - load average: 7.41, 6.39, 6.23 [06:20:13] PROBLEM - db12 Disk Space on db12 is CRITICAL: DISK CRITICAL - free space: / 26702 MB (5% inode=97%); [06:21:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.80, 7.37, 6.18 [06:21:45] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.09, 5.90, 6.08 [06:21:57] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [06:22:09] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [06:22:18] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 149.56.140.43/cpweb [06:22:20] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::4c25/cpweb [06:23:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.45, 7.28, 6.32 [06:23:52] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.320 second response time [06:24:06] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.129 second response time [06:24:11] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [06:24:16] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [06:25:03] [puppet] Universal-Omega opened pull request #2279: Revert all OOM mitigations - https://git.io/JS7Nr [06:25:14] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 4.70, 6.41, 6.12 [06:34:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [06:37:19] [puppet] Universal-Omega edited pull request #2279: Revert all OOM mitigations - https://git.io/JS7Nr [06:45:57] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load 
average: 3.86, 3.43, 3.27 [06:46:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [06:47:52] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.80, 3.80, 3.42 [06:48:20] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.96, 21.00, 19.73 [06:49:48] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.01, 3.47, 3.34 [06:51:44] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.71, 3.95, 3.53 [06:51:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.12, 6.81, 6.04 [06:53:40] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.93, 3.71, 3.48 [06:55:53] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.80, 7.01, 6.31 [06:57:31] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.10, 3.67, 3.51 [06:57:50] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.70, 6.74, 6.31 [06:59:27] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.92, 3.34, 3.40 [07:00:18] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 18.72, 19.84, 20.11 [07:03:21] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.27, 3.44, 3.44 [07:03:50] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.22, 6.50, 5.65 [07:05:17] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.59, 3.05, 3.29 [07:05:48] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 3.82, 5.40, 5.35 [07:11:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [07:12:13] PROBLEM - db12 Disk Space on db12 is WARNING: DISK WARNING - free space: / 27410 MB (6% inode=97%); [07:14:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [07:26:39] PROBLEM - mw8 
Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.31, 6.96, 6.09 [07:28:33] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.90, 6.29, 5.94 [07:34:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [07:41:21] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 24.82, 21.90, 19.63 [07:41:57] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 12.85, 9.00, 7.03 [07:42:37] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:42:48] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [07:43:16] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 17.94, 20.68, 19.48 [07:44:33] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.308 second response time [07:44:45] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.041 second response time [07:45:10] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 19.17, 20.35, 19.50 [07:45:46] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.43, 7.54, 6.88 [07:47:53] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.63, 7.66, 6.40 [07:48:59] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.45, 20.94, 19.87 [07:51:29] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.38, 6.60, 6.75 [07:51:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [07:51:43] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 4.07, 6.44, 6.24 [07:52:48] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 16.01, 19.55, 19.68 [07:55:18] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load 
average: 6.90, 7.08, 6.93 [07:57:12] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.36, 6.54, 6.76 [08:09:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.27, 6.59, 5.57 [08:11:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.63, 7.03, 5.87 [08:11:51] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.50, 6.73, 6.27 [08:13:40] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.29, 6.51, 5.93 [08:13:45] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.12, 7.09, 6.47 [08:14:18] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.68, 20.11, 19.27 [08:15:37] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.60, 6.34, 5.96 [08:15:40] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 4.26, 5.86, 5.67 [08:16:18] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 17.83, 19.32, 19.09 [08:16:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:17:34] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 3.65, 5.81, 6.14 [08:21:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.34, 7.09, 6.28 [08:22:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:23:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 51.195.220.68/cpweb, 2607:5300:201:3100::929a/cpweb [08:23:41] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.27, 7.40, 6.48 [08:23:43] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.17, 6.82, 5.68 [08:24:02] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.85, 7.81, 6.26 [08:25:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [08:25:40] RECOVERY - 
mw10 Current Load on mw10 is OK: OK - load average: 4.92, 6.49, 6.27 [08:25:42] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 4.57, 5.96, 5.51 [08:25:59] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.43, 6.88, 6.12 [08:27:02] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.59, 20.44, 19.59 [08:27:08] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.95, 7.09, 6.64 [08:27:17] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.46, 6.84, 6.27 [08:27:55] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.50, 6.23, 5.95 [08:28:56] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 16.81, 19.02, 19.17 [08:29:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 6.19, 6.59, 6.49 [08:29:15] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.81, 6.46, 6.20 [08:37:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:40:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:43:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 10.06, 6.96, 5.95 [08:43:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.82, 6.42, 5.65 [08:45:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 6.14, 6.66, 5.97 [08:45:35] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 6.80, 6.29, 5.68 [08:50:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:55:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:59:31] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.93, 7.18, 6.33 [09:01:28] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.56, 6.44, 6.15 [09:15:39] ok : 
[RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [09:23:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [09:42:13] PROBLEM - db12 Disk Space on db12 is CRITICAL: DISK CRITICAL - free space: / 26697 MB (5% inode=97%); [09:43:33] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.90, 6.31, 5.41 [09:43:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [09:45:28] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 6.37, 6.40, 5.56 [09:46:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [09:48:13] PROBLEM - db12 Disk Space on db12 is WARNING: DISK WARNING - free space: / 26734 MB (6% inode=97%); [09:50:13] PROBLEM - db12 Disk Space on db12 is CRITICAL: DISK CRITICAL - free space: / 26664 MB (5% inode=97%); [09:51:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [10:03:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.44, 6.80, 6.01 [10:05:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [10:07:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 3.98, 6.14, 5.99 [10:10:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [10:25:54] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.51, 6.24, 5.42 [10:27:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [10:27:51] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.80, 7.09, 5.81 [10:28:49] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.36, 7.31, 6.44 [10:29:48] PROBLEM - mw13 Current Load on mw13 is 
WARNING: WARNING - load average: 7.33, 7.20, 6.01 [10:30:43] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.77, 6.95, 6.43 [10:32:38] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.14, 5.92, 6.11 [10:33:40] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 11.60, 8.49, 6.71 [10:39:33] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.99, 7.77, 6.94 [10:40:12] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 51.195.220.68/cpweb, 2607:5300:201:3100::929a/cpweb [10:40:16] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.39, 7.22, 6.41 [10:42:05] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [10:42:14] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.41, 7.49, 6.59 [10:42:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 23.79, 21.03, 19.59 [10:44:11] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.72, 7.28, 6.63 [10:44:19] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 19.39, 20.32, 19.51 [10:45:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.43, 7.42, 6.79 [10:45:15] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [10:45:25] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.83, 6.78, 6.22 [10:45:53] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 198.244.148.90/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::5ebc/cpweb [10:47:06] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.85, 6.71, 6.63 [10:47:11] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [10:47:21] RECOVERY - mw9 Current Load on mw9 is 
OK: OK - load average: 3.91, 5.93, 5.99 [10:47:46] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [10:48:06] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.28, 6.41, 6.47 [10:49:15] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 6.68, 6.63, 6.57 [10:51:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.53, 7.60, 6.89 [10:52:42] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [10:52:47] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.61, 6.77, 6.04 [10:52:52] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [10:53:03] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb [10:53:07] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.02, 7.01, 6.44 [10:53:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.00, 7.57, 7.00 [10:53:18] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [10:53:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 4 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.141.75/cpweb [10:53:39] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[10:53:56] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 4.87, 6.37, 6.51 [10:54:09] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:54:45] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 4.88, 5.92, 5.81 [10:56:42] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.314 second response time [10:56:53] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [10:56:58] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.322 second response time [10:57:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.59, 7.63, 6.80 [10:57:15] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.007 second response time [10:57:19] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[10:57:39] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.186 second response time [10:57:49] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.04, 7.36, 6.90 [10:57:59] [miraheze/mw-config] RhinosF1 pushed 1 commit to RhinosF1-patch-2 [+0/-0/±1] https://git.io/JSbIG [10:58:01] [miraheze/mw-config] RhinosF1 5b81804 - Lua: cap memoryLimit [10:58:02] [mw-config] RhinosF1 created branch RhinosF1-patch-2 - https://git.io/vbvb3 [10:58:04] [mw-config] RhinosF1 opened pull request #4349: Lua: cap memoryLimit - https://git.io/JSbI4 [10:58:14] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.644 second response time [10:58:49] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.290 second response time [10:58:52] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [10:59:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.77, 7.70, 6.93 [10:59:03] miraheze/mw-config - RhinosF1 the build has errored. 
[10:59:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.27, 6.99, 6.74 [10:59:15] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 3.63, 5.93, 6.54 [10:59:15] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.009 second response time [10:59:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [10:59:46] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.38, 7.44, 6.99 [11:00:13] [miraheze/mw-config] RhinosF1 pushed 1 commit to RhinosF1-patch-2 [+0/-0/±1] https://git.io/JSbmE [11:00:15] [miraheze/mw-config] RhinosF1 14cce8c - Update LocalSettings.php [11:00:16] [mw-config] RhinosF1 synchronize pull request #4349: Lua: cap memoryLimit - https://git.io/JSbI4 [11:00:35] [miraheze/mw-config] RhinosF1 pushed 1 commit to RhinosF1-patch-2 [+0/-0/±1] https://git.io/JSbYq [11:00:37] [miraheze/mw-config] RhinosF1 dc70da4 - Update LocalSettings.php [11:00:38] [mw-config] RhinosF1 synchronize pull request #4349: Lua: cap memoryLimit - https://git.io/JSbI4 [11:00:58] [miraheze/mw-config] RhinosF1 pushed 1 commit to RhinosF1-patch-2 [+0/-0/±1] https://git.io/JSbY5 [11:00:59] [miraheze/mw-config] RhinosF1 06e90a6 - Update Defines.php [11:01:01] [mw-config] RhinosF1 synchronize pull request #4349: Lua: cap memoryLimit - https://git.io/JSbI4 [11:01:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.37, 6.25, 6.51 [11:01:08] [mw-config] RhinosF1 edited pull request #4349: Shell: cap memoryLimit - https://git.io/JSbI4 [11:01:18] miraheze/mw-config - RhinosF1 the build has errored. [11:01:36] [mw-config] RhinosF1 commented on pull request #4349: Shell: cap memoryLimit - https://git.io/JSb3T [11:01:40] miraheze/mw-config - RhinosF1 the build has errored. [11:02:08] miraheze/mw-config - RhinosF1 the build passed. 
[11:02:56] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.79, 21.00, 20.12 [11:03:04] Reception123: if you're around, that might mitigate some loading issues [11:03:41] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.85, 7.45, 7.05 [11:04:51] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 18.12, 20.13, 19.91 [11:05:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.66, 8.25, 7.26 [11:05:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.77, 6.88, 6.77 [11:05:38] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.29, 6.80, 6.87 [11:07:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 5.66, 7.03, 6.93 [11:07:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 6.33, 6.55, 6.65 [11:07:36] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.38, 6.23, 6.64 [11:07:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [11:12:13] PROBLEM - db12 Disk Space on db12 is WARNING: DISK WARNING - free space: / 28170 MB (6% inode=97%); [11:13:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [11:15:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.01, 7.82, 7.26 [11:15:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [11:15:50] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:15:51] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.77, 7.47, 6.91 [11:15:59] PROBLEM - mw11 Current Load on mw11 is CRITICAL: 
CRITICAL - load average: 8.56, 6.96, 5.91 [11:16:21] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.30, 6.84, 6.74 [11:16:41] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [11:16:50] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.37, 6.89, 6.52 [11:16:53] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [11:17:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.24, 7.66, 7.27 [11:17:05] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [11:17:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 5 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb [11:17:45] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.734 second response time [11:17:45] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.07, 7.37, 6.96 [11:17:57] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.84, 6.62, 5.92 [11:18:18] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.99, 6.64, 6.69 [11:18:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.95, 21.28, 20.20 [11:20:43] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.15, 6.16, 6.32 [11:20:48] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 7.392 second response time [11:21:03] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 6.626 second response time [11:21:09] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.825 second response time [11:21:30] RECOVERY - ns2 GDNSD 
Datacenters on ns2 is OK: OK - all datacenters are online [11:21:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [11:21:34] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 6.19, 6.66, 6.76 [11:27:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.13, 6.99, 6.96 [11:27:19] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.19, 6.87, 6.71 [11:29:02] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.33, 7.02, 6.97 [11:29:14] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 6.22, 6.70, 6.67 [11:32:18] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 20.08, 20.15, 20.25 [11:33:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 4.66, 6.27, 6.73 [11:35:49] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:36:11] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 2001:41d0:801:2000::1b80/cpweb [11:36:24] PROBLEM - cp21 Stunnel Http for mw11 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [11:36:31] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [11:36:48] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [11:36:54] RhinosF1: will look when have access. 
Could you please also review https://github.com/miraheze/puppet/pull/2279 [11:36:55] [url] Revert all OOM mitigations by Universal-Omega · Pull Request #2279 · miraheze/puppet · GitHub | github.com [11:37:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [11:37:45] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 2.043 second response time [11:38:21] RECOVERY - cp21 Stunnel Http for mw11 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.207 second response time [11:38:31] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.321 second response time [11:38:38] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.25, 7.21, 6.27 [11:38:46] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.474 second response time [11:39:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.02, 6.51, 6.61 [11:39:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [11:39:59] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [11:40:37] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 4.87, 6.24, 6.02 [11:41:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 4.85, 5.70, 6.29 [11:44:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.06, 7.14, 6.51 [11:46:33] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.01, 6.35, 6.29 [11:48:28] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[11:48:34] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.79, 7.09, 6.64 [11:49:17] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.20, 7.79, 6.96 [11:49:30] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:49:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 4 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::5ebc/cpweb [11:49:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::5ebc/cpweb [11:50:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.20, 20.24, 19.73 [11:50:28] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.814 second response time [11:50:31] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.47, 7.26, 6.74 [11:50:36] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [11:50:43] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [11:50:48] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [11:50:52] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[11:51:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.77, 6.67, 6.28 [11:51:12] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.14, 7.39, 6.90 [11:51:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [11:52:19] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 19.34, 19.92, 19.68 [11:52:29] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.61, 7.14, 6.75 [11:52:33] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.160 second response time [11:52:41] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.634 second response time [11:52:46] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 3.240 second response time [11:52:51] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 2.859 second response time [11:53:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.62, 6.16, 6.13 [11:53:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [11:53:45] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 7.940 second response time [11:55:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.94, 6.35, 6.61 [11:56:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 23.26, 21.11, 20.15 [11:56:24] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 10.06, 8.35, 7.29 [11:57:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 4 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [11:59:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.18, 6.85, 6.75 [11:59:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters 
are online [12:01:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.15, 6.93, 6.54 [12:01:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.92, 6.32, 6.59 [12:02:17] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.45, 7.04, 7.16 [12:02:18] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 19.05, 20.31, 20.20 [12:03:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.19, 6.30, 6.36 [12:06:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.07, 20.97, 20.53 [12:08:09] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 4.56, 5.93, 6.70 [12:08:14] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.17, 6.80, 6.06 [12:08:19] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 18.77, 19.88, 20.16 [12:09:08] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [12:10:10] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::1b80/cpweb, 2607:5300:201:3100::5ebc/cpweb [12:10:12] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.91, 7.12, 6.26 [12:10:13] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.11, 6.62, 5.85 [12:11:49] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.63, 7.26, 6.69 [12:12:07] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [12:12:11] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 4.82, 6.01, 5.72 [12:12:55] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [12:13:44] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 
4.19, 6.14, 6.35 [12:14:08] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.17, 6.33, 6.18 [12:18:05] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.00, 6.34, 5.90 [12:20:04] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 6.23, 6.36, 5.97 [12:23:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 149.56.141.75/cpweb, 2607:5300:201:3100::5ebc/cpweb [12:24:46] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb [12:26:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.45, 6.48, 6.18 [12:27:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.36, 6.90, 6.35 [12:27:31] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [12:27:32] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [12:27:35] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[12:28:38] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [12:29:14] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.05, 6.20, 6.16 [12:29:29] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 2.056 second response time [12:29:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [12:29:35] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 4.949 second response time [12:29:37] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 4.877 second response time [12:30:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.64, 7.18, 6.52 [12:32:33] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [12:32:58] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.88, 6.69, 6.42 [12:37:05] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [12:37:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.03, 7.23, 6.80 [12:37:25] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [12:37:26] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.78, 20.80, 20.00 [12:37:35] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [12:37:42] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[12:39:03] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.402 second response time [12:39:06] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.34, 6.20, 6.48 [12:39:20] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 20.32, 20.30, 19.90 [12:39:20] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.020 second response time [12:39:32] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.012 second response time [12:39:39] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.312 second response time [12:42:14] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [12:43:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [12:45:43] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.22, 4.64, 4.29 [12:47:41] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.49, 4.13, 4.14 [12:49:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [12:50:47] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.21, 20.08, 19.82 [12:52:41] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 19.85, 20.16, 19.89 [12:59:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [12:59:56] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.40, 6.73, 6.33 [13:00:36] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.35, 6.88, 6.50 [13:01:50] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 6.41, 6.59, 6.32 [13:02:34] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.69, 6.38, 6.35 [13:02:39] alerting : [FIRING:1] (PHP-FPM 
Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [13:09:16] PROBLEM - wikien.wildterra2.com - reverse DNS on sslhost is WARNING: /usr/lib/nagios/plugins/check_reverse_dns.py:101: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead resolved_ip_addr = str(dns_resolver.query(hostname, 'AAAA')[0])/usr/lib/nagios/plugins/check_reverse_dns.py:103: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead rev_host = str(dns_resolver.query(ptr_record, "PTR")[0]).rstrip('.')/u [13:09:16] /nagios/plugins/check_reverse_dns.py:66: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead nameserversans = dns_resolver.query(root_domain, 'NS')/usr/lib/nagios/plugins/check_reverse_dns.py:79: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead cname = str(dns_resolver.query(hostname, 'CNAME')[0])Traceback (most recent call last): File "/usr/lib/nagios/plugins/check_reverse_dns.py", line 148, in ain() File "/usr/lib/nagios/plugins/check_reverse_dns.py", line 129, in main records = check_records(args.hostname) File "/usr/lib/nagios/plugins/check_reverse_dns.py", line 79, in check_records cname = str(dns_resolver.query(hostname, 'CNAME')[0]) File "/usr/lib/python3/dist-packages/dns/resolver.py", line 1089, in query return self.resolve(qname, rdtype, rdclass, tcp, source, File "/usr/lib/python3/dist-packages/dns/resolver.py", line 104 [13:09:16] resolve timeout = self._compute_timeout(start, lifetime) File "/usr/lib/python3/dist-packages/dns/resolver.py", line 950, in _compute_timeout raise Timeout(timeout=duration)dns.exception.Timeout: The DNS operation timed out after 5.408281564712524 seconds [13:13:41] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 24.50, 21.23, 20.09 [13:15:35] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.29, 20.54, 19.96 [13:16:09] RECOVERY - wikien.wildterra2.com - reverse DNS on sslhost is OK: 
/usr/lib/nagios/plugins/check_reverse_dns.py:101: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead resolved_ip_addr = str(dns_resolver.query(hostname, 'AAAA')[0])/usr/lib/nagios/plugins/check_reverse_dns.py:103: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead rev_host = str(dns_resolver.query(ptr_record, "PTR")[0]).rstrip('.')/usr/l [13:16:09] ios/plugins/check_reverse_dns.py:66: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead nameserversans = dns_resolver.query(root_domain, 'NS')/usr/lib/nagios/plugins/check_reverse_dns.py:79: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead cname = str(dns_resolver.query(hostname, 'CNAME')[0])SSL OK - wikien.wildterra2.com reverse DNS resolves to cp21.miraheze.org - CNAME OK [13:19:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [13:19:50] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:19:59] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [13:20:04] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [13:20:06] PROBLEM - cp21 Stunnel Http for mw11 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [13:20:19] PROBLEM - cp30 Stunnel Http for mw11 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[13:20:26] PROBLEM - cp31 Stunnel Http for mw8 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [13:20:41] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [13:21:07] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [13:21:29] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [13:21:32] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [13:21:34] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [13:21:47] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 2.242 second response time [13:21:57] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.059 second response time [13:22:03] RECOVERY - cp21 Stunnel Http for mw11 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.029 second response time [13:22:03] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.32, 6.86, 6.40 [13:22:13] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:22:16] RECOVERY - cp30 Stunnel Http for mw11 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.840 second response time [13:22:31] RECOVERY - cp31 Stunnel Http for mw8 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 8.627 second response time [13:22:40] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.605 second response time [13:22:53] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[13:23:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.00, 7.26, 6.55 [13:23:40] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 9.521 second response time [13:23:44] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.89, 7.19, 6.21 [13:24:14] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [13:24:16] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:24:27] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [13:25:02] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.70, 7.39, 6.22 [13:25:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.94, 7.89, 6.86 [13:25:33] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 2.055 second response time [13:25:33] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 2.210 second response time [13:25:42] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.35, 6.82, 6.19 [13:25:58] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.48, 6.48, 6.38 [13:26:19] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 6.661 second response time [13:26:19] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 6.939 second response time [13:26:22] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 0.453 second response time [13:26:24] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.718 second response time [13:26:47] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.642 second 
response time [13:27:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 3.45, 6.02, 5.88 [13:27:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.15, 7.63, 6.90 [13:27:16] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.011 second response time [13:28:55] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.26, 6.87, 6.51 [13:29:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [13:29:45] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [13:30:51] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.03, 7.09, 6.62 [13:31:06] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.46, 6.50, 6.62 [13:31:42] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 4.88, 6.03, 6.08 [13:32:47] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.42, 6.96, 6.64 [13:34:39] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 19.07, 19.45, 20.12 [13:34:44] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.45, 6.50, 6.51 [13:38:28] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.56, 20.29, 20.29 [13:38:40] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.80, 7.34, 6.90 [13:40:22] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 19.38, 19.99, 20.19 [13:41:10] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.52, 3.05, 2.76 [13:41:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.08, 6.85, 6.46 [13:42:33] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.20, 6.58, 6.72 [13:43:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.23, 8.09, 6.85 [13:43:06] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 1.97, 2.62, 2.64 [13:43:30] PROBLEM - mw12 
Current Load on mw12 is WARNING: WARNING - load average: 7.73, 7.19, 6.66 [13:43:40] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 4.61, 5.75, 6.09 [13:43:51] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.52, 7.70, 6.88 [13:44:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.39, 21.97, 21.01 [13:45:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.56, 7.45, 6.77 [13:45:28] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.70, 7.85, 6.97 [13:45:45] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.53, 6.79, 6.64 [13:49:38] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.73, 7.17, 6.82 [13:50:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.97, 22.79, 21.50 [13:50:21] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [13:50:32] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:50:33] PROBLEM - cp30 Stunnel Http for mw8 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[13:51:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 6.58, 6.65, 6.60 [13:51:10] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 198.244.148.90/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb [13:51:20] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.82, 7.73, 7.27 [13:51:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb [13:51:32] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.20, 7.76, 7.09 [13:52:35] RECOVERY - cp30 Stunnel Http for mw8 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14554 bytes in 5.904 second response time [13:53:27] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.69, 7.39, 7.03 [13:54:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.87, 22.77, 21.81 [13:54:24] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.598 second response time [13:54:42] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 2.052 second response time [13:55:15] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.25, 7.82, 7.39 [13:57:13] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.76, 7.94, 7.51 [14:00:29] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:00:36] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:00:39] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[14:00:40] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:00:49] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:02:17] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.05, 5.14, 4.70 [14:02:30] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 3.616 second response time [14:02:34] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 2.323 second response time [14:02:36] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 0.530 second response time [14:02:40] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 0.378 second response time [14:02:45] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 0.008 second response time [14:02:47] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [14:03:05] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.27, 6.11, 6.79 [14:03:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.94, 6.36, 6.75 [14:03:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [14:04:15] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.83, 4.68, 4.58 [14:12:17] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.17, 6.53, 5.94 [14:12:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.10, 21.80, 21.52 [14:14:15] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.53, 6.88, 6.15 [14:14:18] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.91, 21.82, 21.54 [14:14:27] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 
2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [14:14:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.18, 3.69, 3.03 [14:16:13] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.12, 6.25, 6.01 [14:16:24] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [14:16:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.56, 3.53, 3.05 [14:18:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.39, 3.11, 2.95 [14:19:59] [puppet] Reception123 commented on pull request #2279: Revert all OOM mitigations - https://git.io/J9qEO [14:20:06] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.88, 6.59, 6.19 [14:20:07] ^ paladox any objections? [14:20:39] No, but we are using bullseye for the new MW cluster, so if we're reverting that kind of blocks the migration to the new cluster [14:20:48] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.96, 7.43, 6.45 [14:21:11] paladox: yeah, though didn't we find out that both bullseye and buster were affected? [14:21:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 4 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb [14:21:54] well you did say there were some OOMs on buster [14:22:11] yeah, very frequent ones [14:22:26] though iirc the graphs looked worse for mw8?
[14:22:44] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.28, 7.02, 6.43 [14:22:52] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.17, 5.25, 4.87 [14:23:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [14:23:39] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.14, 7.47, 6.76 [14:24:01] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 6.03, 6.18, 6.12 [14:24:40] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 4.54, 6.45, 6.31 [14:24:46] yeh [14:25:33] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.28, 6.99, 6.64 [14:29:22] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.72, 6.67, 6.58 [14:29:34] [puppet] paladox reviewed pull request #2279 commit - https://git.io/J9qEQ [14:29:35] [puppet] paladox reviewed pull request #2279 commit - https://git.io/J9qE7 [14:30:04] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 149.56.141.75/cpweb [14:30:44] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 4.12, 4.99, 4.96 [14:31:29] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.65, 7.41, 7.16 [14:32:12] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/J9quv [14:32:14] [miraheze/puppet] paladox ff02ed9 - mediawiki::php: Unset request_terminate_timeout_track_finished [14:33:12] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 10.77, 8.44, 7.29 [14:33:26] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.75, 8.20, 7.49 [14:34:18] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 30.07, 24.85, 23.12 [14:34:45] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[14:35:00] PROBLEM - mw8 MediaWiki Rendering on mw8 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:35:00] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:35:01] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:35:03] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 4 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [14:35:11] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:35:16] PROBLEM - cp21 Stunnel Http for mw8 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:35:30] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.006 second response time [14:35:32] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:35:59] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:36:05] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:36:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 17.83, 22.67, 22.58 [14:36:21] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:36:42] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.318 second response time [14:36:53] PROBLEM - cp21 Varnish Backends on cp21 is CRITICAL: 5 backends are down. 
mw8 mw9 mw10 mw12 mw13 [14:36:57] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14570 bytes in 0.008 second response time [14:36:59] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.006 second response time [14:37:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.54, 7.21, 7.09 [14:37:09] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 1.627 second response time [14:37:20] PROBLEM - cp30 Varnish Backends on cp30 is CRITICAL: 1 backends are down. mw11 [14:37:21] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.91, 7.82, 7.53 [14:37:25] PROBLEM - cp20 Varnish Backends on cp20 is CRITICAL: 2 backends are down. mw11 mw13 [14:37:32] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 5.098 second response time [14:37:34] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 4.689 second response time [14:37:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [14:37:56] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.349 second response time [14:38:05] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.245 second response time [14:38:23] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 3.024 second response time [14:38:51] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [14:38:53] RECOVERY - cp21 Varnish Backends on cp21 is OK: All 18 backends are healthy [14:39:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.34, 7.89, 7.36 [14:39:07] RECOVERY - cp21 Stunnel Http for mw8 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 1.337 second response time [14:39:08] 
RECOVERY - mw8 MediaWiki Rendering on mw8 is OK: HTTP OK: HTTP/1.1 200 OK - 23663 bytes in 1.456 second response time [14:39:19] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.87, 8.28, 7.73 [14:39:20] RECOVERY - cp30 Varnish Backends on cp30 is OK: All 18 backends are healthy [14:39:26] RECOVERY - cp20 Varnish Backends on cp20 is OK: All 18 backends are healthy [14:41:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.25, 7.07, 7.11 [14:41:16] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.08, 7.32, 7.44 [14:42:39] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 5 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [14:43:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.38, 6.79, 6.37 [14:43:48] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.43, 3.39, 3.19 [14:44:32] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [14:44:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [14:45:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.99, 6.03, 6.67 [14:45:34] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [14:45:40] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 6.38, 6.45, 6.29 [14:45:43] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.67, 3.09, 3.10 [14:48:18] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 26.33, 22.82, 22.29 [14:49:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [14:50:13] PROBLEM - ns2 GDNSD Datacenters 
on ns2 is CRITICAL: CRITICAL - 5 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::5ebc/cpweb [14:50:33] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.17, 6.89, 6.56 [14:50:54] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:51:00] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:51:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.61, 7.46, 7.10 [14:51:15] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:51:16] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:51:34] [puppet] RhinosF1 commented on pull request #2279: Revert all OOM mitigations - https://git.io/J9quD [14:51:35] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:52:13] [puppet] RhinosF1 edited a comment on pull request #2279: Revert all OOM mitigations - https://git.io/J9quD [14:52:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.87, 22.57, 22.33 [14:52:31] paladox, Reception123: ^ [14:53:06] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 9.344 second response time [14:53:11] the issue CosmicAlpha is having is the db is read only thus no cache can be saved in the object table. [14:53:18] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 8.160 second response time [14:53:21] that being parser.
So some pages will take a long time to load [14:53:21] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 7.120 second response time [14:53:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [14:53:37] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.724 second response time [14:53:41] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.09, 6.47, 6.21 [14:53:56] but also with the kind of traffic we have 12 php childs is very low. [14:54:16] i did tell you that it would impact performance too. [14:54:24] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 4.33, 5.82, 6.23 [14:55:03] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.314 second response time [14:55:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.43, 7.29, 6.55 [14:56:43] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:56:58] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 3.61, 5.70, 6.72 [14:57:16] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:57:33] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 198.244.148.90/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [14:59:07] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.57, 5.87, 6.54 [14:59:16] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.76, 5.78, 5.09 [14:59:29] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:59:33] PROBLEM - cp30 Stunnel Http for mw11 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[14:59:43] PROBLEM - cp21 Stunnel Http for mw11 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:59:47] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [14:59:49] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [14:59:59] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.44, 6.95, 6.19 [15:00:15] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:00:44] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.183 second response time [15:01:20] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.471 second response time [15:01:28] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.324 second response time [15:01:33] RECOVERY - cp30 Stunnel Http for mw11 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.755 second response time [15:01:39] RECOVERY - cp21 Stunnel Http for mw11 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.030 second response time [15:01:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.01, 7.52, 6.91 [15:01:45] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 1.804 second response time [15:01:47] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.336 second response time [15:01:57] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.29, 7.04, 6.32 [15:01:59] RhinosF1: is https://github.com/miraheze/mw-config/pull/4349 ready to merge then? 
[15:02:00] [url] Shell: cap memoryLimit by RhinosF1 · Pull Request #4349 · miraheze/mw-config · GitHub | github.com [15:02:06] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.61, 7.17, 6.57 [15:02:15] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.848 second response time [15:02:29] Reception123: yes [15:03:11] paladox: parser cache being empty won't help much. We can't up workers without risking OOMs unless we get more memory. [15:03:26] php 7.4 needs to be debugged [15:03:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [15:03:38] i keep saying that. It's got a memory leak. [15:03:53] [mw-config] Reception123 closed pull request #4349: Shell: cap memoryLimit - https://git.io/JSbI4 [15:03:55] [miraheze/mw-config] Reception123 pushed 1 commit to master [+0/-0/±2] https://git.io/J9qz3 [15:03:55] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 4.88, 6.37, 6.17 [15:03:56] [miraheze/mw-config] RhinosF1 aa91682 - Shell: cap memoryLimit (#4349) [15:03:58] [miraheze/mw-config] Reception123 deleted branch RhinosF1-patch-2 [15:03:59] [mw-config] Reception123 deleted branch RhinosF1-patch-2 - https://git.io/vbvb3 [15:04:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.51, 6.82, 6.51 [15:04:42] paladox: with the old settings. We were risking OOMs from it trying to use what it was configured for and being exhausted anyway [15:05:04] miraheze/mw-config - Reception123 the build passed. [15:05:06] It was configured to allocate 4x the memory that was available [15:05:16] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.85, 4.91, 4.94 [15:05:48] you saw mw8 and you saw the other mw9. mw8 php was using a hell of a lot more ram than mw8.
[15:05:56] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.72, 6.62, 6.49 [15:06:04] Yes it's far worse on bullseye [15:06:14] *mw9 [15:06:18] But since we've only allocated memory we had [15:06:23] yes because it had A memory leak [15:06:23] It's not been an issue [15:06:30] there is a memory leak in PHP 7.4 [15:07:08] well if there were also OOMs on buster isn't the only explanation that there was also a leak on php 7.3 but somehow it's worse with 7.4? [15:07:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb [15:07:40] Reception123: as I've said 50 million times, we were allocating 4x the memory at max usage than we have [15:07:42] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 6.54, 6.68, 6.75 [15:08:24] You cant allow a process to use more memory than available then complain when it OOMs [15:09:50] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.98, 7.27, 6.88 [15:11:14] !log [@mw11] starting deploy of {'config': True} to ovlon [15:11:29] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:11:31] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [15:11:31] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:11:33] !log [@mw11] finished deploy of {'config': True} to ovlon - SUCCESS in 19s [15:11:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 9.44, 8.02, 7.26 [15:11:47] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.52, 6.46, 6.62 [15:11:49] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:11:53] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.46, 3.34, 3.21 [15:11:54] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:11:57] [url] Tech:Server admin log 
- Miraheze Meta | meta.miraheze.org [15:12:10] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 4.96, 5.23, 5.10 [15:12:11] RhinosF1: uh mw8 has 4gb of ram. You couldn't even do 16 childs (it had to be 12 before the OOMs stopped). So whilst 32 may have been a bit too high you can compare mw8 to mw9. mw8 showed it used a hell of a lot more ram. This is a memory leak. A memory leak is when the ram usage just keeps building up until it OOMs. [15:12:39] you can look at https://grafana.miraheze.org/d/UzwQVt1mk/php-fpm-appservers?orgId=1&var-host=mw8.miraheze.org:9253&viewPanel=18 and see that there's barely any idle, which will impact performance because they are all running (childs). [15:13:01] https://grafana.miraheze.org/d/UzwQVt1mk/php-fpm-appservers?orgId=1&var-host=mw9.miraheze.org:9253&viewPanel=18 shows the same [15:13:18] mw10 and so on also show the same [15:13:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [15:13:33] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [15:13:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.25, 6.94, 6.96 [15:13:49] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.03, 2.88, 3.05 [15:14:18] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 27.62, 23.20, 21.97 [15:14:25] paladox: 32 childs * 0.5GB mem per child (as before) > 4GB ram [15:14:55] 12 * 0.5 is using 6gb of ram.
[15:15:12] paladox: it's now 0.25GB [15:15:31] 3gb of ram [15:15:35] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.69, 7.29, 6.77 [15:15:40] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.01, 6.28, 6.71 [15:15:57] paladox: + opcache and stuff [15:16:06] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.70, 5.36, 5.15 [15:16:08] mean we fit right into 4GB from just php [15:16:17] https://grafana.miraheze.org/d/W9MIkA7iz/miraheze-cluster?orgId=1&var-job=node&var-node=mw8.miraheze.org&var-port=9100&viewPanel=78 doesn't look like it uses 3gb of ram to me. [15:16:31] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.35, 6.57, 6.28 [15:17:29] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.37, 7.08, 6.75 [15:17:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 4 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb [15:17:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [15:17:38] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.12, 7.20, 6.87 [15:17:43] !log [@test101] starting deploy of {'config': True} to skip [15:17:44] !log [@test101] finished deploy of {'config': True} to skip - SUCCESS in 0s [15:17:46] paladox: yes because not every request will use 100% available memory to it [15:17:48] mw8 was running perfectly with 32 according to the graph (i'm not justifying 32 i'm saying there's a leak. You are just putting in workarounds around a leak) Hell even if we set like 20 on mw8 and mw9. mw8 would oom whilst i think mw9 would still run without OOMs. 
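The worker-count arithmetic argued over in the messages above can be checked with a small sketch. It multiplies the PHP-FPM child count by the per-child memory limit and compares the result with the host's RAM. The figures (4 GB on mw8, 32 or 12 children, 0.5 GB or 0.25 GB per child) are the ones quoted in the conversation. The function names are hypothetical, and the model assumes every child reaches its limit at the same time, which is a worst case rather than measured usage; opcache and other overhead are left out.

```python
# Worst-case PHP-FPM memory budget, using the numbers quoted in the log.
# Assumption: every child hits its per-request memory limit simultaneously.
# Function names are illustrative, not part of any real tool.

def worst_case_php_memory_gb(children: int, limit_per_child_gb: float) -> float:
    """Upper bound on PHP memory if all children reach their limit at once."""
    return children * limit_per_child_gb


def fits_in_ram(children: int, limit_per_child_gb: float, ram_gb: float) -> bool:
    """True when the worst-case PHP usage does not exceed the host's RAM."""
    return worst_case_php_memory_gb(children, limit_per_child_gb) <= ram_gb


HOST_RAM_GB = 4.0  # mw8, as stated in the conversation

# (children, per-child limit in GB) for the three settings mentioned
scenarios = [(32, 0.5), (12, 0.5), (12, 0.25)]

for children, limit in scenarios:
    total = worst_case_php_memory_gb(children, limit)
    verdict = "fits" if fits_in_ram(children, limit, HOST_RAM_GB) else "over budget"
    print(f"{children} children x {limit} GB = {total} GB on {HOST_RAM_GB} GB host: {verdict}")
```

Under this worst-case model only the 12 children at 0.25 GB setting stays inside 4 GB, which matches the "3gb of ram" figure in the log. It says nothing about whether a leak exists, since a leak shows up as usage growing over time rather than as a static bound.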
[15:17:50] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:18:04] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.62, 5.33, 5.16 [15:18:11] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:18:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 23.97, 23.58, 22.39 [15:18:21] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:18:26] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 6.11, 6.24, 6.19 [15:18:35] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:18:39] paladox: but mw9-13 were OOMing, just more rarely [15:18:44] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:19:05] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:19:08] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:19:11] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [15:19:15] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:19:35] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 6.19, 6.76, 6.74 [15:19:36] rarely is the point and the graph wasn't all over the point like mw8 was. Secondly if we set it to 20 it wouldn't have OOM'd the times it did. [15:19:39] i'm not denying there was a leak, I'm saying that it's inevitable it would OOM. Lots of software has leaks. The WMF have them often. It doesn't take everything down every half an hour. 
[15:19:48] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:20:06] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:20:07] ... [15:20:15] PROBLEM - wiki.fbpml.org - reverse DNS on sslhost is CRITICAL: /usr/lib/nagios/plugins/check_reverse_dns.py:101: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead resolved_ip_addr = str(dns_resolver.query(hostname, 'AAAA')[0])rDNS CRITICAL - wiki.fbpml.org All nameservers failed to answer the query. [15:20:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.61, 24.48, 22.87 [15:20:26] can we try raising it to 16-18? [15:20:32] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.314 second response time [15:20:36] you tried 16 but with a higher max connections? [15:21:07] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.445 second response time [15:21:12] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.322 second response time [15:21:18] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.28, 6.73, 6.72 [15:21:20] paladox: SCSVG has 4 more than OVLON [15:21:34] !log [@mwtask111] starting deploy of {'config': True} to scsvg [15:21:39] yes [15:21:40] !log [@mwtask111] finished deploy of {'config': True} to scsvg - SUCCESS in 6s [15:21:43] I said to CosmicAlpha this morning I have no issue with raising it to that too on mw9-13 [15:21:44] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:21:48] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:21:50] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [15:21:56] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:21:57] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 
bytes in 1.471 second response time [15:21:58] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:21:59] as they weren't as awful at OOMing [15:22:01] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [15:22:13] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.568 second response time [15:22:18] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.77, 6.93, 6.52 [15:22:24] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 0.019 second response time [15:23:59] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 7.01, 5.99, 5.46 [15:24:39] alerting : [FIRING:1] (mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [15:24:47] !log [@test3] starting deploy of {'config': True} to skip [15:24:48] !log [@test3] finished deploy of {'config': True} to skip - SUCCESS in 2s [15:25:09] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.14, 8.52, 7.48 [15:25:25] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.65, 7.47, 7.00 [15:25:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [15:25:51] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:26:01] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:26:03] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [15:26:05] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:26:10] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.43, 7.61, 6.85 [15:26:19] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:26:21] ... 
[15:27:20] PROBLEM - wiki.fbpml.org - reverse DNS on sslhost is WARNING: /usr/lib/nagios/plugins/check_reverse_dns.py:101: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead resolved_ip_addr = str(dns_resolver.query(hostname, 'AAAA')[0])/usr/lib/nagios/plugins/check_reverse_dns.py:103: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead rev_host = str(dns_resolver.query(ptr_record, "PTR")[0]).rstrip('.')/usr/lib/ [15:27:21] /plugins/check_reverse_dns.py:66: DeprecationWarning: please use dns.resolver.Resolver.resolve() instead nameserversans = dns_resolver.query(root_domain, 'NS')Traceback (most recent call last): File "/usr/lib/nagios/plugins/check_reverse_dns.py", line 148, in main() File "/usr/lib/nagios/plugins/check_reverse_dns.py", line 129, in main records = check_records(args.hostname) File "/usr/lib/nagios/plugins/check_reverse_dns.py", line [15:27:21] check_records nameserversans = dns_resolver.query(root_domain, 'NS') File "/usr/lib/python3/dist-packages/dns/resolver.py", line 1089, in query return self.resolve(qname, rdtype, rdclass, tcp, source, File "/usr/lib/python3/dist-packages/dns/resolver.py", line 1043, in resolve timeout = self._compute_timeout(start, lifetime) File "/usr/lib/python3/dist-packages/dns/resolver.py", line 950, in _compute_timeout raise Timeout(timeout=duration) [15:27:21] ception.Timeout: The DNS operation timed out after 5.406299591064453 seconds [15:27:23] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.34, 6.85, 6.82 [15:27:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [15:27:54] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.69, 5.71, 5.46 [15:28:05] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.76, 7.58, 6.96 [15:29:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.35, 7.45, 7.29 [15:29:20] RECOVERY - mw12 Current Load on mw12 is OK: OK - 
load average: 4.21, 6.00, 6.52 [15:29:39] alerting : [FIRING:2] (mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [15:30:00] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.25, 6.67, 6.69 [15:30:18] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.64, 23.21, 23.15 [15:32:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 27.56, 24.70, 23.69 [15:33:10] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.60, 3.53, 3.20 [15:33:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 2001:41d0:801:2000::1b80/cpweb, 2607:5300:201:3100::5ebc/cpweb [15:33:49] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.13, 4.36, 4.94 [15:34:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 19.00, 21.72, 22.67 [15:34:39] alerting : [FIRING:1] (mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [15:35:06] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.50, 5.92, 6.72 [15:35:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [15:39:21] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-4 [+0/-0/±1] 13https://git.io/J9qgx [15:39:22] [02miraheze/puppet] 07paladox 03dbff01a - raise php childs to 16 for old mw cluster [15:39:24] [02puppet] 07paladox created branch 03paladox-patch-4 - 13https://git.io/vbiAS [15:39:25] [02puppet] 07paladox opened pull request 03#2280: raise php childs to 16 for old mw cluster - 13https://git.io/J9qgp [15:39:43] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-4 [+0/-0/±1] 13https://git.io/J9q2e [15:39:45] [02miraheze/puppet] 07paladox 03d26f1e8 - Update mw9.yaml [15:39:46] [02puppet] 07paladox synchronize pull request 03#2280: raise php childs to 16 for old mw cluster - 13https://git.io/J9qgp [15:39:57] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-4 
[+0/-0/±1] 13https://git.io/J9q2f [15:39:58] [02miraheze/puppet] 07paladox 038b54631 - Update mw10.yaml [15:40:00] [02puppet] 07paladox synchronize pull request 03#2280: raise php childs to 16 for old mw cluster - 13https://git.io/J9qgp [15:40:12] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-4 [+0/-0/±1] 13https://git.io/J9q2k [15:40:13] [02miraheze/puppet] 07paladox 034631799 - Update mw11.yaml [15:40:15] [02puppet] 07paladox synchronize pull request 03#2280: raise php childs to 16 for old mw cluster - 13https://git.io/J9qgp [15:40:22] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-4 [+0/-0/±1] 13https://git.io/J9q2t [15:40:24] [02miraheze/puppet] 07paladox 03f1adb13 - Update mw12.yaml [15:40:25] [02puppet] 07paladox synchronize pull request 03#2280: raise php childs to 16 for old mw cluster - 13https://git.io/J9qgp [15:40:32] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-4 [+0/-0/±1] 13https://git.io/J9q2m [15:40:33] [02miraheze/puppet] 07paladox 038fe16fe - Update mw13.yaml [15:40:35] [02puppet] 07paladox synchronize pull request 03#2280: raise php childs to 16 for old mw cluster - 13https://git.io/J9qgp [15:40:35] RhinosF1: ^ [15:40:42] can i go ahead and deploy please? 
:) [15:40:53] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.75, 3.37, 3.29 [15:41:12] paladox: as long as a close eye is kept on mw8 [15:41:19] ok [15:41:21] if we have to move just that back down then do [15:41:23] thanks [15:41:26] hmm multiple scribunto internal errors [15:41:30] ye aware [15:41:31] [02puppet] 07paladox closed pull request 03#2280: raise php childs to 16 for old mw cluster - 13https://git.io/J9qgp [15:41:33] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±6] 13https://git.io/J9q2O [15:41:34] [02miraheze/puppet] 07paladox 03365b597 - raise php childs to 16 for old mw cluster (#2280) [15:41:36] [02puppet] 07paladox deleted branch 03paladox-patch-4 - 13https://git.io/vbiAS [15:41:38] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-4 [15:42:19] [02miraheze/mw-config] 07RhinosF1 pushed 031 commit to 03revert-4349-RhinosF1-patch-2 [+0/-0/±2] 13https://git.io/J9q2Z [15:42:20] [02miraheze/mw-config] 07RhinosF1 038aeab6e - Revert "Shell: cap memoryLimit (#4349)" [15:42:22] [02mw-config] 07RhinosF1 created branch 03revert-4349-RhinosF1-patch-2 - 13https://git.io/vbvb3 [15:42:23] [02mw-config] 07RhinosF1 opened pull request 03#4350: Revert "Shell: cap memoryLimit" - 13https://git.io/J9q2n [15:42:29] [02mw-config] 07RhinosF1 closed pull request 03#4350: Revert "Shell: cap memoryLimit" - 13https://git.io/J9q2n [15:42:30] [02mw-config] 07RhinosF1 deleted branch 03revert-4349-RhinosF1-patch-2 - 13https://git.io/vbvb3 [15:42:32] [02miraheze/mw-config] 07RhinosF1 deleted branch 03revert-4349-RhinosF1-patch-2 [15:42:33] [02miraheze/mw-config] 07RhinosF1 pushed 031 commit to 03master [+0/-0/±2] 13https://git.io/J9q2C [15:42:35] [02miraheze/mw-config] 07RhinosF1 03a25a708 - Revert "Shell: cap memoryLimit (#4349)" (#4350) [15:42:50] @Lakelimbo: resolved [15:43:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.75, 6.78, 6.69 [15:43:09] nice [15:43:33] miraheze/mw-config - RhinosF1 the build 
passed. [15:43:34] !log [@mw11] starting deploy of {'config': True} to ovlon [15:43:40] miraheze/mw-config - RhinosF1 the build passed. [15:44:28] !log [@mw11] finished deploy of {'config': True} to ovlon - SUCCESS in 53s [15:44:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [15:44:47] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.82, 3.85, 3.50 [15:44:50] !log restart php-fpm on old mw cluster [15:44:58] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:45:03] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [15:45:04] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:45:06] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.18, 6.06, 6.43 [15:45:13] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:45:14] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:45:15] ... 
[15:45:28] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:46:04] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.54, 6.64, 6.16 [15:46:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.22, 22.73, 22.41 [15:46:44] alerting : [FIRING:1] (!sre MediaWiki Exception Rate yes mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [15:47:28] !log [@test101] starting deploy of {'config': True} to skip [15:47:29] !log [@test101] finished deploy of {'config': True} to skip - SUCCESS in 0s [15:48:02] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:48:04] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [15:48:17] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.16, 7.40, 6.76 [15:48:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.18, 21.84, 22.12 [15:49:10] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 2607:5300:201:3100::5ebc/cpweb [15:49:16] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:49:23] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:49:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.38, 6.83, 6.26 [15:50:00] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.50, 7.54, 6.68 [15:50:02] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:50:16] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.14, 8.39, 7.08 [15:50:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 24.31, 22.97, 22.50 [15:50:35] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.56, 3.97, 3.74 [15:50:49] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 2607:5300:201:3100::929a/cpweb 
[15:51:07] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [15:51:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.03, 7.82, 7.00 [15:51:44] alerting : [FIRING:1] (mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [15:51:47] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.27, 7.68, 6.90 [15:51:50] !log [@mwtask111] starting deploy of {'config': True} to scsvg [15:51:56] !log [@mwtask111] finished deploy of {'config': True} to scsvg - SUCCESS in 6s [15:51:57] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:51:58] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.80, 7.98, 6.93 [15:52:01] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:52:01] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [15:52:07] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.11, 7.52, 6.98 [15:52:08] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:52:12] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:52:14] ... 
[15:52:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.97, 22.59, 22.42 [15:52:43] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [15:53:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.44, 7.76, 7.09 [15:53:44] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.81, 7.41, 6.90 [15:54:31] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/J9q2j [15:54:33] [02miraheze/puppet] 07paladox 038c4b180 - mediawiki: Only install vmtouch on buster [15:54:36] !log [@test3] starting deploy of {'config': True} to skip [15:54:37] !log [@test3] finished deploy of {'config': True} to skip - SUCCESS in 0s [15:54:57] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:55:02] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:55:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.28, 7.50, 7.05 [15:55:08] [url] Tech:Server admin log - Miraheze Meta | meta.miraheze.org [15:55:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.34, 7.67, 6.83 [15:55:48] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:56:25] <020AAL0BF> Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:56:31] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [15:56:57] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 198.244.148.90/cpweb, 2607:5300:201:3100::929a/cpweb [15:57:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.49, 7.84, 7.25 [15:57:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load 
average: 5.55, 7.05, 6.73 [15:57:40] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 10.60, 8.73, 7.50 [15:58:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.54, 4.22, 3.92 [15:58:58] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:59:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.15, 7.86, 7.29 [15:59:35] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.78, 6.52, 6.56 [15:59:48] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.55, 8.55, 7.58 [15:59:49] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.05, 7.66, 7.37 [15:59:58] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.40, 7.35, 7.38 [16:00:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.72, 23.06, 22.56 [16:01:06] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 9.664 second response time [16:01:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.90, 7.16, 7.11 [16:01:16] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 8.53, 6.59, 5.41 [16:01:43] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 5.26, 7.44, 7.29 [16:02:12] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [16:02:46] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [16:03:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.91, 7.39, 6.87 [16:05:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.46, 7.84, 7.40 [16:05:16] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 3.70, 5.53, 5.29 [16:05:34] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.59, 8.24, 7.63 [16:05:47] RECOVERY - mw13 
Current Load on mw13 is OK: OK - load average: 5.58, 5.98, 6.74 [16:06:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.46, 3.57, 3.88 [16:07:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.11, 7.71, 7.41 [16:07:29] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.31, 7.58, 7.47 [16:07:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.64, 7.60, 7.06 [16:07:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.01, 6.84, 7.02 [16:08:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.13, 3.84, 3.94 [16:09:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 11.17, 8.77, 7.82 [16:09:25] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.68, 7.74, 7.74 [16:09:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.86, 7.20, 6.98 [16:09:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.95, 6.95, 7.05 [16:09:43] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.76, 6.86, 6.90 [16:10:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.32, 23.42, 23.28 [16:10:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.58, 3.88, 3.95 [16:11:19] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.58, 8.29, 7.76 [16:11:23] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.52, 8.30, 7.94 [16:11:39] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.20, 7.37, 7.09 [16:11:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 10.70, 8.43, 7.58 [16:12:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.31, 3.92, 3.95 [16:13:16] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.98, 4.65, 4.98 [16:13:35] 
PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 9.45, 8.39, 7.50 [16:14:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.68, 3.83, 3.91 [16:16:03] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:16:14] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:16:20] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:16:22] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:16:27] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [16:16:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [16:17:24] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:17:26] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:17:29] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.80, 7.75, 7.40 [16:17:32] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[16:17:46] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:17:49] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:17:59] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.025 second response time [16:18:11] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.166 second response time [16:18:16] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.312 second response time [16:18:16] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.013 second response time [16:19:23] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.395 second response time [16:19:25] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.553 second response time [16:19:25] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.02, 7.65, 7.39 [16:19:27] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.033 second response time [16:19:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 5.49, 7.70, 7.64 [16:19:47] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 1.628 second response time [16:19:47] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.745 second response time [16:20:46] PROBLEM - wiki.wisrail.de - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.wisrail.de' expires in 15 day(s) (Sun 23 Jan 2022 16:18:35 GMT +0000). 
[16:21:21] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.66, 7.33, 7.32 [16:21:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.26, 7.70, 7.92 [16:23:18] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.55, 7.56, 7.39 [16:23:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.76, 7.52, 7.53 [16:23:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.62, 7.82, 7.91 [16:24:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 29.28, 24.78, 23.57 [16:24:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.75, 3.01, 3.39 [16:25:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.44, 7.76, 7.98 [16:25:52] [02miraheze/ssl] 07MirahezeSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/J9mkR [16:25:54] [02miraheze/ssl] 07MirahezeSSLBot 038bff55b - Bot: Update SSL cert for wiki.wisrail.de [16:26:06] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.58, 5.64, 5.16 [16:26:27] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:26:34] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:26:35] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:27:02] PROBLEM - wiki.wisrail.de - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.wisrail.de' expires in 15 day(s) (Sun 23 Jan 2022 16:18:35 GMT +0000). [16:27:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 12.65, 9.40, 8.54 [16:27:16] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[16:28:04] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 4.80, 5.37, 5.13 [16:28:21] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.161 second response time [16:28:30] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.652 second response time [16:28:33] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 1.521 second response time [16:29:14] PROBLEM - cp21 Stunnel Http for mw8 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:29:16] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.285 second response time [16:29:34] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:29:48] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[16:29:57] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:29:58] PROBLEM - mw8 MediaWiki Rendering on mw8 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:30:02] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.26, 4.68, 4.91 [16:31:09] RECOVERY - cp21 Stunnel Http for mw8 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.024 second response time [16:31:30] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.708 second response time [16:31:46] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 0.746 second response time [16:31:56] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 0.559 second response time [16:31:57] RECOVERY - mw8 MediaWiki Rendering on mw8 is OK: HTTP OK: HTTP/1.1 200 OK - 23663 bytes in 0.555 second response time [16:33:53] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [16:33:56] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 4.95, 5.11, 5.04 [16:34:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.95, 23.49, 23.81 [16:35:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [16:36:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 4.87, 6.46, 7.78 [16:38:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.37, 7.51, 8.00 [16:39:02] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.75, 7.23, 7.87 [16:39:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.61, 7.53, 7.94 [16:40:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.90, 7.41, 7.90 [16:41:02] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 10.00, 7.88, 8.00 [16:41:14] PROBLEM - mw13 Current 
Load on mw13 is WARNING: WARNING - load average: 5.50, 7.56, 7.99 [16:42:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 27.90, 24.13, 23.61 [16:43:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.42, 8.09, 8.12 [16:43:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 5 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [16:43:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.68, 8.10, 8.07 [16:43:42] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [16:44:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.75, 7.97, 8.02 [16:45:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [16:45:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.59, 7.82, 7.99 [16:45:39] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [16:46:44] PROBLEM - ping6 on mail121 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:307) [16:46:50] PROBLEM - mem121 ferm_active on mem121 is CRITICAL: connect to address 2a10:6740::6:308 port 5666: No route to hostconnect to host 2a10:6740::6:308 port 5666: No route to host [16:46:51] PROBLEM - db121 Current Load on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host [16:46:51] PROBLEM - es121 Disk Space on es121 is CRITICAL: connect to address 2a10:6740::6:303 port 5666: No route to hostconnect to host 2a10:6740::6:303 port 5666: No route to host [16:46:51] PROBLEM - ping6 on es121 is CRITICAL: CRITICAL - Destination Unreachable 
(2a10:6740::6:303) [16:46:53] PROBLEM - jobchron121 APT on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [16:46:55] PROBLEM - mem121 Disk Space on mem121 is CRITICAL: connect to address 2a10:6740::6:308 port 5666: No route to hostconnect to host 2a10:6740::6:308 port 5666: No route to host [16:46:55] PROBLEM - mw122 SSH on mw122 is CRITICAL: connect to address 2a10:6740::6:310 and port 22: No route to host [16:46:57] PROBLEM - cloud12 PowerDNS Recursor on cloud12 is CRITICAL: connect to address 2a10:6740::6:300 port 5666: No route to hostconnect to host 2a10:6740::6:300 port 5666: No route to host [16:46:58] PROBLEM - bast121 PowerDNS Recursor on bast121 is CRITICAL: connect to address 2a10:6740::6:301 port 5666: No route to hostconnect to host 2a10:6740::6:301 port 5666: No route to host [16:47:00] PROBLEM - gluster121 conntrack_table_size on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [16:47:01] PROBLEM - gluster121 Puppet on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [16:47:01] PROBLEM - mw122 MediaWiki Rendering on mw122 is CRITICAL: connect to address 2a10:6740::6:310 and port 443: No route to hostHTTP CRITICAL - Unable to open TCP socket [16:47:01] PROBLEM - mw122 Current Load on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [16:47:06] PROBLEM - db121 PowerDNS Recursor on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host [16:47:07] PROBLEM - mw122 Check Gluster Clients on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to 
host 2a10:6740::6:310 port 5666: No route to host [16:47:09] PROBLEM - gluster121 ferm_active on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [16:47:09] PROBLEM - graylog121 APT on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [16:47:10] PROBLEM - Host cloud12 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:300) [16:47:11] PROBLEM - Host es121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:303) [16:47:12] PROBLEM - gluster121 glusterd_volume on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [16:47:14] PROBLEM - Host mail121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:307) [16:47:14] PROBLEM - mail121 APT on mail121 is CRITICAL: connect to address 2a10:6740::6:307 port 5666: No route to hostconnect to host 2a10:6740::6:307 port 5666: No route to host [16:47:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.24, 7.35, 7.84 [16:47:15] PROBLEM - jobchron121 Current Load on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [16:47:16] PROBLEM - mw122 php-fpm on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [16:47:16] PROBLEM - mw122 Disk Space on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [16:47:19] PROBLEM - graylog121 Current Load on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [16:47:19] PROBLEM - graylog121 HTTPS 
on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 and port 443: No route to hostHTTP CRITICAL - Unable to open TCP socket [16:47:19] PROBLEM - Host phab121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:311) [16:47:19] PROBLEM - phab121 Disk Space on phab121 is CRITICAL: connect to address 2a10:6740::6:311 port 5666: No route to hostconnect to host 2a10:6740::6:311 port 5666: No route to host [16:47:21] PROBLEM - jobchron121 Puppet on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [16:47:21] PROBLEM - db121 Puppet on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host [16:47:21] PROBLEM - mw121 php-fpm on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [16:47:21] PROBLEM - mw121 PowerDNS Recursor on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [16:47:21] PROBLEM - bast121 NTP time on bast121 is CRITICAL: connect to address 2a10:6740::6:301 port 5666: No route to hostconnect to host 2a10:6740::6:301 port 5666: No route to host [16:47:22] PROBLEM - graylog121 ferm_active on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [16:47:22] PROBLEM - graylog121 Puppet on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [16:47:24] PROBLEM - jobchron121 Disk Space on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [16:47:25] PROBLEM - mem121 NTP time on mem121 is CRITICAL: connect 
to address 2a10:6740::6:308 port 5666: No route to hostconnect to host 2a10:6740::6:308 port 5666: No route to host [16:47:26] PROBLEM - bast121 Disk Space on bast121 is CRITICAL: connect to address 2a10:6740::6:301 port 5666: No route to hostconnect to host 2a10:6740::6:301 port 5666: No route to host [16:47:28] PROBLEM - cp20 Stunnel Http for mw121 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:47:29] PROBLEM - mw121 MediaWiki Rendering on mw121 is CRITICAL: connect to address 2a10:6740::6:309 and port 443: No route to hostHTTP CRITICAL - Unable to open TCP socket [16:47:29] PROBLEM - mw121 JobRunner Service on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [16:47:30] PROBLEM - gluster121 APT on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [16:47:32] PROBLEM - graylog121 NTP time on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [16:47:34] PROBLEM - gluster121 SSH on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 and port 22: No route to host [16:47:35] PROBLEM - bast121 Puppet on bast121 is CRITICAL: connect to address 2a10:6740::6:301 port 5666: No route to hostconnect to host 2a10:6740::6:301 port 5666: No route to host [16:47:35] PROBLEM - bast121 APT on bast121 is CRITICAL: connect to address 2a10:6740::6:301 port 5666: No route to hostconnect to host 2a10:6740::6:301 port 5666: No route to host [16:47:36] PROBLEM - graylog121 conntrack_table_size on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [16:47:38] PROBLEM - mem121 APT on mem121 is CRITICAL: connect to address 2a10:6740::6:308 port 5666: No route to 
hostconnect to host 2a10:6740::6:308 port 5666: No route to host [16:47:38] PROBLEM - mw122 ferm_active on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [16:47:39] PROBLEM - mw122 PowerDNS Recursor on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [16:47:39] PROBLEM - cp30 Varnish Backends on cp30 is CRITICAL: 2 backends are down. mw121 mw122 [16:47:39] PROBLEM - bast121 conntrack_table_size on bast121 is CRITICAL: connect to address 2a10:6740::6:301 port 5666: No route to hostconnect to host 2a10:6740::6:301 port 5666: No route to host [16:47:39] PROBLEM - ping6 on db121 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:302) [16:47:39] PROBLEM - mw121 Current Load on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [16:47:39] PROBLEM - graylog121 Disk Space on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [16:47:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.67, 6.79, 7.76 [16:47:42] PROBLEM - db121 Disk Space on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host [16:47:42] PROBLEM - ping6 on jobchron121 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:306) [16:47:42] PROBLEM - jobchron121 ferm_active on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [16:47:42] PROBLEM - Host jobchron121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:306) [16:47:42] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load 
average: 7.06, 5.73, 5.39 [16:47:50] @Owen: I hope the alerts are you at the DC [16:47:53] PROBLEM - cp31 Stunnel Http for mw122 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:47:59] PROBLEM - bast121 Current Load on bast121 is CRITICAL: connect to address 2a10:6740::6:301 port 5666: No route to hostconnect to host 2a10:6740::6:301 port 5666: No route to host [16:47:59] PROBLEM - cp30 Stunnel Http for mw121 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:48:00] PROBLEM - cp31 Varnish Backends on cp31 is CRITICAL: 2 backends are down. mw121 mw122 [16:48:03] PROBLEM - gluster121 NTP time on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [16:48:03] PROBLEM - gluster121 glusterd on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [16:48:04] PROBLEM - cp30 Stunnel Http for mw122 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:48:06] PROBLEM - cp21 Stunnel Http for mw122 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[16:48:06] PROBLEM - db121 conntrack_table_size on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host [16:48:06] PROBLEM - Host db121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:302) [16:48:11] I’ve just arrived, nothing has been touched yet [16:48:14] PROBLEM - bast121 ferm_active on bast121 is CRITICAL: connect to address 2a10:6740::6:301 port 5666: No route to hostconnect to host 2a10:6740::6:301 port 5666: No route to host [16:48:21] PROBLEM - ping6 on bast121 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:301) [16:48:29] @Owen: everything is down [16:48:30] PROBLEM - cp31 Stunnel Http for mw121 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:48:30] PROBLEM - gluster121 Current Load on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [16:48:30] PROBLEM - gluster121 Disk Space on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [16:48:31] PROBLEM - cp21 Stunnel Http for mw121 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:48:33] PROBLEM - cp20 Varnish Backends on cp20 is CRITICAL: 2 backends are down. mw121 mw122 [16:48:34] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:48:35] PROBLEM - cp21 Varnish Backends on cp21 is CRITICAL: 2 backends are down. mw121 mw122 [16:48:36] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:48:50] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[16:49:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.34, 7.45, 7.70 [16:49:03] PROBLEM - cp20 Stunnel Http for mw122 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:49:06] That's cloud12 [16:49:08] PROBLEM - Host gluster121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:304) [16:49:09] PROBLEM - Host bast121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:301) [16:49:15] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:49:34] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 2607:5300:201:3100::5ebc/cpweb [16:49:41] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.14, 7.01, 7.70 [16:49:46] Everything on cloud12 is down? Just that one? [16:49:51] cloud12 just failed ye [16:49:57] CRITICAL - Destination Unreachable (2a10:6740::6:300) [16:50:31] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.311 second response time [16:50:34] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.017 second response time [16:50:47] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.312 second response time [16:50:58] @Owen: ^ [16:51:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.70, 8.10, 7.90 [16:51:10] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.028 second response time [16:51:12] RECOVERY - Host cloud12 is UP: PING OK - Packet loss = 0%, RTA = 2.30 ms [16:51:15] RECOVERY - cloud12 PowerDNS Recursor on cloud12 is OK: DNS OK: 0.239 seconds response time. 
miraheze.org returns 198.244.148.90,2001:41d0:801:2000::4c25 [16:51:15] PROBLEM - ping6 on cloud12 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:300) [16:51:16] RECOVERY - ping6 on cloud12 is OK: PING OK - Packet loss = 0%, RTA = 2.60 ms [16:51:25] @Owen: back [16:51:30] PROBLEM - cloud12 SMART on cloud12 is CRITICAL: connect to address 2a10:6740::6:300 port 5666: No route to hostconnect to host 2a10:6740::6:300 port 5666: No route to host [16:51:33] RECOVERY - cloud12 SMART on cloud12 is OK: OK: [cciss,0] - Device is clean --- [cciss,1] - Device is clean --- [cciss,2] - Device is clean --- [cciss,3] - Device is clean --- [cciss,4] - Device is clean --- [cciss,5] - Device is clean --- [cciss,6] - Device is clean [16:51:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 10.31, 7.89, 7.75 [16:51:39] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 3.30, 5.37, 5.40 [16:51:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.49, 7.23, 7.70 [16:52:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.57, 23.01, 23.48 [16:52:47] PROBLEM - mw112 Puppet on mw112 is CRITICAL: CRITICAL: Puppet has 3 failures. Last run 3 minutes ago with 3 failures. Failed resources (up to 3 shown): Exec[git_pull_JobRunner],Exec[git_pull_mathoid],Exec[git_pull_3d2png] [16:52:50] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.54, 3.46, 3.27 [16:53:00] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:53:02] PROBLEM - cp30 Stunnel Http for mw11 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:53:04] PROBLEM - cp21 Stunnel Http for mw11 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[16:53:12] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:53:14] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 10.25, 8.26, 7.94 [16:53:27] Will take a look my end [16:53:57] Thanks [16:54:04] I'm off out now but paladox is online [16:54:45] 16:46-51 so about 5 minutes [16:54:47] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.85, 3.33, 3.23 [16:54:59] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.347 second response time [16:54:59] RECOVERY - cp30 Stunnel Http for mw11 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.336 second response time [16:55:01] RECOVERY - cp21 Stunnel Http for mw11 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.012 second response time [16:55:01] https://www.irccloud.com/pastebin/DedDVKPl/ [16:55:06] cloud12 got rebooted [16:55:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.07, 7.31, 7.91 [16:55:10] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.014 second response time [16:55:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.97, 7.78, 7.81 [16:55:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.69, 7.42, 7.65 [16:55:42] [02puppet] 07Universal-Omega reviewed pull request 03#2279 commit - 13https://git.io/J9YyU [16:55:52] [02puppet] 07Universal-Omega reviewed pull request 03#2279 commit - 13https://git.io/J9YyT [16:55:55] PROBLEM - mwtask111 Puppet on mwtask111 is CRITICAL: CRITICAL: Puppet has 6 failures. Last run 3 minutes ago with 6 failures. Failed resources (up to 3 shown): Exec[git_pull_JobRunner],Exec[git_pull_MediaWiki config],Exec[git_pull_landing],Exec[git_pull_ErrorPages] [16:56:43] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.73, 3.05, 3.13 [16:56:56] paladox: why? Syslog? 
[16:57:00] RECOVERY - cp31 Stunnel Http for mw121 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14548 bytes in 0.322 second response time [16:57:05] RECOVERY - Host mail121 is UP: PING OK - Packet loss = 0%, RTA = 2.71 ms [16:57:05] RECOVERY - mail121 APT on mail121 is OK: APT OK: 1 packages available for upgrade (0 critical updates). [16:57:05] PROBLEM - mail121 NTP time on mail121 is CRITICAL: connect to address 2a10:6740::6:307 port 5666: No route to hostconnect to host 2a10:6740::6:307 port 5666: No route to host PROBLEM - mail121 PowerDNS Recursor on mail121 is CRITICAL: connect to address 2a10:6740::6:307 port 5666: No route to hostconnect to host 2a10:6740::6:307 port 5666: No route to host [16:57:05] PROBLEM - mail121 HTTPS on mail121 is CRITICAL: connect to address 2a10:6740::6:307 and port 443: No route to hostHTTP CRITICAL - Unable to open TCP socket [16:57:05] RECOVERY - Host gluster121 is UP: PING OK - Packet loss = 0%, RTA = 1.71 ms [16:57:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.07, 7.98, 8.08 [16:57:07] RECOVERY - Host bast121 is UP: PING OK - Packet loss = 0%, RTA = 1.35 ms [16:57:10] RECOVERY - bast121 PowerDNS Recursor on bast121 is OK: DNS OK: 0.209 seconds response time. miraheze.org returns 149.56.140.43,2607:5300:201:3100::5ebc,2607:5300:201:3100::929a [16:57:13] RECOVERY - bast121 Disk Space on bast121 is OK: DISK OK - free space: / 15380 MB (87% inode=93%); [16:57:16] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.79, 8.13, 7.91 [16:57:17] RECOVERY - Host es121 is UP: PING OK - Packet loss = 0%, RTA = 1.43 ms [16:57:19] RECOVERY - bast121 APT on bast121 is OK: APT OK: 0 packages available for upgrade (0 critical updates). [16:57:19] RECOVERY - gluster121 APT on gluster121 is OK: APT OK: 0 packages available for upgrade (0 critical updates). 
[16:57:20] RECOVERY - bast121 Puppet on bast121 is OK: OK: Puppet is currently enabled, last run 2 seconds ago with 0 failures [16:57:20] PROBLEM - es121 NTP time on es121 is CRITICAL: connect to address 2a10:6740::6:303 port 5666: No route to hostconnect to host 2a10:6740::6:303 port 5666: No route to host [16:57:20] RECOVERY - es121 Disk Space on es121 is OK: DISK OK - free space: / 260773 MB (97% inode=99%); [16:57:20] PROBLEM - mail121 SMTP on mail121 is CRITICAL: connect to address 2a10:6740::6:307 and port 25: No route to hostSMTP CRITICAL - 0.389 sec. response time [16:57:20] RECOVERY - ping6 on es121 is OK: PING OK - Packet loss = 0%, RTA = 2.42 ms [16:57:21] RECOVERY - mail121 SMTP on mail121 is OK: SMTP OK - 1.598 sec. response time [16:57:23] RECOVERY - gluster121 SSH on gluster121 is OK: SSH OK - OpenSSH_8.4p1 Debian-5 (protocol 2.0) [16:57:24] RECOVERY - es121 NTP time on es121 is OK: NTP OK: Offset 0.09106829762 secs [16:57:24] RECOVERY - bast121 conntrack_table_size on bast121 is OK: OK: nf_conntrack is 0 % full [16:57:29] RECOVERY - bast121 NTP time on bast121 is OK: NTP OK: Offset 0.008796036243 secs [16:57:30] RECOVERY - cp20 Stunnel Http for mw122 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14548 bytes in 0.045 second response time [16:57:35] RECOVERY - gluster121 PowerDNS Recursor on gluster121 is OK: DNS OK: 0.342 seconds response time. miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80,2001:41d0:801:2000::4c25 [16:57:40] RECOVERY - Host graylog121 is UP: PING OK - Packet loss = 0%, RTA = 1.93 ms [16:57:40] RECOVERY - Host mem121 is UP: PING OK - Packet loss = 0%, RTA = 0.64 ms [16:57:40] RECOVERY - mem121 APT on mem121 is OK: APT OK: 0 packages available for upgrade (0 critical updates). 
[16:57:40] RECOVERY - graylog121 Puppet on graylog121 is OK: OK: Puppet is currently enabled, last run 30 minutes ago with 0 failures [16:57:40] RECOVERY - graylog121 conntrack_table_size on graylog121 is OK: OK: nf_conntrack is 0 % full [16:57:40] RECOVERY - graylog121 Disk Space on graylog121 is OK: DISK OK - free space: / 5303 MB (60% inode=86%); [16:57:40] RECOVERY - graylog121 NTP time on graylog121 is OK: NTP OK: Offset -0.0007925927639 secs RECOVERY - mem121 NTP time on mem121 is OK: NTP OK: Offset 0.4226583242 secs [16:57:41] PROBLEM - graylog121 Current Load on graylog121 is WARNING: WARNING - load average: 1.72, 0.51, 0.18 [16:57:41] RECOVERY - graylog121 APT on graylog121 is OK: APT OK: 1 packages available for upgrade (0 critical updates). [16:57:42] RECOVERY - graylog121 ferm_active on graylog121 is OK: OK ferm input default policy is set [16:57:42] RECOVERY - mem121 Current Load on mem121 is OK: OK - load average: 0.77, 0.30, 0.10 [16:57:45] RECOVERY - mem121 SSH on mem121 is OK: SSH OK - OpenSSH_8.4p1 Debian-5 (protocol 2.0) [16:57:45] PROBLEM - mail121 IMAP on mail121 is CRITICAL: connect to address 2a10:6740::6:307 and port 143: No route to host [16:57:47] RECOVERY - ping6 on gluster121 is OK: PING OK - Packet loss = 0%, RTA = 1.83 ms [16:57:50] RECOVERY - bast121 SSH on bast121 is OK: SSH OK - OpenSSH_8.4p1 Debian-5 (protocol 2.0) [16:57:53] RECOVERY - Host jobchron121 is UP: PING OK - Packet loss = 0%, RTA = 1.38 ms [16:57:54] RECOVERY - cp20 Stunnel Http for mw121 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.018 second response time [16:57:54] RECOVERY - Host mw121 is UP: PING OK - Packet loss = 0%, RTA = 1.24 ms [16:57:55] RECOVERY - bast121 Current Load on bast121 is OK: OK - load average: 0.37, 0.17, 0.06 [16:57:55] RECOVERY - mw121 ferm_active on mw121 is OK: OK ferm input default policy is set [16:57:55] RECOVERY - jobchron121 Disk Space on jobchron121 is OK: DISK OK - free space: / 6275 MB (71% inode=86%); [16:57:55] RECOVERY 
- mw121 Current Load on mw121 is OK: OK - load average: 0.39, 0.12, 0.04 [16:57:55] PROBLEM - es121 APT on es121 is CRITICAL: connect to address 2a10:6740::6:303 port 5666: No route to hostconnect to host 2a10:6740::6:303 port 5666: No route to host [16:57:55] RECOVERY - mw121 JobRunner Service on mw121 is OK: PROCS OK: 1 process with args 'redisJobRunnerService' [16:57:56] RECOVERY - jobchron121 Puppet on jobchron121 is OK: OK: Puppet is currently enabled, last run 41 minutes ago with 0 failures PROBLEM - mw121 Check Gluster Clients on mw121 is CRITICAL: PROCS CRITICAL: 0 processes with args '/usr/sbin/glusterfs' [16:57:56] RECOVERY - mw121 PowerDNS Recursor on mw121 is OK: DNS OK: 0.282 seconds response time. miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80 [16:57:57] RECOVERY - mw121 SSH on mw121 is OK: SSH OK - OpenSSH_8.4p1 Debian-5 (protocol 2.0) [16:57:57] RECOVERY - jobchron121 APT on jobchron121 is OK: APT OK: 0 packages available for upgrade (0 critical updates). [16:58:00] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [16:58:09] RECOVERY - bast121 ferm_active on bast121 is OK: OK ferm input default policy is set [16:58:09] RECOVERY - Host db121 is UP: PING OK - Packet loss = 0%, RTA = 1.63 ms [16:58:10] RECOVERY - db121 MariaDB on db121 is OK: Uptime: 69 Threads: 2 Questions: 1 Slow queries: 0 Opens: 16 Open tables: 10 Queries per second avg: 0.014 [16:58:10] RECOVERY - db121 Disk Space on db121 is OK: DISK OK - free space: / 324140 MB (52% inode=99%); [16:58:10] RECOVERY - db121 ferm_active on db121 is OK: OK ferm input default policy is set [16:58:10] RECOVERY - ping6 on db121 is OK: PING OK - Packet loss = 0%, RTA = 1.29 ms [16:58:10] RECOVERY - db121 conntrack_table_size on db121 is OK: OK: 
nf_conntrack is 0 % full [16:58:10] RECOVERY - mail121 HTTPS on mail121 is OK: HTTP OK: HTTP/1.1 301 Moved Permanently - 427 bytes in 0.020 second response time [16:58:13] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [16:58:13] RECOVERY - cp31 Stunnel Http for mw122 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.317 second response time [16:58:20] RECOVERY - cp30 Stunnel Http for mw121 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.323 second response time [16:58:20] PROBLEM - ping6 on graylog121 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:305) [16:58:20] PROBLEM - jobchron121 NTP time on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [16:58:20] RECOVERY - mail121 PowerDNS Recursor on mail121 is OK: DNS OK: 1.742 second response time. miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80,2001:41d0:801:2000::4c25,51.195.220.68 [16:58:22] RECOVERY - ping6 on graylog121 is OK: PING OK - Packet loss = 0%, RTA = 0.83 ms [16:58:22] RECOVERY - ping6 on bast121 is OK: PING OK - Packet loss = 0%, RTA = 0.84 ms [16:58:23] RECOVERY - jobchron121 NTP time on jobchron121 is OK: NTP OK: Offset 0.4394352138 secs [16:58:27] RECOVERY - gluster121 Current Load on gluster121 is OK: OK - load average: 0.81, 0.46, 0.18 [16:58:28] RECOVERY - gluster121 Disk Space on gluster121 is OK: DISK OK - free space: / 407949 MB (50% inode=94%); [16:58:29] RECOVERY - cp21 Varnish Backends on cp21 is OK: All 18 backends are healthy [16:58:30] PROBLEM - jobchron121 Redis Process on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [16:58:30] RECOVERY - jobchron121 Redis Process on jobchron121 is OK: PROCS OK: 1 process with args 'redis-server' [16:58:33] RECOVERY - cp21 Stunnel Http 
for mw122 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.017 second response time [16:58:34] RECOVERY - mail121 NTP time on mail121 is OK: NTP OK: Offset -0.002410620451 secs [16:58:35] PROBLEM - graylog121 PowerDNS Recursor on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [16:58:35] PROBLEM - mw121 NTP time on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [16:58:35] RECOVERY - graylog121 PowerDNS Recursor on graylog121 is OK: DNS OK: 0.479 seconds response time. miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80,2001:41d0:801:2000::4c25,51.195.220.68 [16:58:40] RECOVERY - ping6 on mail121 is OK: PING OK - Packet loss = 0%, RTA = 1.32 ms [16:58:40] PROBLEM - mw122 NTP time on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [16:58:41] RECOVERY - mw121 NTP time on mw121 is OK: NTP OK: Offset 0.001346051693 secs [16:58:42] RECOVERY - mw122 NTP time on mw122 is OK: NTP OK: Offset -0.0004119873047 secs [16:58:43] RECOVERY - mem121 Disk Space on mem121 is OK: DISK OK - free space: / 6483 MB (73% inode=86%); [16:58:43] RECOVERY - cp30 Stunnel Http for mw122 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.336 second response time [16:58:44] RECOVERY - mem121 ferm_active on mem121 is OK: OK ferm input default policy is set [16:58:44] RECOVERY - cp21 Stunnel Http for mw121 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14548 bytes in 0.015 second response time [16:58:44] RECOVERY - mw122 MediaWiki Rendering on mw122 is OK: HTTP OK: HTTP/1.1 200 OK - 23669 bytes in 0.190 second response time [16:58:44] RECOVERY - mw122 Current Load on mw122 is OK: OK - load average: 0.65, 0.32, 0.12 [16:58:45] PROBLEM - db121 NTP time on db121 is CRITICAL: CHECK_NRPE STATE 
CRITICAL: Socket timeout after 10 seconds. [16:58:48] RECOVERY - gluster121 conntrack_table_size on gluster121 is OK: OK: nf_conntrack is 0 % full [16:58:49] RECOVERY - gluster121 ferm_active on gluster121 is OK: OK ferm input default policy is set [16:58:49] RECOVERY - gluster121 Puppet on gluster121 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [16:58:50] RECOVERY - gluster121 glusterd_volume on gluster121 is OK: PROCS OK: 1 process with args '/usr/sbin/glusterfsd' [16:58:50] RECOVERY - db121 PowerDNS Recursor on db121 is OK: DNS OK: 0.033 seconds response time. miraheze.org returns 2001:41d0:801:2000::1b80,51.195.220.68 [16:58:50] RECOVERY - db121 Current Load on db121 is OK: OK - load average: 0.23, 0.13, 0.05 [16:58:51] RECOVERY - db121 NTP time on db121 is OK: NTP OK: Offset 0.0002303123474 secs [16:58:55] RECOVERY - jobchron121 Current Load on jobchron121 is OK: OK - load average: 0.35, 0.27, 0.11 [16:59:00] RECOVERY - mw122 SSH on mw122 is OK: SSH OK - OpenSSH_8.4p1 Debian-5 (protocol 2.0) [16:59:00] RECOVERY - mw122 php-fpm on mw122 is OK: PROCS OK: 17 processes with command name 'php-fpm7.4' [16:59:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.13, 7.68, 7.84 [16:59:04] RECOVERY - Host phab121 is UP: PING OK - Packet loss = 0%, RTA = 1.97 ms [16:59:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.98, 7.38, 7.86 [16:59:10] RECOVERY - phab121 Disk Space on phab121 is OK: DISK OK - free space: / 23662 MB (89% inode=94%); [16:59:10] RECOVERY - db121 Puppet on db121 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [16:59:17] RECOVERY - graylog121 Current Load on graylog121 is OK: OK - load average: 1.25, 0.94, 0.39 [16:59:17] RECOVERY - graylog121 HTTPS on graylog121 is OK: HTTP OK: HTTP/1.1 200 OK - 1418 bytes in 0.321 second response time [16:59:20] RECOVERY - cp30 Varnish Backends on cp30 is OK: All 18 backends are healthy [16:59:25] nothing 
in syslog RhinosF1 [16:59:30] last i see is: [16:59:33] Jan 7 16:42:29 cloud12 systemd[1]: prometheus-node-exporter-apt.service: Consumed 1.368s CPU time. [16:59:50] RECOVERY - mail121 IMAP on mail121 is OK: IMAP OK - 0.012 second response time on 2a10:6740::6:307 port 143 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ STARTTLS LOGINDISABLED] Dovecot (Debian) ready.] [16:59:50] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [16:59:59] RECOVERY - cp31 Varnish Backends on cp31 is OK: All 18 backends are healthy [17:00:06] RECOVERY - cp20 Varnish Backends on cp20 is OK: All 18 backends are healthy [17:00:14] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 5.010 second response time [17:00:50] did cloud12 go down again? [17:00:52] i cannot load [17:00:55] Cloud12 down is me [17:01:05] ah [17:01:06] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.41, 8.32, 8.14 [17:01:28] RECOVERY - wiki.wisrail.de - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.wisrail.de' will expire on Thu 07 Apr 2022 15:25:46 GMT +0000. [17:01:30] I can’t get its second PSU working so I need power off [17:01:44] gonna stop the replicas on db101 and 111 [17:01:53] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 5.631 second response time [17:02:15] RECOVERY - wiki.wisrail.de - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.wisrail.de' will expire on Thu 07 Apr 2022 15:25:46 GMT +0000. 
[17:02:25] PROBLEM - jobchron121 NTP time on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [17:02:27] PROBLEM - ping6 on graylog121 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:305) [17:02:28] PROBLEM - ping6 on bast121 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:301) [17:02:30] PROBLEM - db121 SSH on db121 is CRITICAL: connect to address 2a10:6740::6:302 and port 22: No route to host [17:02:30] PROBLEM - cloud12 Disk Space on cloud12 is CRITICAL: connect to address 2a10:6740::6:300 port 5666: No route to hostconnect to host 2a10:6740::6:300 port 5666: No route to host [17:02:33] PROBLEM - mw121 APT on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [17:02:34] PROBLEM - mem121 Puppet on mem121 is CRITICAL: connect to address 2a10:6740::6:308 port 5666: No route to hostconnect to host 2a10:6740::6:308 port 5666: No route to host [17:02:36] PROBLEM - jobchron121 Redis Process on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [17:02:37] PROBLEM - mw122 JobRunner Service on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [17:02:37] PROBLEM - gluster121 Current Load on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [17:02:38] PROBLEM - mem121 memcached on mem121 is CRITICAL: connect to address 2a10:6740::6:308 and port 11211: No route to host [17:02:39] PROBLEM - jobchron121 SSH on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 and port 22: No route to host [17:02:41] PROBLEM - cloud12 conntrack_table_size on cloud12 is CRITICAL: 
connect to address 2a10:6740::6:300 port 5666: No route to hostconnect to host 2a10:6740::6:300 port 5666: No route to host [17:02:42] PROBLEM - gluster121 Disk Space on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [17:02:43] Network will completely go in a short while as I replace the cables for the switch [17:02:44] PROBLEM - ping6 on mail121 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:307) [17:02:45] PROBLEM - mail121 NTP time on mail121 is CRITICAL: connect to address 2a10:6740::6:307 port 5666: No route to hostconnect to host 2a10:6740::6:307 port 5666: No route to host [17:02:49] PROBLEM - phab121 NTP time on phab121 is CRITICAL: connect to address 2a10:6740::6:311 port 5666: No route to hostconnect to host 2a10:6740::6:311 port 5666: No route to host [17:02:50] PROBLEM - graylog121 PowerDNS Recursor on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [17:02:51] PROBLEM - mw121 NTP time on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [17:02:52] PROBLEM - mw122 NTP time on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [17:02:52] PROBLEM - mem121 Disk Space on mem121 is CRITICAL: connect to address 2a10:6740::6:308 port 5666: No route to hostconnect to host 2a10:6740::6:308 port 5666: No route to host [17:02:53] PROBLEM - mem121 ferm_active on mem121 is CRITICAL: connect to address 2a10:6740::6:308 port 5666: No route to hostconnect to host 2a10:6740::6:308 port 5666: No route to host [17:02:54] PROBLEM - cloud12 PowerDNS Recursor on cloud12 is CRITICAL: connect to address 2a10:6740::6:300 port 5666: No route to hostconnect to host 2a10:6740::6:300 port 
5666: No route to host [17:02:54] PROBLEM - bast121 PowerDNS Recursor on bast121 is CRITICAL: connect to address 2a10:6740::6:301 port 5666: No route to hostconnect to host 2a10:6740::6:301 port 5666: No route to host [17:02:54] PROBLEM - gluster121 glusterd_volume on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [17:02:54] PROBLEM - gluster121 Puppet on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [17:02:54] PROBLEM - gluster121 ferm_active on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [17:02:55] PROBLEM - db121 PowerDNS Recursor on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host [17:02:55] PROBLEM - db121 NTP time on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host [17:02:56] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:02:56] PROBLEM - mw122 MediaWiki Rendering on mw122 is CRITICAL: connect to address 2a10:6740::6:310 and port 443: No route to hostHTTP CRITICAL - Unable to open TCP socket [17:02:56] PROBLEM - ping6 on mw122 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:310) [17:02:56] PROBLEM - mw122 Current Load on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [17:02:57] PROBLEM - cp21 Stunnel Http for mw122 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[17:02:59] PROBLEM - mail121 APT on mail121 is CRITICAL: connect to address 2a10:6740::6:307 port 5666: No route to hostconnect to host 2a10:6740::6:307 port 5666: No route to host [17:02:59] PROBLEM - graylog121 APT on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [17:03:00] PROBLEM - gluster121 conntrack_table_size on gluster121 is CRITICAL: connect to address 2a10:6740::6:304 port 5666: No route to hostconnect to host 2a10:6740::6:304 port 5666: No route to host [17:03:00] PROBLEM - db121 Current Load on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host [17:03:02] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.50, 8.07, 7.91 [17:03:02] PROBLEM - es121 Disk Space on es121 is CRITICAL: connect to address 2a10:6740::6:303 port 5666: No route to hostconnect to host 2a10:6740::6:303 port 5666: No route to host [17:03:03] PROBLEM - graylog121 ferm_active on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [17:03:04] PROBLEM - Host mail121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:307) [17:03:04] PROBLEM - Host gluster121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:304) [17:03:05] PROBLEM - Host phab121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:311) [17:03:08] PROBLEM - cp21 Stunnel Http for mw121 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:03:11] PROBLEM - cp30 Stunnel Http for mw11 on cp30 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.286 second response time [17:03:13] PROBLEM - cp30 Stunnel Http for mw122 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[17:03:16] PROBLEM - mw121 JobRunner Service on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [17:03:16] PROBLEM - mw122 Disk Space on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [17:03:17] PROBLEM - Host cloud12 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:300) [17:03:20] PROBLEM - graylog121 HTTPS on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 and port 443: No route to hostHTTP CRITICAL - Unable to open TCP socket [17:03:20] PROBLEM - graylog121 Current Load on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [17:03:21] PROBLEM - mw121 Current Load on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [17:03:22] PROBLEM - Host es121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:303) [17:03:22] PROBLEM - cp30 Varnish Backends on cp30 is CRITICAL: 2 backends are down. mw121 mw122 [17:03:23] PROBLEM - cp21 Stunnel Http for mw11 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:03:26] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[17:03:27] PROBLEM - mw122 PowerDNS Recursor on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [17:03:27] PROBLEM - mw122 ferm_active on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [17:03:27] PROBLEM - ping6 on jobchron121 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:306) [17:03:29] PROBLEM - mw121 PowerDNS Recursor on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [17:03:33] PROBLEM - cp31 Stunnel Http for mw121 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:03:34] PROBLEM - mem121 APT on mem121 is CRITICAL: connect to address 2a10:6740::6:308 port 5666: No route to hostconnect to host 2a10:6740::6:308 port 5666: No route to host [17:03:35] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[17:03:35] PROBLEM - graylog121 NTP time on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [17:03:35] PROBLEM - graylog121 conntrack_table_size on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [17:03:35] PROBLEM - graylog121 Disk Space on graylog121 is CRITICAL: connect to address 2a10:6740::6:305 port 5666: No route to hostconnect to host 2a10:6740::6:305 port 5666: No route to host [17:03:36] PROBLEM - db121 MariaDB on db121 is CRITICAL: Can't connect to MySQL server on 'db121.miraheze.org' (115) [17:03:39] PROBLEM - db121 Disk Space on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host [17:03:39] PROBLEM - Host graylog121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:305) [17:03:40] PROBLEM - Host mem121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:308) [17:03:41] PROBLEM - jobchron121 JobChron Service on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [17:03:42] PROBLEM - mw121 SSH on mw121 is CRITICAL: connect to address 2a10:6740::6:309 and port 22: No route to host [17:03:45] PROBLEM - jobchron121 conntrack_table_size on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [17:03:45] PROBLEM - mw121 ferm_active on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [17:03:51] PROBLEM - jobchron121 ferm_active on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host 
[17:03:51] PROBLEM - db121 ferm_active on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host PROBLEM - ping6 on db121 is CRITICAL: CRITICAL - Destination Unreachable (2a10:6740::6:302) [17:03:51] PROBLEM - mw121 HTTPS on mw121 is CRITICAL: connect to address 2a10:6740::6:309 and port 443: No route to hostHTTP CRITICAL - Unable to open TCP socket [17:03:51] PROBLEM - mw121 Puppet on mw121 is CRITICAL: connect to address 2a10:6740::6:309 port 5666: No route to hostconnect to host 2a10:6740::6:309 port 5666: No route to host [17:03:56] PROBLEM - cp20 Stunnel Http for mw122 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:03:57] PROBLEM - jobchron121 PowerDNS Recursor on jobchron121 is CRITICAL: connect to address 2a10:6740::6:306 port 5666: No route to hostconnect to host 2a10:6740::6:306 port 5666: No route to host [17:03:57] PROBLEM - mw122 conntrack_table_size on mw122 is CRITICAL: connect to address 2a10:6740::6:310 port 5666: No route to hostconnect to host 2a10:6740::6:310 port 5666: No route to host [17:03:59] PROBLEM - Host mw121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:309) [17:04:00] PROBLEM - cp31 Varnish Backends on cp31 is CRITICAL: 3 backends are down. mw11 mw121 mw122 [17:04:00] PROBLEM - cp20 Varnish Backends on cp20 is CRITICAL: 2 backends are down. mw121 mw122 [17:04:00] PROBLEM - Host mw122 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:310) [17:04:03] PROBLEM - Host jobchron121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:306) [17:04:06] PROBLEM - db121 conntrack_table_size on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host [17:04:13] PROBLEM - cp20 Stunnel Http for mw121 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[17:04:16] PROBLEM - Host db121 is DOWN: CRITICAL - Destination Unreachable (2a10:6740::6:302) [17:04:16] PROBLEM - db121 APT on db121 is CRITICAL: connect to address 2a10:6740::6:302 port 5666: No route to hostconnect to host 2a10:6740::6:302 port 5666: No route to host [17:04:24] PROBLEM - cp21 Varnish Backends on cp21 is CRITICAL: 2 backends are down. mw121 mw122 [17:04:33] PROBLEM - cp31 Stunnel Http for mw122 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:04:42] PROBLEM - cp30 Stunnel Http for mw121 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:04:52] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 1.854 second response time [17:05:10] RECOVERY - cp30 Stunnel Http for mw11 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.800 second response time [17:05:23] RECOVERY - cp21 Stunnel Http for mw11 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.402 second response time [17:05:26] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.989 second response time [17:05:34] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 1.178 second response time [17:06:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 24.15, 22.11, 22.40 [17:07:23] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 7.10, 6.00, 5.63 [17:07:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [17:07:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [17:07:39] [02puppet] 07Reception123 closed pull request 03#2260: mwscript: add `cargoRecreateData.php` to longscripts - 13https://git.io/JSy9F [17:07:41] [02miraheze/puppet] 07Reception123 pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/J9YyH [17:07:42] [02miraheze/puppet] 07Universal-Omega 
030602585 - mwscript: add `cargoRecreateData.php` to longscripts (#2260) [17:09:21] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 4.74, 5.53, 5.51 [17:09:25] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.56, 3.93, 3.38 [17:11:20] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 7.63, 6.41, 5.83 [17:11:21] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.48, 3.37, 3.24 [17:11:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 2001:41d0:801:2000::4c25/cpweb [17:11:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 5.21, 6.78, 7.81 [17:12:27] PROBLEM - cp31 Stunnel Http for mon111 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:12:36] PROBLEM - cp21 Stunnel Http for mw112 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:12:45] PROBLEM - cp31 Stunnel Http for mw102 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:12:52] PROBLEM - cp31 Stunnel Http for mw111 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:12:53] PROBLEM - cp30 Stunnel Http for mw101 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:12:55] PROBLEM - cp21 Stunnel Http for mw101 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:12:59] PROBLEM - cp20 Stunnel Http for mon111 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:13:08] PROBLEM - cp21 Stunnel Http for mw102 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:13:11] PROBLEM - cp21 Stunnel Http for mw111 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[17:13:16] PROBLEM - cp30 Stunnel Http for mon111 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:13:16] PROBLEM - cp30 Stunnel Http for mw102 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:13:20] PROBLEM - cp30 Stunnel Http for mw112 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:13:26] PROBLEM - cp20 Stunnel Http for mw102 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:13:32] PROBLEM - cp31 Stunnel Http for mw101 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:13:35] PROBLEM - cp31 Stunnel Http for mw112 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:13:38] PROBLEM - cp20 Stunnel Http for mw101 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:13:38] PROBLEM - cp20 Stunnel Http for mw112 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:13:38] PROBLEM - cp21 Stunnel Http for mon111 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:13:53] PROBLEM - cp30 Stunnel Http for mw111 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:14:15] PROBLEM - cp20 Stunnel Http for mw111 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[17:15:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.37, 7.39, 7.96 [17:15:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2607:5300:201:3100::929a/cpweb [17:17:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [17:17:43] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.49, 6.82, 7.93 [17:18:34] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:18:41] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:18:46] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:18:52] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:19:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.83, 8.06, 8.06 [17:19:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.46, 7.11, 7.52 [17:19:41] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:20:23] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:21:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 149.56.140.43/cpweb, 2607:5300:201:3100::5ebc/cpweb [17:21:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.48, 6.97, 7.43 [17:21:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.69, 7.67, 8.00 [17:22:11] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[17:22:15] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:22:27] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 5.782 second response time [17:23:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 4.36, 6.60, 7.54 [17:23:41] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.64, 6.88, 7.68 [17:23:53] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 4.372 second response time [17:24:09] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.327 second response time [17:24:14] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.380 second response time [17:24:50] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 9.894 second response time [17:25:05] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 9.075 second response time [17:25:13] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 9.698 second response time [17:25:41] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.18, 7.71, 7.90 [17:26:45] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 2.283 second response time [17:27:14] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.20, 7.26, 7.58 [17:27:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [17:27:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.91, 7.69, 7.87 [17:28:36] PROBLEM - graylog2 Current Load on graylog2 is WARNING: WARNING - load average: 3.68, 2.92, 1.91 [17:29:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 4.92, 6.57, 7.30 [17:29:35] 
RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 4.76, 6.06, 6.80 [17:30:37] RECOVERY - graylog2 Current Load on graylog2 is OK: OK - load average: 1.71, 2.48, 1.87 [17:30:38] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [17:31:00] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [17:32:35] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.366 second response time [17:33:04] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 4.664 second response time [17:33:15] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.49, 5.79, 6.78 [17:33:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [17:37:16] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.32, 7.20, 7.11 [17:39:25] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.02, 3.29, 3.05 [17:40:30] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.32, 7.78, 7.18 [17:41:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.17, 6.44, 6.87 [17:41:22] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 3.03, 3.17, 3.04 [17:42:30] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.91, 7.25, 7.05 [17:43:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.25, 7.43, 7.97 [17:45:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 5.45, 6.72, 7.63 [17:45:17] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.94, 3.47, 3.18 [17:46:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.68, 7.13, 7.94 [17:47:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.85, 7.84, 7.97 [17:47:40] RECOVERY - mw10 Current Load on 
mw10 is OK: OK - load average: 4.57, 6.04, 6.67 [17:48:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 10.52, 8.41, 8.32 [17:49:09] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.87, 3.23, 3.16 [17:49:31] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 2607:5300:201:3100::5ebc/cpweb [17:50:24] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.39, 6.16, 6.58 [17:51:02] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.19, 7.39, 7.61 [17:51:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [17:51:41] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 9.06, 7.47, 7.08 [17:53:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.76, 7.54, 7.14 [17:53:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.11, 7.60, 7.20 [17:54:59] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.99, 3.61, 3.33 [17:55:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.67, 7.09, 7.02 [17:55:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 2001:41d0:801:2000::1b80/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::5ebc/cpweb [17:55:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.65, 7.49, 7.18 [17:55:55] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [17:57:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.30, 7.33, 7.87 [17:58:17] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.53, 7.19, 6.90 [17:58:51] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.98, 3.27, 
3.25 [17:59:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.72, 7.75, 7.83 [17:59:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.85, 7.89, 8.00 [17:59:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.67, 8.17, 7.44 [17:59:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [17:59:41] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.39, 7.83, 7.47 [17:59:47] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [18:01:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.78, 7.77, 7.94 [18:01:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.40, 7.70, 7.38 [18:02:13] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.68, 6.69, 6.79 [18:05:02] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 11.00, 8.55, 8.03 [18:05:09] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.74, 7.95, 7.92 [18:05:14] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.94, 7.44, 7.25 [18:06:40] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.53, 3.96, 3.51 [18:07:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb [18:07:36] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 6 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [18:07:38] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:07:39] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[18:08:07] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:08:36] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.90, 3.81, 3.50 [18:09:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.43, 7.76, 7.46 [18:09:35] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.983 second response time [18:09:36] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 1.011 second response time [18:09:43] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.92, 6.36, 6.79 [18:10:09] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 6.416 second response time [18:11:02] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 5.07, 7.77, 8.00 [18:11:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.64, 7.72, 7.92 [18:11:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [18:11:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [18:12:04] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.53, 7.22, 6.93 [18:13:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.08, 7.98, 7.99 [18:13:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.56, 7.50, 7.37 [18:13:42] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.89, 7.50, 7.14 [18:14:04] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 9.21, 8.07, 7.28 [18:14:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 3.32, 3.40, 3.40 [18:15:02] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.39, 7.51, 7.79 [18:15:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.82, 7.61, 7.43 
[18:17:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 4.74, 6.61, 7.44 [18:17:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.76, 7.43, 7.81 [18:17:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.45, 7.84, 7.53 [18:17:24] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:17:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 198.244.148.90/cpweb, 2607:5300:201:3100::5ebc/cpweb [18:17:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 149.56.141.75/cpweb [18:18:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.05, 6.75, 7.97 [18:19:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.97, 7.81, 7.55 [18:19:28] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 4.105 second response time [18:21:06] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.07, 7.59, 7.71 [18:21:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 10.85, 8.75, 7.92 [18:22:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.29, 23.08, 23.79 [18:23:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 4.45, 5.69, 6.75 [18:23:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.64, 6.69, 7.36 [18:23:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [18:25:08] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.80, 7.41, 7.53 [18:26:17] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.03, 3.66, 3.43 [18:26:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.83, 23.82, 23.82 [18:27:00] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: 
CRITICAL - Socket timeout after 10 seconds [18:27:02] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:27:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.61, 7.23, 7.47 [18:27:08] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:27:14] PROBLEM - cp30 Stunnel Http for mw11 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:27:31] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [18:27:40] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:28:12] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.83, 3.69, 3.47 [18:28:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 23.08, 23.33, 23.63 [18:28:54] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.158 second response time [18:29:01] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.471 second response time [18:29:06] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.175 second response time [18:29:12] RECOVERY - cp30 Stunnel Http for mw11 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.328 second response time [18:29:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [18:29:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [18:29:45] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 5.655 second response time [18:30:05] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.61, 7.89, 7.93 [18:30:11] RECOVERY - 
mon2 Current Load on mon2 is OK: OK - load average: 2.56, 3.37, 3.38 [18:31:06] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.57, 7.89, 7.68 [18:31:16] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 4.79, 5.10, 5.97 [18:32:18] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.41, 23.60, 23.61 [18:33:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.99, 7.66, 7.63 [18:33:16] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.78, 5.46, 5.98 [18:34:04] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 3.77, 4.01, 3.68 [18:35:41] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.92, 7.30, 7.88 [18:36:00] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.38, 3.39, 3.49 [18:37:16] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 4.92, 5.11, 5.71 [18:37:39] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.88, 6.31, 6.35 [18:37:56] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.01, 2.95, 3.31 [18:38:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.42, 23.68, 23.73 [18:38:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.83, 7.79, 7.36 [18:39:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 11.20, 8.75, 7.97 [18:39:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.17, 7.01, 7.78 [18:39:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.45, 7.09, 7.60 [18:41:29] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.81, 7.52, 6.79 [18:41:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.78, 6.87, 7.48 [18:42:00] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 5 datacenters are 
down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [18:42:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 24.80, 23.53, 23.60 [18:42:32] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:42:45] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:42:51] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:43:12] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:43:17] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:43:18] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.45, 5.79, 5.79 [18:43:19] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [18:43:30] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:43:33] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:43:44] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:44:17] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:44:26] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[18:45:17] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.94, 6.00, 5.88 [18:45:36] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 6.454 second response time [18:46:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 23.49, 23.26, 23.49 [18:46:23] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.389 second response time [18:46:38] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.010 second response time [18:46:54] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.082 second response time [18:47:00] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 7.466 second response time [18:47:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.29, 7.10, 7.40 [18:47:16] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 7.10, 6.32, 6.01 [18:47:21] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.050 second response time [18:47:22] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.334 second response time [18:47:34] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.403 second response time [18:47:51] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.22, 6.04, 6.80 [18:47:52] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 5.937 second response time [18:48:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 26.09, 24.37, 23.88 [18:48:21] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 5.141 second response time [18:49:12] PROBLEM - mw9 Current Load on mw9 is 
WARNING: WARNING - load average: 7.95, 7.82, 7.36 [18:49:41] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.42, 7.12, 7.18 [18:49:51] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:50:21] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:50:28] PROBLEM - cp21 Stunnel Http for mw13 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:50:47] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:51:12] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 10.66, 8.97, 7.85 [18:51:36] RECOVERY - cp30 Stunnel Http for mw102 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.325 second response time [18:51:46] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.335 second response time [18:51:56] RECOVERY - cp31 Stunnel Http for mw111 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.356 second response time [18:51:58] RECOVERY - cp31 Stunnel Http for mon111 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 31581 bytes in 0.335 second response time [18:51:58] RECOVERY - cp31 Stunnel Http for mw101 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14548 bytes in 0.337 second response time [18:52:10] RECOVERY - cp21 Stunnel Http for mw101 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.018 second response time [18:52:13] RECOVERY - cp30 Stunnel Http for mw101 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.329 second response time [18:52:20] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.326 second response time [18:52:21] RECOVERY - cp21 Stunnel Http for mon111 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 31543 bytes in 0.032 second response time [18:52:23] RECOVERY - 
cp20 Stunnel Http for mon111 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 31543 bytes in 0.033 second response time [18:52:23] RECOVERY - cp20 Stunnel Http for mw121 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.058 second response time [18:52:27] RECOVERY - cp21 Stunnel Http for mw13 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14571 bytes in 1.038 second response time [18:52:29] RECOVERY - cp21 Stunnel Http for mw102 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.019 second response time [18:52:41] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.016 second response time [18:52:42] RECOVERY - cp20 Stunnel Http for mw101 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.018 second response time [18:52:43] RECOVERY - cp20 Stunnel Http for mw112 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.050 second response time [18:52:54] RECOVERY - cp21 Stunnel Http for mw121 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.020 second response time [18:52:55] RECOVERY - cp30 Stunnel Http for mw112 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.322 second response time [18:52:58] RECOVERY - cp31 Stunnel Http for mw112 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.313 second response time [18:53:04] RECOVERY - cp21 Stunnel Http for mw111 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.026 second response time [18:53:07] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [18:53:09] RECOVERY - cp31 Stunnel Http for mw102 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.320 second response time [18:53:11] RECOVERY - cp21 Stunnel Http for mw112 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.018 second response time [18:53:20] RECOVERY - cp30 Varnish Backends on cp30 is OK: All 18 backends are healthy [18:53:24] RECOVERY - cp20 Stunnel Http for mw102 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes 
in 0.018 second response time [18:53:26] RECOVERY - cp20 Varnish Backends on cp20 is OK: All 18 backends are healthy [18:53:29] RECOVERY - cp30 Stunnel Http for mw122 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14548 bytes in 0.320 second response time [18:53:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [18:53:32] RECOVERY - cp21 Stunnel Http for mw122 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.016 second response time [18:53:33] RECOVERY - cp20 Stunnel Http for mw111 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.025 second response time [18:53:34] RECOVERY - cp21 Varnish Backends on cp21 is OK: All 18 backends are healthy [18:53:35] RECOVERY - cp31 Stunnel Http for mw122 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.310 second response time [18:53:35] RECOVERY - cp30 Stunnel Http for mon111 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 31581 bytes in 0.338 second response time [18:53:37] RECOVERY - cp30 Stunnel Http for mw111 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.326 second response time [18:53:41] RECOVERY - cp20 Stunnel Http for mw122 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.019 second response time [18:53:46] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.97, 7.00, 6.95 [18:53:55] RECOVERY - cp30 Stunnel Http for mw121 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.326 second response time [18:53:58] RECOVERY - cp31 Stunnel Http for mw121 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.325 second response time [18:53:59] RECOVERY - cp31 Varnish Backends on cp31 is OK: All 18 backends are healthy [18:55:04] RECOVERY - ping6 on mail121 is OK: PING OK - Packet loss = 0%, RTA = 0.82 ms [18:55:07] RECOVERY - jobchron121 Current Load on jobchron121 is OK: OK - load average: 0.15, 0.20, 0.09 [18:55:07] PROBLEM - mail121 Puppet on mail121 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked 
by Puppet. It might be a dependency cycle. [18:55:11] PROBLEM - gluster111 Puppet on gluster111 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [18:55:12] PROBLEM - test101 Check Gluster Clients on test101 is CRITICAL: PROCS CRITICAL: 0 processes with args '/usr/sbin/glusterfs' [18:55:12] PROBLEM - mon111 Current Load on mon111 is CRITICAL: CRITICAL - load average: 6.22, 5.38, 2.28 [18:55:13] PROBLEM - db111 Puppet on db111 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [18:55:13] PROBLEM - mw111 Puppet on mw111 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [18:55:13] PROBLEM - cloud10 Puppet on cloud10 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [18:55:14] PROBLEM - mw101 Check Gluster Clients on mw101 is CRITICAL: PROCS CRITICAL: 0 processes with args '/usr/sbin/glusterfs' [18:55:15] PROBLEM - mwtask111 Check Gluster Clients on mwtask111 is CRITICAL: PROCS CRITICAL: 0 processes with args '/usr/sbin/glusterfs' [18:55:21] PROBLEM - mw111 Check Gluster Clients on mw111 is CRITICAL: PROCS CRITICAL: 0 processes with args '/usr/sbin/glusterfs' [18:55:31] PROBLEM - bast101 Puppet on bast101 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [18:55:40] PROBLEM - mw101 Puppet on mw101 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[18:55:47] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.02, 7.44, 7.12 [18:56:13] RECOVERY - mon111 Puppet on mon111 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [18:56:15] RECOVERY - puppet111 Puppet on puppet111 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [18:56:19] RECOVERY - mail121 IMAP on mail121 is OK: IMAP OK - 0.016 second response time on 2a10:6740::6:307 port 143 [* OK [CAPABILITY IMAP4rev1 SASL-IR LOGIN-REFERRALS ID ENABLE IDLE LITERAL+ STARTTLS LOGINDISABLED] Dovecot (Debian) ready.] [18:57:07] PROBLEM - mon111 Current Load on mon111 is WARNING: WARNING - load average: 1.73, 3.98, 2.13 [18:57:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 4 datacenters are down: 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [18:57:46] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.48, 7.77, 7.31 [18:58:16] RECOVERY - graylog121 Puppet on graylog121 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [18:58:18] RECOVERY - db101 Puppet on db101 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [18:59:00] RECOVERY - gluster111 Puppet on gluster111 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [18:59:02] RECOVERY - mon111 Current Load on mon111 is OK: OK - load average: 1.16, 2.96, 1.97 [18:59:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.39, 7.84, 7.84 [18:59:50] RECOVERY - gluster121 Puppet on gluster121 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [19:01:27] RECOVERY - mw101 Puppet on mw101 is OK: OK: Puppet is currently enabled, last run 35 seconds ago with 0 failures [19:01:38] RECOVERY - graylog121 HTTPS on graylog121 is OK: HTTP OK: HTTP/1.1 200 OK - 1418 bytes in 0.249 second response time [19:01:42] PROBLEM - 
mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.55, 7.95, 7.47 [19:01:53] replica running on db101 [19:01:56] PROBLEM - graylog121 Current Load on graylog121 is CRITICAL: CRITICAL - load average: 15.34, 9.45, 3.98 [19:02:59] RECOVERY - mem121 Puppet on mem121 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [19:03:21] replica running on db111 [19:03:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [19:03:41] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 4.32, 6.55, 7.01 [19:04:01] RECOVERY - mw122 Puppet on mw122 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures [19:05:03] RECOVERY - mw111 Puppet on mw111 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [19:05:06] RECOVERY - bast101 Puppet on bast101 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [19:05:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.81, 7.26, 7.87 [19:08:37] RECOVERY - phab121 phd on phab121 is OK: PROCS OK: 2 processes with args 'phd' [19:09:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.61, 7.48, 7.91 [19:09:13] RECOVERY - db111 Puppet on db111 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [19:09:29] RECOVERY - cloud12 Puppet on cloud12 is OK: OK: Puppet is currently enabled, last run 40 seconds ago with 0 failures [19:09:38] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.36, 6.14, 6.74 [19:10:10] RECOVERY - bast121 Puppet on bast121 is OK: OK: Puppet is currently enabled, last run 11 seconds ago with 0 failures [19:10:53] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.23, 3.91, 3.49 [19:10:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.03, 6.85, 7.76 [19:11:00] RECOVERY - cloud10 Puppet on cloud10 is OK: OK: Puppet is currently 
enabled, last run 1 minute ago with 0 failures [19:11:58] RECOVERY - mw102 Puppet on mw102 is OK: OK: Puppet is currently enabled, last run 52 seconds ago with 0 failures [19:12:14] RECOVERY - ldap111 Puppet on ldap111 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [19:12:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.71, 22.40, 23.68 [19:12:20] RECOVERY - mw121 Puppet on mw121 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [19:12:41] RECOVERY - mail121 Puppet on mail121 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [19:12:45] RECOVERY - test101 Check Gluster Clients on test101 is OK: PROCS OK: 1 process with args '/usr/sbin/glusterfs' [19:12:49] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.02, 3.53, 3.39 [19:12:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.94, 7.59, 7.91 [19:13:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.31, 7.52, 7.18 [19:13:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.13, 7.41, 7.93 [19:14:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 24.30, 23.26, 23.86 [19:14:36] RECOVERY - gluster101 Puppet on gluster101 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [19:14:44] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.46, 3.75, 3.48 [19:14:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.31, 7.57, 7.87 [19:15:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.89, 7.50, 7.34 [19:15:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.98, 7.63, 7.26 [19:16:37] RECOVERY - jobchron121 Puppet on jobchron121 is OK: OK: Puppet is currently enabled, last run 20 seconds ago with 0 failures [19:16:40] PROBLEM - mon2 Current 
Load on mon2 is WARNING: WARNING - load average: 2.75, 3.46, 3.41 [19:17:49] RECOVERY - es101 Puppet on es101 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [19:17:52] RECOVERY - test101 Puppet on test101 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [19:18:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 11.23, 8.38, 8.03 [19:19:02] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.69, 7.59, 7.56 [19:19:23] RECOVERY - mem101 Puppet on mem101 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [19:19:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 2001:41d0:801:2000::1b80/cpweb [19:19:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.01, 7.79, 7.96 [19:19:46] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb [19:19:58] RECOVERY - db121 Puppet on db121 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [19:20:39] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:20:43] RECOVERY - mw112 Puppet on mw112 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [19:21:01] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:21:08] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:21:13] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[19:21:17] RECOVERY - cloud11 Puppet on cloud11 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [19:21:23] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:21:36] RECOVERY - mwtask111 Puppet on mwtask111 is OK: OK: Puppet is currently enabled, last run 3 seconds ago with 0 failures [19:21:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.73, 7.73, 7.91 [19:22:09] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:22:44] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:23:09] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:23:11] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:23:16] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.27, 7.62, 7.64 [19:23:28] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 6.627 second response time [19:24:12] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 6.860 second response time [19:24:39] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.432 second response time [19:25:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.12, 7.32, 7.55 [19:25:07] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 3.061 second response time [19:25:15] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 3.911 second response time [19:25:16] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.75, 8.40, 7.92 [19:25:19] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: 
HTTP/1.1 200 OK - 14556 bytes in 3.380 second response time [19:25:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 9.09, 7.55, 7.27 [19:25:52] PROBLEM - graylog121 Current Load on graylog121 is WARNING: WARNING - load average: 0.13, 0.39, 1.85 [19:27:02] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 9.807 second response time [19:27:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 11.66, 8.79, 7.81 [19:27:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.71, 7.87, 7.79 [19:27:17] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.109 second response time [19:27:20] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.909 second response time [19:27:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [19:27:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.49, 7.44, 7.29 [19:27:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 10.58, 8.01, 7.80 [19:29:25] !log cloud10/11/12: swapoff -a [19:29:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [19:29:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.92, 8.17, 7.58 [19:29:37] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [19:29:51] RECOVERY - graylog121 Current Load on graylog121 is OK: OK - load average: 0.11, 0.40, 1.54 [19:31:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.34, 7.33, 7.33 [19:32:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.19, 3.88, 3.70 [19:32:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.48, 7.19, 7.91 [19:33:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 4.42, 6.91, 7.37 [19:33:15] 
PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.84, 7.67, 7.65 [19:33:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 10.77, 8.15, 7.60 [19:34:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.30, 3.71, 3.66 [19:35:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.11, 7.65, 7.50 [19:35:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.39, 7.38, 7.54 [19:35:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 2607:5300:201:3100::929a/cpweb [19:36:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 23.31, 23.62, 23.91 [19:36:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.31, 4.44, 3.94 [19:37:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 4.35, 6.59, 7.15 [19:37:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.87, 8.49, 7.94 [19:37:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [19:37:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.55, 8.00, 7.70 [19:37:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.86, 7.43, 7.89 [19:39:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.42, 7.00, 7.21 [19:39:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.86, 7.38, 7.59 [19:40:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.31, 3.97, 3.87 [19:40:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.40, 7.71, 7.81 [19:41:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.74, 6.49, 7.00 [19:41:16] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 4.99, 5.48, 6.00 [19:41:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - 
load average: 9.00, 8.28, 7.82 [19:41:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 11.55, 8.32, 8.03 [19:42:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.73, 22.55, 23.13 [19:42:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.53, 7.18, 7.60 [19:43:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.82, 7.66, 7.36 [19:43:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.40, 7.64, 7.63 [19:43:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 5.90, 7.42, 7.57 [19:44:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.79, 22.13, 22.92 [19:45:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.43, 7.29, 7.25 [19:45:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.39, 6.84, 7.34 [19:45:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.19, 7.21, 7.70 [19:47:06] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 10.62, 8.13, 7.53 [19:47:16] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.63, 5.24, 5.66 [19:48:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 24.23, 22.69, 22.92 [19:48:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.89, 8.21, 7.82 [19:49:02] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.92, 7.53, 7.15 [19:49:16] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.57, 5.34, 5.64 [19:49:27] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:49:31] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[19:49:33] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:49:33] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:50:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.96, 22.76, 22.95 [19:51:24] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.343 second response time [19:51:29] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.933 second response time [19:51:31] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.073 second response time [19:51:33] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.631 second response time [19:51:44] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [19:52:16] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 149.56.141.75/cpweb [19:52:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.07, 3.44, 3.60 [19:52:55] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [19:54:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.49, 3.35, 3.54 [19:55:33] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:55:48] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:55:54] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[19:55:54] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:56:00] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:56:44] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [19:57:40] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 4.63, 5.45, 6.52 [19:58:05] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [19:58:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.62, 2.94, 3.33 [19:59:16] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 4.78, 4.76, 5.10 [19:59:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 10.52, 7.13, 7.11 [20:00:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 24.37, 23.58, 23.18 [20:01:14] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 11.23, 8.12, 7.45 [20:01:34] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.68, 7.06, 7.10 [20:01:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.24, 6.83, 6.82 [20:01:59] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb [20:02:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [20:03:51] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 9.229 second response time [20:04:08] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 4.685 second response time [20:04:13] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.092 second response time [20:04:19] PROBLEM - cloud4 Current Load 
on cloud4 is WARNING: WARNING - load average: 22.03, 23.06, 23.07 [20:04:19] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.539 second response time [20:04:27] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 0.500 second response time [20:05:52] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [20:07:16] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.45, 5.42, 5.23 [20:07:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.99, 7.85, 7.41 [20:08:11] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [20:09:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.58, 7.33, 7.96 [20:09:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.88, 7.82, 7.60 [20:09:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.37, 7.32, 7.25 [20:10:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.06, 3.51, 3.32 [20:11:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 11.62, 8.58, 8.31 [20:11:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.64, 8.03, 7.69 [20:11:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.93, 7.62, 7.35 [20:11:44] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [20:12:20] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 32.43, 24.24, 23.10 [20:12:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.52, 3.20, 3.23 [20:13:16] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.15, 6.03, 5.57 [20:13:39] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 
2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [20:13:52] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 4 datacenters are down: 198.244.148.90/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [20:14:02] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:14:14] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:15:16] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.20, 5.74, 5.52 [20:15:18] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:15:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [20:16:06] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 8.286 second response time [20:16:19] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 8.763 second response time [20:17:16] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.330 second response time [20:17:18] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 7.07, 6.17, 5.70 [20:17:32] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:17:43] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:19:58] PROBLEM - cp20 Stunnel Http for mw8 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:21:25] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[20:21:32] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:21:50] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:21:50] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:21:59] RECOVERY - cp20 Stunnel Http for mw8 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14554 bytes in 5.807 second response time [20:22:21] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:22:25] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:22:37] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:23:03] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:23:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 4.12, 6.64, 7.52 [20:23:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 3.96, 6.49, 7.46 [20:23:37] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 8.303 second response time [20:23:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 4.86, 7.17, 7.85 [20:23:54] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 7.218 second response time [20:27:15] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 4.15, 5.37, 6.80 [20:27:54] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 9.049 second response time [20:28:28] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23673 bytes in 9.816 second response time [20:29:26] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP 
OK: HTTP/1.1 200 OK - 23675 bytes in 6.238 second response time [20:29:34] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 2.98, 5.14, 6.64 [20:29:49] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 9.441 second response time [20:29:50] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.007 second response time [20:30:44] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 6.805 second response time [20:30:46] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.838 second response time [20:31:03] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.491 second response time [20:31:29] Omg I can’t login to my Phabricator acc [20:31:41] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 6.13, 5.64, 6.76 [20:32:52] Nevermind, wrong code [20:37:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [20:37:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [20:37:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 9.32, 7.47, 7.12 [20:39:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.37, 7.33, 7.11 [20:39:47] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.47, 3.56, 3.20 [20:41:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.22, 6.74, 6.64 [20:41:41] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.38, 7.57, 7.21 [20:41:42] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.14, 3.43, 3.20 [20:43:35] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.23, 6.11, 6.42 [20:43:38] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.76, 3.16, 3.13 [20:43:40] PROBLEM - mw10 Current 
Load on mw10 is WARNING: WARNING - load average: 6.92, 7.20, 7.12 [20:46:11] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 149.56.140.43/cpweb, 2607:5300:201:3100::5ebc/cpweb [20:46:23] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 149.56.141.75/cpweb [20:47:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 9.31, 7.96, 7.43 [20:48:05] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [20:49:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.60, 7.23, 7.22 [20:51:27] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:51:31] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:51:35] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:51:37] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[20:51:52] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb [20:53:25] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.544 second response time [20:53:29] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.488 second response time [20:53:36] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 1.132 second response time [20:53:36] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.233 second response time [20:53:46] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [20:54:08] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [20:55:33] PROBLEM - cloud5 Current Load on cloud5 is WARNING: WARNING - load average: 21.39, 19.08, 17.44 [20:55:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 10.12, 7.99, 7.13 [20:55:38] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.56, 7.83, 6.74 [20:55:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.96, 7.88, 7.43 [20:56:53] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[20:57:33] RECOVERY - cloud5 Current Load on cloud5 is OK: OK - load average: 17.63, 18.71, 17.53 [20:57:34] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.17, 7.49, 6.74 [20:57:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.18, 7.41, 7.04 [20:57:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.75, 7.88, 7.48 [20:58:52] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.346 second response time [20:59:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.88, 8.08, 7.33 [21:00:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.28, 7.00, 7.93 [21:01:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.36, 6.70, 7.91 [21:01:23] dmehus: 21:45 next Friday will be the stop for renames [21:03:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.34, 7.14, 7.91 [21:03:23] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.21, 7.71, 7.04 [21:03:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.63, 7.67, 7.35 [21:05:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 5.29, 6.34, 7.52 [21:05:20] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.06, 7.26, 6.97 [21:05:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.03, 7.60, 7.35 [21:06:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 17.98, 20.94, 23.73 [21:07:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 4.87, 6.47, 7.80 [21:07:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.60, 7.91, 7.52 [21:09:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.56, 7.60, 7.14 [21:10:45] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.32, 
3.57, 3.30 [21:11:01] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 6.12, 5.68, 6.77 [21:11:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.32, 7.11, 7.75 [21:12:03] RECOVERY - mw101 Check Gluster Clients on mw101 is OK: PROCS OK: 1 process with args '/usr/sbin/glusterfs' [21:12:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 24.02, 22.64, 23.56 [21:12:41] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.56, 3.80, 3.41 [21:14:37] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.75, 3.65, 3.39 [21:15:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.15, 7.20, 7.14 [21:15:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.38, 7.57, 7.84 [21:15:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 9.76, 8.37, 7.81 [21:16:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.19, 23.16, 23.64 [21:16:33] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.83, 3.98, 3.54 [21:17:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.08, 6.96, 7.06 [21:17:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 5 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [21:17:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [21:17:58] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[21:18:19] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 26.35, 24.32, 24.00 [21:18:23] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:18:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.70, 3.88, 3.56 [21:18:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.61, 7.71, 7.37 [21:19:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 10.58, 8.03, 7.42 [21:19:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.46, 8.17, 7.99 [21:19:34] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 5.79, 7.84, 7.78 [21:19:54] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.332 second response time [21:20:11] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:20:22] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.560 second response time [21:20:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.59, 3.36, 3.40 [21:20:49] [mw-config] AgentIsai opened pull request #4351: Update sitenotice for downtime notice - https://git.io/J9OWH [21:21:23] ssh-agent: want me to merge? [21:21:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 9.50, 8.51, 8.04 [21:21:36] CosmicAlpha: Please, thank you! [21:21:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 9.21, 7.68, 7.28 [21:21:53] PROBLEM - cp21 Varnish Backends on cp21 is CRITICAL: 2 backends are down. mw9 mw13 [21:21:54] miraheze/mw-config - AgentIsai the build passed.
[21:21:58] [mw-config] Universal-Omega closed pull request #4351: Update sitenotice for downtime notice - https://git.io/J9OWH [21:22:00] [miraheze/mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://git.io/J9OW7 [21:22:01] [miraheze/mw-config] AgentIsai 5398a45 - Update sitenotice for downtime notice (#4351) [21:22:08] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:22:09] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:22:18] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23675 bytes in 8.134 second response time [21:23:06] miraheze/mw-config - Universal-Omega the build passed. [21:23:10] !log [universalomega@mw11] starting deploy of {'pull': 'config', 'config': True} to all [21:23:29] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:23:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.71, 7.55, 7.28 [21:23:51] RECOVERY - cp21 Varnish Backends on cp21 is OK: All 18 backends are healthy [21:24:06] !log [universalomega@mw11] finished deploy of {'pull': 'config', 'config': True} to all - SUCCESS in 56s [21:24:07] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.861 second response time [21:24:08] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:24:11] !log [@test3] starting deploy of {'config': True} to skip [21:24:12] !log [@test3] finished deploy of {'config': True} to skip - SUCCESS in 0s [21:24:13] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.610 second response time [21:24:14] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[21:24:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.26, 23.56, 23.89 [21:24:42] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:24:47] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:25:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:25:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [21:25:32] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 3.715 second response time [21:25:36] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:26:09] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.327 second response time [21:26:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.78, 3.49, 3.40 [21:26:41] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 3.727 second response time [21:27:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.89, 7.98, 7.97 [21:28:18] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 28.90, 25.49, 24.54 [21:29:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [21:29:41] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 10.76, 8.80, 7.86 [21:30:44] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[21:30:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.17, 7.41, 7.69 [21:31:01] PROBLEM - mw13 MediaWiki Rendering on mw13 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:31:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.04, 7.77, 7.92 [21:31:05] PROBLEM - cp21 Stunnel Http for mw13 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:31:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 4.61, 6.83, 7.72 [21:31:13] [miraheze/mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://git.io/J9Ol2 [21:31:15] [miraheze/mw-config] Universal-Omega bd6e5dd - 2022! [21:31:33] !log [universalomega@mw11] starting deploy of {'pull': 'config', 'config': True} to all [21:31:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 7.81, 8.10, 8.02 [21:31:50] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:32:08] !log [universalomega@mw11] finished deploy of {'pull': 'config', 'config': True} to all - SUCCESS in 35s [21:32:20] miraheze/mw-config - Universal-Omega the build passed. [21:32:28] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:32:47] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:32:57] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:33:13] RECOVERY - cp21 Stunnel Http for mw13 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 9.516 second response time [21:33:22] PROBLEM - mem121 Puppet on mem121 is CRITICAL: CRITICAL: Puppet has 18 failures. Last run 2 minutes ago with 18 failures.
Failed resources (up to 3 shown): File[/etc/apt/trusted.gpg.d/puppetlabs.gpg],File[/usr/local/bin/puppet-enabled],File[authority certificates],File[/etc/apt/apt.conf.d/50unattended-upgrades] [21:33:29] ssh-agent: It's 2022 now! Fixed that on sitenotice. [21:33:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 2607:5300:201:3100::5ebc/cpweb [21:33:42] oops lol [21:33:47] CosmicAlpha: thanks for catching that! [21:34:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 17.38, 21.73, 23.37 [21:34:23] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.371 second response time [21:34:36] ssh-agent: No problem! It happens lol, I keep almost doing that also. [21:34:43] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.009 second response time [21:34:47] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.507 second response time [21:35:07] RECOVERY - mw13 MediaWiki Rendering on mw13 is OK: HTTP OK: HTTP/1.1 200 OK - 23462 bytes in 0.455 second response time [21:35:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [21:37:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.25, 7.42, 7.93 [21:37:16] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 1.79, 4.22, 5.76 [21:38:55] [miraheze/mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://git.io/J9O8T [21:38:56] [miraheze/mw-config] Universal-Omega 9d8f8da - Fix closing tag; add that wiki creations will be paused [21:38:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.24, 7.69, 7.56 [21:39:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 10.83, 8.22, 7.77 [21:40:00] !log
[universalomega@mw11] starting deploy of {'pull': 'config', 'config': True} to all [21:40:01] miraheze/mw-config - Universal-Omega the build passed. [21:40:34] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.36, 4.00, 3.72 [21:40:40] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:40:46] !log [universalomega@mw11] finished deploy of {'pull': 'config', 'config': True} to all - SUCCESS in 45s [21:41:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.84, 7.83, 7.58 [21:41:33] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [21:41:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.66, 7.77, 7.97 [21:41:50] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:42:06] [mw-config] AgentIsai opened pull request #4352: We -> Miraheze - https://git.io/J9O8C [21:42:19] CosmicAlpha: can you merge that please ^ [21:42:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.25, 3.95, 3.75 [21:42:44] [mw-config] Universal-Omega closed pull request #4352: We -> Miraheze - https://git.io/J9O8C [21:42:45] [miraheze/mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://git.io/J9O88 [21:42:47] [miraheze/mw-config] AgentIsai 0e0a9ee - We -> Miraheze (#4352) [21:42:51] Thank you! [21:42:55] !log [universalomega@mw11] starting deploy of {'pull': 'config', 'config': True} to all [21:43:03] miraheze/mw-config - AgentIsai the build passed.
[21:43:04] ssh-agent: no problem [21:43:16] !log [universalomega@mw11] finished deploy of {'pull': 'config', 'config': True} to all - SUCCESS in 21s [21:43:16] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 1.45, 2.87, 4.70 [21:43:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:43:33] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [21:43:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 5.84, 7.32, 7.87 [21:43:39] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:43:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.05, 8.11, 8.09 [21:43:49] miraheze/mw-config - Universal-Omega the build passed. [21:45:41] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.41, 7.37, 7.81 [21:46:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.08, 3.96, 3.79 [21:47:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.62, 8.15, 8.04 [21:47:16] !log [@test101] starting deploy of {'config': True} to skip [21:47:17] !log [@test101] finished deploy of {'config': True} to skip - SUCCESS in 0s [21:47:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2607:5300:201:3100::5ebc/cpweb [21:47:32] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb [21:48:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.87, 3.86, 3.77 [21:48:33] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:49:16] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[21:49:30] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:49:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.19, 7.24, 7.61 [21:49:49] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:50:19] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:50:20] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:50:27] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:50:33] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.12, 3.94, 3.81 [21:51:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.92, 7.38, 7.62 [21:51:36] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 9.582 second response time [21:51:52] !log [@mwtask111] starting deploy of {'config': True} to scsvg [21:51:57] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23538 bytes in 9.116 second response time [21:52:01] !log [@mwtask111] finished deploy of {'config': True} to scsvg - SUCCESS in 8s [21:52:19] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 2.648 second response time [21:52:25] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 3.381 second response time [21:52:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.03, 3.57, 3.69 [21:52:37] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:53:19] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 1.129 second response time [21:53:28] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:53:40] PROBLEM - 
mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 9.38, 8.16, 7.91 [21:54:19] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 17.85, 19.33, 20.36 [21:54:39] !log [@test3] starting deploy of {'config': True} to skip [21:54:40] !log [@test3] finished deploy of {'config': True} to skip - SUCCESS in 0s [21:55:31] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:55:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.25, 7.67, 7.68 [21:56:53] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:58:24] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:58:26] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:59:42] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 5.74, 7.34, 7.68 [22:00:22] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.044 second response time [22:00:25] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.341 second response time [22:00:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.90, 3.05, 3.39 [22:01:22] RECOVERY - mem121 Puppet on mem121 is OK: OK: Puppet is currently enabled, last run 28 seconds ago with 0 failures [22:01:32] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [22:03:31] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [22:03:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.16, 7.72, 7.86 [22:06:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.19, 3.42, 3.43 [22:09:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 9.59, 7.80, 7.56 [22:10:00] [dns] RhinosF1 opened pull request #241: betaheze:
reduce TTL to 10 seconds - https://git.io/J9OBm [22:10:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.70, 3.54, 3.43 [22:11:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.35, 7.43, 7.59 [22:11:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.25, 7.37, 7.42 [22:12:33] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.84, 3.33, 3.37 [22:13:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.93, 7.15, 7.46 [22:13:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 9.27, 8.23, 7.74 [22:13:59] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [22:14:02] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [22:14:07] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [22:14:12] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [22:14:27] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb [22:15:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.11, 7.22, 7.43 [22:15:37] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [22:15:52] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[22:15:54] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [22:15:57] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 3.289 second response time [22:16:04] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.111 second response time [22:16:06] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [22:16:09] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.607 second response time [22:16:14] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:18:26] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [22:19:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.92, 7.94, 7.74 [22:19:53] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 2001:41d0:801:2000::1b80/cpweb, 2607:5300:201:3100::929a/cpweb [22:20:12] [miraheze/mw-config] RhinosF1 pushed 1 commit to RhinosF1-patch-2 [+0/-0/±1] https://git.io/J9OBH [22:20:13] [miraheze/mw-config] RhinosF1 4aee6d3 - DC-Switch: Stop uploads [22:20:15] [mw-config] RhinosF1 created branch RhinosF1-patch-2 - https://git.io/vbvb3 [22:20:18] [mw-config] RhinosF1 opened pull request #4353: DC-Switch: Stop uploads - https://git.io/J9OB7 [22:20:29] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [22:20:34] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [22:21:11] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [22:21:27] miraheze/mw-config - RhinosF1 the build passed.
[22:21:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.48, 7.69, 7.76 [22:21:50] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [22:22:08] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 7.883 second response time [22:22:10] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 6.057 second response time [22:22:20] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [22:22:32] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 3.565 second response time [22:22:36] [mw-config] Universal-Omega reviewed pull request #4353 commit - https://git.io/J9OBx [22:22:38] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 23538 bytes in 6.618 second response time [22:22:48] [miraheze/mw-config] RhinosF1 pushed 1 commit to RhinosF1-patch-3 [+0/-0/±1] https://git.io/J9OBp [22:22:50] [miraheze/mw-config] RhinosF1 663d37d - DC-Switch: Disable renames & wiki creation [22:22:51] [mw-config] RhinosF1 created branch RhinosF1-patch-3 - https://git.io/vbvb3 [22:22:58] [mw-config] Universal-Omega reviewed pull request #4353 commit - https://git.io/J9OBx [22:23:24] CosmicAlpha: no, that was my plan to have it show permission denied [22:23:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.11, 7.68, 7.63 [22:23:46] [mw-config] RhinosF1 opened pull request #4354: DC-Switch: Disable renames & wiki creation - https://git.io/J9ORf [22:23:49] ok, but still needs the global [22:23:54] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 7.168 second response time [22:23:57] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes 
in 8.535 second response time [22:24:08] CosmicAlpha: will do [22:24:22] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 6.513 second response time [22:24:40] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 8.597 second response time [22:24:46] [miraheze/mw-config] RhinosF1 pushed 1 commit to RhinosF1-patch-2 [+0/-0/±1] https://git.io/J9ORT [22:24:48] [miraheze/mw-config] RhinosF1 d6ccc56 - Update LocalSettings.php [22:24:49] [mw-config] RhinosF1 synchronize pull request #4353: DC-Switch: Stop uploads - https://git.io/J9OB7 [22:24:51] miraheze/mw-config - RhinosF1 the build passed. [22:25:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.35, 7.38, 7.54 [22:25:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.85, 8.17, 7.92 [22:25:53] miraheze/mw-config - RhinosF1 the build passed. [22:26:17] [miraheze/mw-config] RhinosF1 pushed 1 commit to RhinosF1-patch-4 [+0/-0/±1] https://git.io/J9ORm [22:26:18] [miraheze/mw-config] RhinosF1 373e70a - DC-Switch: All wikis read only [22:26:20] [mw-config] RhinosF1 created branch RhinosF1-patch-4 - https://git.io/vbvb3 [22:26:21] [mw-config] RhinosF1 opened pull request #4355: DC-Switch: All wikis read only - https://git.io/J9ORO [22:26:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.56, 3.43, 3.35 [22:26:44] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.931 second response time [22:26:46] [miraheze/mw-config] github-actions[bot] pushed 1 commit to RhinosF1-patch-4 [+0/-0/±1] https://git.io/J9ORG [22:26:48] [miraheze/mw-config] github-actions 94eb38d - CI: lint code to MediaWiki standards [22:26:49] [mw-config] github-actions[bot] synchronize pull request #4355: DC-Switch: All wikis read only - https://git.io/J9ORO 
[22:27:27] miraheze/mw-config - RhinosF1 the build passed. [22:27:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [22:27:32] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 23541 bytes in 2.797 second response time [22:27:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.97, 7.76, 7.80 [22:28:01] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [22:28:17] [miraheze/mw-config] RhinosF1 pushed 2 commits to all-wikis-rw [+0/-0/±2] https://git.io/J9ORc [22:28:18] [miraheze/mw-config] RhinosF1 eb362fd - Revert "CI: lint code to MediaWiki standards" [22:28:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.40, 20.79, 19.93 [22:28:20] [miraheze/mw-config] RhinosF1 fb6db5b - Revert "DC-Switch: All wikis read only" [22:28:21] [mw-config] RhinosF1 created branch all-wikis-rw - https://git.io/vbvb3 [22:28:23] [mw-config] Universal-Omega reviewed pull request #4355 commit - https://git.io/J9ORW [22:28:50] CosmicAlpha: DEFAULT is all wikis? [22:29:01] no, just db11, as specified above [22:29:15] I think [22:30:13] CosmicAlpha: probably right [22:30:19] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 14.90, 19.04, 19.42 [22:30:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 3.40, 3.37, 3.35 [22:30:44] CosmicAlpha: database.php is easier as less likely to conflict. 
[22:31:22] [miraheze/mw-config] RhinosF1 pushed 1 commit to RhinosF1-patch-4 [+0/-0/±1] https://git.io/J9ORK [22:31:24] [miraheze/mw-config] RhinosF1 61f5d8d - Update Database.php [22:31:25] [mw-config] RhinosF1 synchronize pull request #4355: DC-Switch: All wikis read only - https://git.io/J9ORO [22:31:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 198.244.148.90/cpweb, 2607:5300:201:3100::929a/cpweb [22:31:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 9.59, 7.45, 7.33 [22:31:39] fair, but yeah the reason I changed your original readonly for SCSVG to what it is now is because of that only being the single DB. Just to be safe anyways. [22:32:36] miraheze/mw-config - RhinosF1 the build passed. [22:33:05] [miraheze/mw-config] RhinosF1 pushed 2 commits to all-wikis-rw [+0/-0/±1] https://git.io/J9ORH [22:33:06] [miraheze/mw-config] RhinosF1 bab6390 - Merge branch 'RhinosF1-patch-4' into all-wikis-rw [22:33:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.24, 7.18, 7.27 [22:33:44] [miraheze/mw-config] RhinosF1 pushed 1 commit to all-wikis-rw [+0/-0/±1] https://git.io/J9ORF [22:33:45] [miraheze/mw-config] RhinosF1 209c4b0 - Update Database.php [22:34:09] [mw-config] RhinosF1 opened pull request #4356: DC-Switch: Make SCSVG RW - https://git.io/J9ORp [22:35:20] miraheze/mw-config - RhinosF1 the build passed. 
[22:35:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.50, 7.70, 7.44 [22:35:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.63, 7.87, 7.73 [22:37:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [22:37:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.49, 7.28, 7.32 [22:37:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.50, 7.58, 7.65 [22:38:36] PROBLEM - mw111 Puppet on mw111 is CRITICAL: CRITICAL: Puppet has 573 failures. Last run 3 minutes ago with 573 failures. Failed resources (up to 3 shown): File[wiki.worldsofweary.com_private],File[www.portalsofphereon.com],File[www.portalsofphereon.com_private],File[programming.red] [22:39:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.82, 8.03, 7.60 [22:39:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.42, 7.72, 7.68 [22:41:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb [22:42:09] PROBLEM - bast121 Puppet on bast121 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [22:42:26] PROBLEM - mw121 Puppet on mw121 is CRITICAL: CRITICAL: Puppet has 747 failures. Last run 2 minutes ago with 747 failures. Failed resources (up to 3 shown): File[authority certificates],File[/etc/apt/apt.conf.d/50unattended-upgrades],File[/etc/apt/apt.conf.d/20auto-upgrades],File[/etc/modprobe.d/nf_conntrack.conf] [22:42:36] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb [22:43:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 4.86, 7.35, 7.52 [22:44:01] PROBLEM - mw102 Puppet on mw102 is CRITICAL: CRITICAL: Puppet has 486 failures. 
Last run 3 minutes ago with 486 failures. Failed resources (up to 3 shown): File[wiki.rebirthofthenight.com],File[wiki.rebirthofthenight.com_private],File[beaconspace.unrestrictedlorefare.com],File[beaconspace.unrestrictedlorefare.com_private] [22:44:33] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [22:45:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [22:45:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.44, 8.06, 7.76 [22:46:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 4.55, 6.72, 7.78 [22:47:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.52, 7.98, 7.91 [22:48:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.97, 8.09, 8.16 [22:49:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 4.97, 6.46, 7.83 [22:49:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 4.66, 7.06, 7.48 [22:51:41] PROBLEM - cp30 Current Load on cp30 is CRITICAL: CRITICAL - load average: 3.10, 2.89, 1.73 [22:53:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 4.68, 7.10, 7.91 [22:53:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 9.15, 7.90, 7.70 [22:53:41] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.46, 6.83, 7.27 [22:54:19] PROBLEM - mw101 Puppet on mw101 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[22:54:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.81, 7.44, 7.89 [22:55:05] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.74, 3.75, 3.39 [22:55:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.04, 7.83, 7.72 [22:55:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.61, 7.22, 7.37 [22:56:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.27, 7.86, 7.97 [22:57:40] RECOVERY - cp30 Current Load on cp30 is OK: OK - load average: 1.03, 1.62, 1.51 [22:58:42] [puppet] Universal-Omega opened pull request #2281: site: combine SCSVG db* - https://git.io/J9OuT [22:58:56] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.62, 3.88, 3.54 [22:58:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.50, 7.64, 7.88 [22:59:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.24, 6.94, 7.48 [22:59:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.13, 7.51, 7.43 [23:00:01] !log [@mwtask111] starting deploy of {'l10nupdate': True} to scsvg [23:00:02] !log [@test101] starting deploy of {'l10nupdate': True} to skip [23:00:02] !log [@mw11] starting deploy of {'l10nupdate': True} to ovlon [23:00:03] !log [@test3] starting deploy of {'l10nupdate': True} to skip [23:00:31] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [23:00:52] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.42, 3.90, 3.57 [23:00:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.72, 8.54, 8.21 [23:01:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.90, 7.16, 7.88 [23:01:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.26, 6.75, 7.21 [23:01:26] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log 
[23:01:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.10, 6.68, 7.15 [23:02:08] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [23:02:19] RECOVERY - mw101 Puppet on mw101 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [23:02:35] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [23:03:04] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb [23:03:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 10.30, 8.09, 8.11 [23:04:00] [puppet] paladox closed pull request #2281: site: combine SCSVG db* - https://git.io/J9OuT [23:04:02] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/J9Ou6 [23:04:03] [miraheze/puppet] Universal-Omega d6fceea - site: combine SCSVG db* (#2281) [23:04:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.62, 7.62, 7.93 [23:05:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.97, 7.97, 7.67 [23:05:45] PROBLEM - mw13 MediaWiki Rendering on mw13 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:05:54] PROBLEM - cloud5 Current Load on cloud5 is WARNING: WARNING - load average: 21.85, 19.40, 18.12 [23:06:56] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [23:07:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 5.56, 7.20, 7.56 [23:07:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.53, 6.81, 7.63 [23:07:42] RECOVERY - mw13 MediaWiki Rendering on mw13 is OK: HTTP OK: HTTP/1.1 200 OK - 23540 bytes in 0.345 second response time [23:07:51] RECOVERY - cloud5 Current Load on cloud5 is OK: OK - load average: 17.41, 18.94, 18.14 [23:08:34] RECOVERY - mw111 Puppet on mw111 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 
failures [23:09:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.46, 7.97, 7.74 [23:09:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.33, 7.61, 7.78 [23:10:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [23:10:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.85, 7.56, 7.60 [23:11:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.94, 7.58, 7.57 [23:11:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.13, 7.24, 7.56 [23:11:15] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.73, 7.42, 7.75 [23:11:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 3.32, 6.29, 7.15 [23:11:58] RECOVERY - mw102 Puppet on mw102 is OK: OK: Puppet is currently enabled, last run 51 seconds ago with 0 failures [23:12:10] RECOVERY - bast121 Puppet on bast121 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [23:12:20] RECOVERY - mw121 Puppet on mw121 is OK: OK: Puppet is currently enabled, last run 56 seconds ago with 0 failures [23:12:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.42, 7.80, 7.71 [23:13:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.70, 7.76, 7.81 [23:13:40] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.47, 7.14, 7.47 [23:13:55] [puppet] Universal-Omega opened pull request #2282: site: split SCSVG and Ovlon MediaWiki servers - https://git.io/J9OzV [23:14:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.43, 3.45, 3.76 [23:14:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.05, 7.69, 7.67 [23:15:18] [puppet] Universal-Omega synchronize pull request #2282: site: split SCSVG and Ovlon MediaWiki servers - https://git.io/J9OzV 
[23:15:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.61, 6.79, 7.12 [23:15:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [23:15:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.14, 6.58, 7.21 [23:16:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.08, 7.42, 7.57 [23:17:06] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.06, 7.87, 7.88 [23:17:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 5.70, 7.07, 7.53 [23:17:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 4.77, 6.17, 6.87 [23:18:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.86, 3.88, 3.85 [23:19:07] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.17, 8.15, 7.97 [23:20:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 9.77, 7.91, 7.66 [23:21:02] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 5.62, 7.80, 7.95 [23:21:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.12, 7.45, 7.57 [23:21:35] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 3.64, 5.51, 6.52 [23:21:40] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.30, 5.79, 6.67 [23:22:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.98, 7.46, 7.52 [23:25:24] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 198.244.148.90/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [23:25:33] PROBLEM - cloud5 Current Load on cloud5 is WARNING: WARNING - load average: 22.33, 19.20, 17.73 [23:25:35] PROBLEM - mw11 Current Load on mw11 is CRITICAL: CRITICAL - load average: 8.14, 7.18, 6.97 [23:25:40] PROBLEM - mw10 Current Load on mw10 is 
CRITICAL: CRITICAL - load average: 14.50, 9.30, 7.81 [23:26:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.94, 6.99, 7.23 [23:27:21] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [23:27:55] PROBLEM - test3 APT on test3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [23:28:14] [puppet] Universal-Omega closed pull request #2279: Revert all OOM mitigations - https://git.io/JS7Nr [23:28:33] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.95, 3.75, 3.85 [23:28:58] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.98, 7.16, 7.25 [23:29:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.56, 7.13, 7.48 [23:29:51] RECOVERY - test3 APT on test3 is OK: APT OK: 19 packages available for upgrade (0 critical updates). [23:30:14] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[23:31:01] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.05, 6.60, 7.21 [23:31:33] RECOVERY - cloud5 Current Load on cloud5 is OK: OK - load average: 16.18, 18.80, 18.24 [23:32:10] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.560 second response time [23:32:58] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 10.44, 8.06, 7.51 [23:33:01] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 10.54, 8.35, 7.80 [23:38:04] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/J9OAf [23:38:05] [miraheze/puppet] paladox a4d60db - mediawiki: Increase opcache.revalidate_freq to 30 for php [23:38:07] [puppet] paladox created branch paladox-patch-4 - https://git.io/vbiAS [23:38:08] [puppet] paladox opened pull request #2283: mediawiki: Increase opcache.revalidate_freq to 30 for php - https://git.io/J9OAG [23:38:14] [puppet] paladox closed pull request #2283: mediawiki: Increase opcache.revalidate_freq to 30 for php - https://git.io/J9OAG [23:38:16] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/J9Ox8 [23:38:17] [miraheze/puppet] paladox fdf7828 - mediawiki: Increase opcache.revalidate_freq to 30 for php (#2283) [23:38:19] [miraheze/puppet] paladox deleted branch paladox-patch-4 [23:38:20] [puppet] paladox deleted branch paladox-patch-4 - https://git.io/vbiAS [23:38:32] !log [@test3] finished deploy of {'l10nupdate': True} to skip - SUCCESS in 2310s [23:39:01] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [23:40:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 1.53, 2.65, 3.27 [23:40:55] !log [@mw11] finished deploy of {'l10nupdate': True} to ovlon - SUCCESS in 2452s [23:40:59] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 
51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb [23:41:35] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 5.24, 7.09, 7.58 [23:41:36] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [23:41:41] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 6 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [23:41:56] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [23:42:05] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [23:42:07] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 9.651 second response time PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 6.733 second response time [23:42:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.33, 20.46, 18.53 [23:42:55] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [23:43:14] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[23:43:34] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [23:45:14] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 4.420 second response time [23:46:09] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 9.739 second response time [23:46:19] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 19.13, 19.91, 18.75 [23:46:34] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [23:46:50] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 2 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2607:5300:201:3100::929a/cpweb [23:49:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.27, 6.70, 7.82 [23:49:34] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 3.77, 5.33, 6.68 [23:50:17] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/J93oO [23:50:18] [miraheze/puppet] paladox 77a23ef - mediawiki: fix globling certs [23:50:19] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.10, 20.66, 19.37 [23:50:27] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.596 second response time [23:50:27] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.835 second response time [23:50:30] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 4.329 second response time [23:50:39] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 1.336 second response time [23:50:42] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [23:51:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.36, 7.20, 7.88 [23:55:40] PROBLEM - mw10 Current Load on mw10 is 
CRITICAL: CRITICAL - load average: 9.34, 7.77, 7.93 [23:57:30] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 2607:5300:201:3100::5ebc/cpweb [23:57:40] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 4.89, 6.91, 7.62 [23:58:19] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 17.66, 19.72, 19.69 [23:59:15] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 9.17, 6.93, 7.33 [23:59:30] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online