[00:02:12] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 32.09, 23.82, 20.80
[00:04:11] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 23.08, 23.54, 21.09
[00:06:10] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 26.66, 24.65, 21.78
[00:08:37] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::5ebc/cpweb
[00:10:36] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[00:10:58] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb
[00:11:51] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.82, 3.40, 3.00
[00:13:39] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:13:43] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:13:51] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.96, 3.22, 2.99
[00:14:06] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:14:11] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:14:11] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:15:37] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.812 second response time
[00:15:45] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 1.931 second response time
[00:16:06] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 1.426 second response time
[00:16:08] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.113 second response time
[00:16:10] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 20524 bytes in 0.155 second response time
[00:17:49] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.83, 3.86, 3.29
[00:18:53] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[00:21:16] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.77, 7.41, 6.34
[00:21:53] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:21:54] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:21:59] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:22:03] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.019 second response time
[00:22:05] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:22:28] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb
[00:22:29] SRE, I keep getting persistent 502s on `loginwiki`
[00:22:34] likely related to ^ though
[00:22:41] PROBLEM - cp21 Stunnel Http for mw8 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:22:44] PROBLEM - cp30 Stunnel Http for mw8 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:22:46] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[00:22:53] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb
[00:23:03] PROBLEM - cp31 Stunnel Http for mw8 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:23:08] dmehus: just let it calm down
[00:23:11] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.56, 7.49, 6.50
[00:23:17] PROBLEM - cp20 Stunnel Http for mw8 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:23:40] PROBLEM - mw8 MediaWiki Rendering on mw8 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 328 bytes in 0.057 second response time
[00:23:47] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.91, 3.56, 3.39
[00:23:58] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.476 second response time
[00:24:00] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 8.845 second response time
[00:24:03] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 9.742 second response time
[00:24:09] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 8.198 second response time
[00:24:10] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 8.233 second response time
[00:26:50] RECOVERY - cp21 Stunnel Http for mw8 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 1.751 second response time
[00:26:53] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[00:26:55] RECOVERY - cp30 Stunnel Http for mw8 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 2.562 second response time
[00:26:56] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 1.365 second response time
[00:27:26] RECOVERY - cp20 Stunnel Http for mw8 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14554 bytes in 9.038 second response time
[00:28:56] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.85, 6.62, 6.47
[00:29:27] RECOVERY - cp31 Stunnel Http for mw8 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14554 bytes in 5.986 second response time
[00:29:45] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.41, 3.05, 3.23
[00:29:48] RECOVERY - mw8 MediaWiki Rendering on mw8 is OK: HTTP OK: HTTP/1.1 200 OK - 20514 bytes in 2.512 second response time
[00:30:24] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[00:31:53] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSec9
[00:31:54] [miraheze/puppet] paladox 9fb2171 - mariadb: Support http proxy
[00:31:56] [puppet] paladox created branch paladox-patch-4 - https://git.io/vbiAS
[00:31:57] [puppet] paladox opened pull request #2202: mariadb: Support http proxy - https://git.io/JSecQ
[00:32:42] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSeCk
[00:32:44] [miraheze/puppet] paladox 6e323ea - Update db101.yaml
[00:32:45] [puppet] paladox synchronize pull request #2202: mariadb: Support http proxy - https://git.io/JSecQ
[00:32:53] [puppet] paladox closed pull request #2202: mariadb: Support http proxy - https://git.io/JSecQ
[00:32:54] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±2] https://git.io/JSeCY
[00:32:56] [miraheze/puppet] paladox 8e36558 - mariadb: Support http proxy (#2202)
[00:32:57] [puppet] paladox deleted branch paladox-patch-4 - https://git.io/vbiAS
[00:32:59] [miraheze/puppet] paladox deleted branch paladox-patch-4
[00:33:53] PROBLEM - cp31 Current Load on cp31 is CRITICAL: CRITICAL - load average: 36.06, 10.22, 4.17
[00:38:02] PROBLEM - cp30 Current Load on cp30 is CRITICAL: CRITICAL - load average: 2.78, 4.34, 2.52
[00:38:17] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb
[00:38:53] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::1b80/cpweb
[00:40:16] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[00:42:58] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.95, 6.34, 5.46
[00:43:55] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.11, 21.88, 23.53
[00:44:13] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+1/-0/±0] https://git.io/JSe4L
[00:44:14] [miraheze/puppet] paladox 08ef918 - Install ldap111
[00:44:16] [puppet] paladox created branch paladox-patch-4 - https://git.io/vbiAS
[00:44:17] [puppet] paladox opened pull request #2203: Install ldap111 - https://git.io/JSe4t
[00:44:39] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSe4c
[00:44:41] [miraheze/puppet] paladox f9b54b7 - Update site.pp
[00:44:42] [puppet] paladox synchronize pull request #2203: Install ldap111 - https://git.io/JSe4t
[00:45:07] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSe4R
[00:45:08] [miraheze/puppet] paladox 4769aab - Update ldap111.yaml
[00:45:10] [puppet] paladox synchronize pull request #2203: Install ldap111 - https://git.io/JSe4t
[00:45:14] [puppet] paladox closed pull request #2203: Install ldap111 - https://git.io/JSe4t
[00:45:15] [miraheze/puppet] paladox pushed 1 commit to master [+1/-0/±1] https://git.io/JSe4a
[00:45:17] [miraheze/puppet] paladox 36c6d96 - Install ldap111 (#2203)
[00:45:18] [miraheze/puppet] paladox deleted branch paladox-patch-4
[00:45:20] [puppet] paladox deleted branch paladox-patch-4 - https://git.io/vbiAS
[00:46:53] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[00:46:54] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.32, 6.01, 5.55
[00:50:53] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb
[00:52:08] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb
[00:52:38] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[00:54:02] PROBLEM - cp30 Current Load on cp30 is WARNING: WARNING - load average: 0.72, 1.16, 1.89
[00:54:35] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.06, 4.08, 3.59
[00:54:41] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.227 second response time
[00:54:53] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[00:56:06] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[00:57:49] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: CRITICAL - load average: 25.66, 23.91, 23.54
[00:57:53] PROBLEM - cp31 Current Load on cp31 is WARNING: WARNING - load average: 0.48, 0.87, 1.96
[00:58:02] RECOVERY - cp30 Current Load on cp30 is OK: OK - load average: 0.34, 0.74, 1.55
[00:59:48] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.98, 22.29, 22.96
[01:00:33] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.00, 3.78, 3.63
[01:01:53] RECOVERY - cp31 Current Load on cp31 is OK: OK - load average: 0.32, 0.69, 1.64
[01:02:32] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.61, 3.87, 3.67
[01:02:48] [miraheze/puppet] paladox pushed 1 commit to master [+1/-0/±0] https://git.io/JSez2
[01:02:50] [miraheze/puppet] paladox cec2b76 - Create cloud10.yaml
[01:02:57] [miraheze/puppet] paladox pushed 1 commit to master [+1/-0/±0] https://git.io/JSezo
[01:02:58] [miraheze/puppet] paladox 4d39584 - Create cloud11.yaml
[01:03:06] [miraheze/puppet] paladox pushed 1 commit to master [+1/-0/±0] https://git.io/JSezX
[01:03:08] [miraheze/puppet] paladox e31da1f - Create cloud12.yaml
[01:03:08] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.05, 6.81, 6.29
[01:04:32] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.27, 3.64, 3.61
[01:05:05] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.96, 6.69, 6.32
[01:12:07] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.24, 5.14, 5.97
[01:13:42] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 15.84, 18.21, 20.04
[01:16:58] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+1/-0/±0] https://git.io/JSewW
[01:17:00] [miraheze/puppet] paladox a3d28d2 - Install mon111
[01:17:01] [puppet] paladox created branch paladox-patch-4 - https://git.io/vbiAS
[01:17:03] [puppet] paladox opened pull request #2204: Install mon111 - https://git.io/JSew8
[01:18:02] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSew9
[01:18:04] [miraheze/puppet] paladox bbbafca - Update grafana.ini.erb
[01:18:05] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:18:27] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.15, 3.79, 3.64
[01:19:39] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb
[01:20:40] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSerS
[01:20:42] [miraheze/puppet] paladox 608938e - Update init.pp
[01:20:43] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:21:03] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSerA
[01:21:04] [miraheze/puppet] paladox 7b66eee - Update mon111.yaml
[01:21:06] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:21:26] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSeok
[01:21:27] [miraheze/puppet] paladox d53ee62 - Update mon111.yaml
[01:21:29] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:21:38] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online
[01:21:39] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.12, 19.94, 19.93
[01:21:50] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 149.56.140.43/cpweb
[01:22:07] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 7.55, 5.51, 5.62
[01:22:25] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.22, 3.87, 3.74
[01:23:15] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 7.51, 6.61, 6.21
[01:23:19] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.05, 7.31, 6.60
[01:24:13] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSeKY
[01:24:15] [miraheze/puppet] paladox c29b8f4 - Update config.ini.php.erb
[01:24:16] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:25:13] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 6.35, 6.38, 6.17
[01:25:18] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSeKa
[01:25:19] [miraheze/puppet] paladox 410fd16 - Update init.pp
[01:25:19] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.20, 6.49, 6.38
[01:25:21] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:25:35] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSeKi
[01:25:37] [miraheze/puppet] paladox 075ec3e - Update mon111.yaml
[01:25:38] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:27:42] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 15.81, 19.34, 19.97
[01:27:47] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online
[01:28:07] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.05, 5.84, 5.81
[01:30:07] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.28, 6.02, 5.87
[01:34:44] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSePB
[01:34:45] [miraheze/puppet] paladox 531a245 - Update resources.ini.erb
[01:34:47] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:35:37] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSePX
[01:35:38] [miraheze/puppet] paladox 6403040 - Update init.pp
[01:35:40] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:36:07] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 6.00, 5.71, 5.76
[01:36:18] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSePx
[01:36:20] [miraheze/puppet] paladox bf44102 - Update icinga2.pp
[01:36:21] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:36:41] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSeXO
[01:36:42] [miraheze/puppet] paladox 90b1c42 - Update mon111.yaml
[01:36:44] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:38:11] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSeXM
[01:38:13] [miraheze/puppet] paladox 438ff73 - Update icinga2.conf
[01:38:14] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:38:20] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.18, 3.65, 3.62
[01:38:35] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSeXF
[01:38:36] [miraheze/puppet] paladox 1f050e1 - Update grafana.conf
[01:38:38] [puppet] paladox synchronize pull request #2204: Install mon111 - https://git.io/JSew8
[01:40:19] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.30, 3.52, 3.57
[01:41:03] PROBLEM - cp30 Current Load on cp30 is CRITICAL: CRITICAL - load average: 2.26, 2.40, 1.54
[01:42:59] PROBLEM - cp30 Current Load on cp30 is WARNING: WARNING - load average: 0.96, 1.85, 1.44
[01:44:07] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 4.30, 4.29, 5.00
[01:44:18] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.74, 2.99, 3.34
[01:44:56] RECOVERY - cp30 Current Load on cp30 is OK: OK - load average: 0.63, 1.44, 1.33
[01:50:07] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.32, 5.05, 5.15
[01:51:31] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.83, 6.56, 5.56
[01:52:11] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-5 [+0/-0/±1] https://git.io/JSeS0
[01:52:12] [miraheze/puppet] paladox 87a6a52 - cloud: Add support for ferm
[01:52:13] [puppet] paladox created branch paladox-patch-5 - https://git.io/vbiAS
[01:52:14] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.54, 3.85, 3.54
[01:52:15] [puppet] paladox opened pull request #2205: cloud: Add support for ferm - https://git.io/JSeSu
[01:52:31] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.58, 21.06, 19.40
[01:52:41] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.84, 6.87, 5.67
[01:53:14] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-6 [+0/-0/±1] https://git.io/JSeSH
[01:53:16] [miraheze/puppet] paladox c409eff - cloud: Add support for ferm
[01:53:17] [puppet] paladox created branch paladox-patch-6 - https://git.io/vbiAS
[01:53:19] [puppet] paladox opened pull request #2206: cloud: Add support for ferm - https://git.io/JSeSQ
[01:53:23] paladox: what's the difference between test and https://github.com/miraheze/puppet/pull/2182?
[01:53:24] [url] cloud: Add support for ferm by paladox · Pull Request #2182 · miraheze/puppet · GitHub | github.com
[01:53:32] *that
[01:53:40] based on a newer commit
[01:53:51] i would have had to check it out locally to rebase
[01:54:07] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.59, 4.37, 4.87
[01:54:14] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.76, 3.98, 3.64
[01:54:17] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-6 [+0/-0/±1] https://git.io/JSe9q
[01:54:18] [miraheze/puppet] paladox 11423da - Update cloud10.yaml
[01:54:20] [puppet] paladox synchronize pull request #2206: cloud: Add support for ferm - https://git.io/JSeSQ
[01:54:26] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-6 [+0/-0/±1] https://git.io/JSe9O
[01:54:27] [miraheze/puppet] paladox bf3c7c1 - Update cloud11.yaml
[01:54:29] [puppet] paladox synchronize pull request #2206: cloud: Add support for ferm - https://git.io/JSeSQ
[01:54:29] paladox: Ah. Just was curious. Makes sense. Thanks!
[01:54:31] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 16.81, 19.29, 18.93 [01:54:35] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 6.19, 6.39, 5.62 [01:54:37] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-6 [+0/-0/±1] 13https://git.io/JSe9n [01:54:38] [02miraheze/puppet] 07paladox 03dd81d5c - Update cloud12.yaml [01:54:40] [02puppet] 07paladox synchronize pull request 03#2206: cloud: Add support for ferm - 13https://git.io/JSeSQ [01:54:46] [02puppet] 07paladox closed pull request 03#2205: cloud: Add support for ferm - 13https://git.io/JSeSu [01:54:48] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-5 [01:54:49] [02puppet] 07paladox deleted branch 03paladox-patch-5 - 13https://git.io/vbiAS [01:54:53] [02puppet] 07paladox closed pull request 03#2182: cloud: Add support for ferm - 13https://git.io/JyD0D [01:54:54] [02puppet] 07paladox deleted branch 03paladox-patch-1 - 13https://git.io/vbiAS [01:54:56] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-1 [01:55:03] [02puppet] 07paladox closed pull request 03#2206: cloud: Add support for ferm - 13https://git.io/JSeSQ [01:55:05] [02puppet] 07paladox deleted branch 03paladox-patch-6 - 13https://git.io/vbiAS [01:55:06] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-6 [01:55:08] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±4] 13https://git.io/JSe9E [01:55:09] [02miraheze/puppet] 07paladox 0338d4a51 - cloud: Add support for ferm (#2206) [01:55:31] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 6.12, 6.31, 5.68 [01:56:07] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSe9d [01:56:08] [02miraheze/puppet] 07paladox 0341dd43a - Install cloud role on cloud1[012] [01:56:13] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.44, 4.62, 3.92 [01:58:14] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-4 [+0/-0/±1] 
13https://git.io/JSeHr [01:58:16] [02miraheze/puppet] 07paladox 03f9b122c - Update site.pp [01:58:18] [02puppet] 07paladox synchronize pull request 03#2204: Install mon111 - 13https://git.io/JSew8 [02:00:12] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.21, 3.90, 3.78 [02:02:11] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.42, 4.22, 3.89 [02:04:10] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 1.27, 3.16, 3.55 [02:05:13] [02puppet] 07paladox closed pull request 03#2204: Install mon111 - 13https://git.io/JSew8 [02:05:15] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+1/-0/±10] 13https://git.io/JSe7S [02:05:16] [02miraheze/puppet] 07paladox 0394d0090 - Install mon111 (#2204) [02:05:18] [02puppet] 07paladox deleted branch 03paladox-patch-4 - 13https://git.io/vbiAS [02:05:19] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-4 [02:06:09] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.53, 2.89, 3.39 [02:06:16] PROBLEM - cp30 Current Load on cp30 is CRITICAL: CRITICAL - load average: 2.15, 1.66, 1.20 [02:08:12] RECOVERY - cp30 Current Load on cp30 is OK: OK - load average: 1.05, 1.40, 1.16 [02:08:39] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSe55 [02:08:40] [02miraheze/puppet] 07paladox 033463962 - irc: Add support for bullseye [02:08:42] [02puppet] 07paladox created branch 03paladox-patch-1 - 13https://git.io/vbiAS [02:08:43] [02puppet] 07paladox opened pull request 03#2207: irc: Add support for bullseye - 13https://git.io/JSe5b [02:08:51] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSe5p [02:08:53] [02miraheze/puppet] 07paladox 033d9f736 - Update ircecho.pp [02:08:55] [02puppet] 07paladox synchronize pull request 03#2207: irc: Add support for bullseye - 13https://git.io/JSe5b [02:09:58] [02miraheze/puppet] 07paladox pushed 031 commit 
to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSedC [02:10:00] [02miraheze/puppet] 07paladox 0326536c7 - Update init.pp [02:10:01] [02puppet] 07paladox synchronize pull request 03#2207: irc: Add support for bullseye - 13https://git.io/JSe5b [02:10:07] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.62, 3.57, 3.58 [02:10:18] [02puppet] 07paladox closed pull request 03#2207: irc: Add support for bullseye - 13https://git.io/JSe5b [02:10:20] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±2] 13https://git.io/JSedR [02:10:21] [02miraheze/puppet] 07paladox 035b8d506 - irc: Add support for bullseye (#2207) [02:10:23] [02puppet] 07paladox deleted branch 03paladox-patch-1 - 13https://git.io/vbiAS [02:10:24] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-1 [02:13:42] [02miraheze/dns] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSeF4 [02:13:44] [02miraheze/dns] 07paladox 0367da2a0 - Setup icings-new and grafana-new temporarily [02:14:06] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.39, 3.94, 3.73 [02:16:05] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.68, 4.00, 3.79 [02:22:04] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.08, 3.93, 3.80 [02:24:03] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.24, 3.57, 3.67 [02:24:58] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSeAB [02:25:00] [02miraheze/puppet] 07paladox 035cff164 - Fix installing mariadb apt repo [02:25:01] [02puppet] 07paladox created branch 03paladox-patch-1 - 13https://git.io/vbiAS [02:25:03] [02puppet] 07paladox opened pull request 03#2208: Fix installing mariadb apt repo - 13https://git.io/JSeA0 [02:27:25] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSexI [02:27:26] [02miraheze/puppet] 07paladox 0328452e2 - 
Update packages.pp [02:27:28] [02puppet] 07paladox synchronize pull request 03#2208: Fix installing mariadb apt repo - 13https://git.io/JSeA0 [02:27:32] [02puppet] 07paladox closed pull request 03#2208: Fix installing mariadb apt repo - 13https://git.io/JSeA0 [02:27:33] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±2] 13https://git.io/JSexq [02:27:35] [02miraheze/puppet] 07paladox 03953f13b - Fix installing mariadb apt repo (#2208) [02:27:36] [02puppet] 07paladox deleted branch 03paladox-patch-1 - 13https://git.io/vbiAS [02:27:38] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-1 [02:28:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [02:34:00] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.03, 3.85, 3.71 [02:35:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [02:36:29] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSehN [02:36:31] [02miraheze/puppet] 07paladox 03d586468 - Update init.pp [02:39:58] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.42, 3.91, 3.82 [02:47:01] SRE, just as an FYI, I'm putting through three global renames. 
Hopefully the jobrunner can handle it :D [02:50:20] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSvJe [02:50:21] [02miraheze/puppet] 07paladox 03ad940da - Update mon111.yaml [02:53:53] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 3.32, 3.14, 3.40 [02:59:52] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSvTH [02:59:53] [02miraheze/puppet] 07paladox 0341e7f7c - Update mon111.yaml [03:02:49] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.71, 3.41, 3.41 [03:05:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [03:08:08] !log test [03:08:13] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [03:09:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [03:10:46] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.29, 3.57, 3.48 [03:12:29] !log sudo -u www-data php /srv/mediawiki/w/maintenance/rebuildall.php --wiki=sunrinwiki [03:12:33] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [03:12:45] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.73, 3.76, 3.58 [03:14:45] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.11, 3.02, 3.32 [03:15:02] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSvqv [03:15:04] [02miraheze/puppet] 07paladox 032865bad - monitoring: Support binding to ipv6 for api [03:15:05] [02puppet] 07paladox created branch 03paladox-patch-1 - 13https://git.io/vbiAS [03:15:07] [02puppet] 07paladox opened pull request 03#2209: monitoring: Support binding to ipv6 for api - 13https://git.io/JSvqJ [03:16:10] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSvql [03:16:11] [02miraheze/puppet] 07paladox 
035ead7a6 - Update icinga2.pp [03:16:13] [02puppet] 07paladox synchronize pull request 03#2209: monitoring: Support binding to ipv6 for api - 13https://git.io/JSvqJ [03:16:22] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSvq0 [03:16:23] [02miraheze/puppet] 07paladox 0357b7864 - Update mon111.yaml [03:16:25] [02puppet] 07paladox synchronize pull request 03#2209: monitoring: Support binding to ipv6 for api - 13https://git.io/JSvqJ [03:16:28] [02puppet] 07paladox closed pull request 03#2209: monitoring: Support binding to ipv6 for api - 13https://git.io/JSvqJ [03:16:30] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±3] 13https://git.io/JSvqu [03:16:31] [02miraheze/puppet] 07paladox 03fa27c3d - monitoring: Support binding to ipv6 for api (#2209) [03:16:33] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-1 [03:16:34] [02puppet] 07paladox deleted branch 03paladox-patch-1 - 13https://git.io/vbiAS [03:23:41] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.56, 3.87, 3.56 [03:25:40] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.80, 3.45, 3.45 [03:27:29] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSvO2 [03:27:30] [02miraheze/puppet] 07paladox 03832ab90 - grafana: Use local gpg key [03:27:32] [02puppet] 07paladox created branch 03paladox-patch-1 - 13https://git.io/vbiAS [03:27:33] [02puppet] 07paladox opened pull request 03#2210: grafana: Use local gpg key - 13https://git.io/JSvOa [03:27:39] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.77, 3.24, 3.37 [03:28:25] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+1/-0/±0] 13https://git.io/JSvOy [03:28:27] [02miraheze/puppet] 07paladox 030ed1c1a - Add files via upload [03:28:28] [02puppet] 07paladox synchronize pull request 03#2210: grafana: Use local gpg key - 13https://git.io/JSvOa [03:29:39] ok 
: [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [03:29:46] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+1/-1/±0] 13https://git.io/JSv3k [03:29:47] [02miraheze/puppet] 07paladox 03b5cc97a - move modules/grafana/files/gpg.key modules/grafana/files/grafana.gpg [03:29:49] [02puppet] 07paladox synchronize pull request 03#2210: grafana: Use local gpg key - 13https://git.io/JSvOa [03:29:59] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+1/-0/±1] 13https://git.io/JSv3m [03:30:00] [02miraheze/puppet] 07paladox 0338a199a - grafana: Use local gpg key (#2210) [03:30:02] [02puppet] 07paladox closed pull request 03#2210: grafana: Use local gpg key - 13https://git.io/JSvOa [03:30:04] [02puppet] 07paladox deleted branch 03paladox-patch-1 - 13https://git.io/vbiAS [03:30:05] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-1 [03:33:58] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSvs3 [03:33:59] [02miraheze/puppet] 07paladox 03eed99d4 - Update grafana.gpg [03:35:36] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.27, 3.60, 3.39 [03:39:34] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.61, 3.84, 3.57 [03:39:56] PROBLEM - mw10 Current Load on mw10 is CRITICAL: CRITICAL - load average: 8.46, 6.03, 5.38 [03:41:34] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.60, 3.98, 3.65 [03:41:52] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.81, 6.02, 5.46 [03:42:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [03:43:33] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.13, 3.62, 3.55 [03:47:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [03:49:31] RECOVERY - mon2 Current Load on mon2 is OK: OK - 
load average: 2.54, 3.04, 3.33 [03:50:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [03:53:32] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.89, 6.35, 5.48 [03:55:31] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.89, 5.82, 5.39 [03:59:27] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.39, 3.98, 3.63 [04:01:27] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.28, 3.75, 3.59 [04:03:34] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSv4c [04:03:36] [02miraheze/puppet] 07paladox 0396066bc - nrpe: Allow 2a10:6740::6:205 (mon111) [04:07:24] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.50, 3.16, 3.38 [04:18:17] [02puppet] 07Universal-Omega opened pull request 03#2211: check_reverse_dns: use `dns.resolver.Resolver.resolve()` - 13https://git.io/JSvEh [04:18:29] paladox: ^ [04:18:59] CosmicAlpha: which icinga did you see that on [04:19:16] i'm not sure if that happens on a newer version only (e.g. bullseye compared to buster) [04:19:31] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.06, 6.40, 5.79 [04:19:40] paladox: seems to be new version only. [04:19:54] Ok we can merge later, don't want to break the old cluster :) [04:20:03] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 23.05, 20.50, 18.87 [04:20:12] paladox: yeah sounds good! 
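Note on the `dns.resolver.Resolver.resolve()` discussion above: dnspython 2.x added `Resolver.resolve()` and deprecated `Resolver.query()`, while 1.x only has `query()`. That fits the observation that the problem shows up on the newer (bullseye) hosts only. A minimal sketch of a lookup that works on both clusters, assuming the check holds a dnspython-style resolver object; `pick_lookup` is a hypothetical helper and not the actual `check_reverse_dns` code:

```python
# Hypothetical helper for illustration; not taken from check_reverse_dns.
def pick_lookup(resolver):
    """Return the lookup method to call on a dnspython-style resolver.

    dnspython 2.x offers Resolver.resolve() and deprecates Resolver.query();
    dnspython 1.x only has Resolver.query(). Prefer resolve() when present.
    """
    lookup = getattr(resolver, "resolve", None)
    if callable(lookup):
        return lookup
    return resolver.query


if __name__ == "__main__":
    try:
        import dns.resolver  # third-party package: dnspython
    except ImportError:
        print("dnspython is not installed; nothing to demonstrate")
    else:
        lookup = pick_lookup(dns.resolver.Resolver())
        # Prints "resolve" on dnspython 2.x and "query" on 1.x.
        print("using", lookup.__name__)
```

With a fallback like this the same script could run on the old and the new cluster, so the change would not need to wait on the migration.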
[04:20:13] [02puppet] 07paladox commented on pull request 03#2211: check_reverse_dns: use `dns.resolver.Resolver.resolve()` - 13https://git.io/JSvuo [04:21:32] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.93, 6.27, 5.82 [04:21:36] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.44, 5.59, 4.96 [04:22:18] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.54, 3.80, 3.46 [04:24:01] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 16.27, 19.17, 18.82 [04:24:01] [02puppet] 07Universal-Omega edited pull request 03#2211: check_reverse_dns: use `dns.resolver.Resolver.resolve()` - 13https://git.io/JSvEh [04:25:26] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.27, 5.79, 5.17 [04:26:17] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.47, 3.38, 3.40 [04:27:20] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.47, 5.94, 5.32 [04:28:04] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSv2v [04:28:05] [02miraheze/puppet] 07paladox 03042e2ac - Update gluster121.yaml [04:28:16] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSv2U [04:28:17] [02miraheze/puppet] 07paladox 035df4047 - Update gluster111.yaml [04:29:15] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.02, 6.00, 5.41 [04:30:04] [02miraheze/CreateWiki] 07Universal-Omega pushed 031 commit to 03Universal-Omega-patch-1 [+0/-0/±1] 13https://git.io/JSv26 [04:30:06] [02miraheze/CreateWiki] 07Universal-Omega 03ee58870 - Use selectorother type for canned responses to support custom [04:30:07] [02CreateWiki] 07Universal-Omega created branch 03Universal-Omega-patch-1 - 13https://git.io/vpJTL [04:30:09] [02CreateWiki] 07Universal-Omega opened pull request 03#272: Use selectorother type for canned responses to support custom - 
13https://git.io/JSv2X [04:31:10] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.01, 5.79, 5.42 [04:35:13] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.58, 3.57, 3.47 [04:35:31] miraheze/CreateWiki - Universal-Omega the build passed. [04:36:55] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.52, 4.41, 4.94 [04:40:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [04:42:02] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.18, 6.58, 5.98 [04:44:02] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.37, 5.93, 5.81 [04:45:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [04:49:12] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.06, 3.93, 3.62 [04:51:11] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.73, 3.63, 3.56 [04:52:15] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.23, 4.77, 4.62 [04:52:47] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.64, 20.07, 18.73 [04:53:11] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.40, 3.80, 3.62 [04:54:46] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 19.79, 19.66, 18.72 [04:55:11] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.49, 3.58, 3.56 [04:56:07] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.48, 5.38, 4.87 [04:58:07] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 5.02, 5.01, 4.79 [04:59:11] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.12, 3.06, 3.37 [05:05:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [05:12:39] alerting : [FIRING:1] 
(PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [05:21:12] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.39, 3.39, 3.04 [05:23:11] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.95, 3.56, 3.15 [05:25:11] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.83, 3.22, 3.07 [05:32:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [05:41:06] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.40, 3.41, 3.25 [05:43:05] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.67, 3.60, 3.32 [05:44:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [05:47:04] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.25, 3.59, 3.39 [05:53:02] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.92, 3.36, 3.36 [05:59:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [06:04:48] PROBLEM - db11 Current Load on db11 is CRITICAL: CRITICAL - load average: 8.67, 6.04, 4.14 [06:06:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [06:10:52] RECOVERY - db13 Disk Space on db13 is OK: DISK OK - free space: / 55999 MB (12% inode=98%); [06:22:48] PROBLEM - db11 Current Load on db11 is WARNING: WARNING - load average: 7.27, 7.91, 6.99 [06:22:50] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.16, 3.62, 3.05 [06:24:47] PROBLEM - db11 Current Load on db11 is CRITICAL: CRITICAL - load average: 8.22, 8.05, 7.15 [06:24:49] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.77, 3.19, 2.96 [06:26:47] PROBLEM - db11 Current Load on db11 is WARNING: WARNING - load average: 7.72, 7.79, 7.16 [06:26:52] PROBLEM - db13 Disk Space on db13 is 
WARNING: DISK WARNING - free space: / 47627 MB (10% inode=98%); [06:30:47] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.57, 3.60, 3.20 [06:34:45] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 1.59, 2.99, 3.09 [06:36:47] PROBLEM - db11 Current Load on db11 is CRITICAL: CRITICAL - load average: 9.41, 7.82, 7.33 [06:40:47] PROBLEM - db11 Current Load on db11 is WARNING: WARNING - load average: 6.72, 7.59, 7.37 [06:42:48] PROBLEM - db11 Current Load on db11 is CRITICAL: CRITICAL - load average: 10.58, 8.59, 7.76 [06:45:43] PROBLEM - db13 Current Load on db13 is WARNING: WARNING - load average: 7.48, 6.27, 5.71 [06:49:42] RECOVERY - db13 Current Load on db13 is OK: OK - load average: 6.67, 6.67, 6.02 [07:04:47] PROBLEM - db11 Current Load on db11 is WARNING: WARNING - load average: 6.73, 7.17, 7.92 [07:10:47] PROBLEM - db11 Current Load on db11 is CRITICAL: CRITICAL - load average: 8.78, 7.15, 7.60 [07:11:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [07:14:47] PROBLEM - db11 Current Load on db11 is WARNING: WARNING - load average: 7.62, 7.61, 7.70 [07:18:48] PROBLEM - db11 Current Load on db11 is CRITICAL: CRITICAL - load average: 8.91, 7.90, 7.77 [07:19:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [07:26:47] PROBLEM - db11 Current Load on db11 is WARNING: WARNING - load average: 7.03, 7.69, 7.83 [07:39:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [07:40:48] PROBLEM - db11 Current Load on db11 is CRITICAL: CRITICAL - load average: 8.50, 7.83, 7.62 [07:42:47] PROBLEM - db11 Current Load on db11 is WARNING: WARNING - load average: 7.46, 7.79, 7.64 [07:44:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [07:48:47] PROBLEM - db11 Current Load on db11 is CRITICAL: 
CRITICAL - load average: 8.63, 7.63, 7.56 [07:50:47] PROBLEM - db11 Current Load on db11 is WARNING: WARNING - load average: 7.01, 7.45, 7.51 [07:58:47] RECOVERY - db11 Current Load on db11 is OK: OK - load average: 5.77, 5.78, 6.65 [07:59:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:12:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:16:29] PROBLEM - db13 APT on db13 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [08:16:41] PROBLEM - db11 Current Load on db11 is WARNING: WARNING - load average: 6.87, 6.86, 6.54 [08:18:26] PROBLEM - db13 Current Load on db13 is CRITICAL: CRITICAL - load average: 15.74, 12.92, 8.00 [08:18:29] RECOVERY - db13 APT on db13 is OK: APT OK: 9 packages available for upgrade (0 critical updates). [08:18:41] RECOVERY - db11 Current Load on db11 is OK: OK - load average: 5.76, 6.25, 6.34 [08:18:52] RECOVERY - db13 Disk Space on db13 is OK: DISK OK - free space: / 51337 MB (11% inode=98%); [08:22:10] [02puppet] 07RhinosF1 commented on pull request 03#2211: check_reverse_dns: use `dns.resolver.Resolver.resolve()` - 13https://git.io/JSfSf [08:22:25] RECOVERY - db13 Current Load on db13 is OK: OK - load average: 2.02, 6.71, 6.54 [08:22:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:26:14] CosmicAlpha: we also should adjust for ipv6 [08:26:19] I've downtimed for now [08:32:07] RhinosF1: yeah, I can try and update PR tomorrow. 
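Note on adjusting the reverse DNS check for IPv6: a PTR lookup for an IPv6 address uses the nibble-reversed name under `ip6.arpa` rather than the octet-reversed name under `in-addr.arpa`. A minimal sketch using only the Python standard library; `ptr_name` is a hypothetical helper, the addresses are documentation examples, and none of this is the actual check script:

```python
import ipaddress


def ptr_name(address: str) -> str:
    """Return the reverse DNS (PTR) owner name for an IPv4 or IPv6 address."""
    # ip_address() returns an IPv4Address or IPv6Address; both expose
    # reverse_pointer, which builds the in-addr.arpa / ip6.arpa name.
    return ipaddress.ip_address(address).reverse_pointer


if __name__ == "__main__":
    # Documentation-range addresses, used here only as examples.
    for addr in ("192.0.2.1", "2001:db8::1"):
        print(addr, "->", ptr_name(addr))
```

Building the name through `ipaddress` keeps a single code path for both address families, which may help while hosts are reachable over IPv6 only.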
[08:43:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:48:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:54:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [08:57:38] !log downtime sslhost on icinga-new for 7 days [08:57:41] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [09:04:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [09:29:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [09:34:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [09:48:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [09:53:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [10:21:14] PROBLEM - db11 APT on db11 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [10:21:17] alerting : [FIRING:1] (!sre MediaWiki Exception Rate yes mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [10:21:27] Urgh [10:21:47] JohnLewis: think db11 down [10:22:56] PROBLEM - db11 Current Load on db11 is CRITICAL: CRITICAL - load average: 7.81, 9.82, 7.07 [10:22:58] db11 is fine [10:23:09] RECOVERY - db11 APT on db11 is OK: APT OK: 9 packages available for upgrade (0 critical updates). 
[10:23:14] JohnLewis: OOM [10:23:31] But its fine now [10:23:50] Ye it came back up [10:24:03] It was down when I pinged [10:24:55] PROBLEM - db11 Current Load on db11 is WARNING: WARNING - load average: 3.15, 7.42, 6.51 [10:26:17] ok : [RESOLVED] (!sre MediaWiki Exception Rate yes mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [10:26:54] RECOVERY - db11 Current Load on db11 is OK: OK - load average: 1.62, 5.46, 5.90 [10:44:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [10:50:07] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.00, 5.16, 4.28 [10:52:36] [02miraheze/puppet] 07JohnFLewis pushed 031 commit to 03master [+3/-0/±2] 13https://git.io/JSJwE [10:52:37] [02miraheze/puppet] 07JohnFLewis 031f85995 - add bast121 + motd [10:54:07] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 4.57, 4.84, 4.35 [10:54:12] !log forcefully change password for BrowserStackAccount via CLI to one from K12 [10:54:17] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [10:54:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [11:11:06] [02miraheze/dns] 07JohnFLewis pushed 031 commit to 03master [+0/-0/±2] 13https://git.io/JSJXz [11:11:08] [02miraheze/dns] 07JohnFLewis 030f7ebdd - add bastion.mh.o [11:12:33] [02miraheze/dns] 07JohnFLewis pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSJ1f [11:12:34] [02miraheze/dns] 07JohnFLewis 0329accf6 - service types [11:12:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [11:22:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [11:32:18] [02puppet] 07RhinosF1 opened pull request 03#2212: rdns: use ipv6 - 13https://git.io/JSJHW [11:32:30] JohnLewis: can you whack merge on ^ [11:33:00] [02puppet] 
07RhinosF1 commented on pull request 03#2211: check_reverse_dns: use `dns.resolver.Resolver.resolve()` - 13https://git.io/JSJHo [11:33:01] [02puppet] 07JohnFLewis closed pull request 03#2212: rdns: use ipv6 - 13https://git.io/JSJHW [11:33:03] [02miraheze/puppet] 07JohnFLewis pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSJHi [11:33:04] [02miraheze/puppet] 07RhinosF1 0304030c6 - rdns: use ipv6 (#2212) [11:33:15] .in 30mins check icinga-new [11:33:15] RhinosF1: Okay, will remind at 2022-01-01 - 12:03:15GMT [12:03:15] RhinosF1: check icinga-new [12:04:32] [02puppet] 07RhinosF1 opened pull request 03#2213: rdns: fix other NS - 13https://git.io/JSJhw [12:04:59] JohnLewis: also ^ [12:17:31] [02puppet] 07JohnFLewis closed pull request 03#2213: rdns: fix other NS - 13https://git.io/JSJhw [12:17:32] [02miraheze/puppet] 07JohnFLewis pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSUJk [12:17:34] [02miraheze/puppet] 07RhinosF1 0397727c6 - rdns: fix other NS (#2213) [12:22:04] .in 30mins check icinga-new [12:22:04] RhinosF1: Okay, will remind at 2022-01-01 - 12:52:04GMT [12:23:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [12:28:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [12:43:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [12:45:31] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.31, 6.67, 5.16 [12:47:31] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.19, 5.97, 5.08 [12:52:06] RhinosF1: check icinga-new [12:52:22] Looks better [12:56:01] [02miraheze/puppet] 07JohnFLewis pushed 032 commits to 03master [+2/-0/±2] 13https://git.io/JSU3C [12:56:03] [02miraheze/puppet] 07JohnFLewis 031937a86 - push squid/http proxy [12:56:04] [02miraheze/puppet] 07JohnFLewis 03b5ed897 - Merge branch 'master' of 
github.com:/miraheze/puppet [12:58:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [13:11:48] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 21.60, 18.29, 16.93 [13:12:07] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 4.94, 5.16, 4.80 [13:12:19] [02miraheze/puppet] 07JohnFLewis pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSUnX [13:12:21] [02miraheze/puppet] 07JohnFLewis 03cd969ef - syntax fixes for squid [13:13:22] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUcv [13:13:23] [02miraheze/puppet] 07paladox 03941b200 - base: Set /etc/gitconfig for the new cluster [13:13:25] [02puppet] 07paladox created branch 03paladox-patch-1 - 13https://git.io/vbiAS [13:13:26] [02puppet] 07paladox opened pull request 03#2214: base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:13:47] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 16.30, 17.28, 16.72 [13:13:55] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUcm [13:13:56] [02miraheze/puppet] 07paladox 03ed08c8c - Update init.pp [13:13:58] [02puppet] 07paladox synchronize pull request 03#2214: base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:14:07] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 4.29, 4.75, 4.69 [13:14:22] paladox: I need to head out for a few hours, are you okay to work with RhinosF1 in merging the mw server PRs for puppet and ensure they work + deploy so then he can take over the rest? 
[13:14:42] syre [13:14:45] *sure [13:15:16] Great, thank you :) [13:15:57] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+1/-0/±0] 13https://git.io/JSUcM [13:15:59] [02miraheze/puppet] 07paladox 03b3d5076 - Create gitconfig [13:16:00] [02puppet] 07paladox synchronize pull request 03#2214: base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:16:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [13:17:40] also fyi for all, if you want your life to be easier - modify any configs for new Infra to use bastion.miraheze.org as a proxy jump - if you set it up to SSH directly to port 22 on each server, it will stop working in the next few weeks [13:18:07] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.22, 5.28, 4.89 [13:19:31] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.67, 3.46, 2.94 [13:20:07] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.51, 5.33, 4.96 [13:21:31] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.14, 3.08, 2.87 [13:21:57] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUWG [13:21:59] [02miraheze/puppet] 07paladox 038045d08 - Update init.pp [13:22:00] [02puppet] 07paladox synchronize pull request 03#2214: base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:22:07] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 4.92, 5.04, 4.89 [13:23:21] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+1/-1/±0] 13https://git.io/JSUW1 [13:23:23] [02miraheze/puppet] 07paladox 037f08649 - Update and rename modules/base/files/git/gitconfig to modules/base/templates/git/gitconfig [13:23:24] [02puppet] 07paladox synchronize pull request 03#2214: base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:24:12] 
[02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUWp [13:24:13] [02miraheze/puppet] 07paladox 037f7b088 - Update gluster101.yaml [13:24:15] [02puppet] 07paladox synchronize pull request 03#2214: base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:24:46] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUlL [13:24:47] [02miraheze/puppet] 07paladox 0327868f1 - Update gluster111.yaml [13:24:49] [02puppet] 07paladox synchronize pull request 03#2214: base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:25:03] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUlZ [13:25:05] [02miraheze/puppet] 07paladox 032821dd9 - Update gluster101.yaml [13:25:06] [02puppet] 07paladox synchronize pull request 03#2214: base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:25:15] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUll [13:25:17] [02miraheze/puppet] 07paladox 033da4def - Update gluster121.yaml [13:25:18] [02puppet] 07paladox synchronize pull request 03#2214: base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:25:36] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUlu [13:25:38] [02miraheze/puppet] 07paladox 0388cedf3 - Update db101.yaml [13:25:39] [02puppet] 07paladox synchronize pull request 03#2214: base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:25:58] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUlr [13:26:00] [02miraheze/puppet] 07paladox 0362e4f64 - Update mon111.yaml [13:26:01] [02puppet] 07paladox synchronize pull request 03#2214: base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:26:21] [02puppet] 07paladox closed pull request 03#2214: 
base: Set /etc/gitconfig for the new cluster - 13https://git.io/JSUcf [13:26:23] [02puppet] 07paladox deleted branch 03paladox-patch-1 - 13https://git.io/vbiAS [13:26:24] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-1 [13:26:26] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+1/-0/±6] 13https://git.io/JSUlD [13:26:27] [02miraheze/puppet] 07paladox 03d61049f - base: Set /etc/gitconfig for the new cluster (#2214) [13:26:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [13:27:41] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSU8L [13:27:43] [02miraheze/puppet] 07paladox 03807b379 - Update puppet111.yaml [13:28:12] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+1/-1/±0] 13https://git.io/JSU8c [13:28:13] [02miraheze/puppet] 07paladox 0384dd7d4 - Rename gitconfig to gitconfig.erb [13:28:30] RhinosF1: are you around? [13:31:09] JohnLewis: because of the firewall on db11-13, i'm not sure it'll install correctly [13:31:28] (on MW*) [13:32:37] why do they need db11-13 to install mediawiki? [13:33:53] Puppet will fail i think or maybe it was changed the last time i did it. 
[13:34:43] You don’t need db* to run the git pulls in puppet though [13:35:24] Anyway, whether it fails or not, it just needs to be setup so it runs [13:35:28] oh [13:35:30] ok [13:36:11] You’re thinking of the deploy tool which checks if what has been deployed works, which is entirely independent of puppet [13:36:33] [02puppet] 07paladox synchronize pull request 03#2177: mediawiki: add new servers - 13https://git.io/JyiSs [13:36:41] [02puppet] 07paladox synchronize pull request 03#2177: mediawiki: add new servers - 13https://git.io/JyiSs [13:36:55] [02puppet] 07paladox synchronize pull request 03#2177: mediawiki: add new servers - 13https://git.io/JyiSs [13:37:08] [02puppet] 07paladox synchronize pull request 03#2177: mediawiki: add new servers - 13https://git.io/JyiSs [13:37:20] [02puppet] 07paladox synchronize pull request 03#2177: mediawiki: add new servers - 13https://git.io/JyiSs [13:37:32] [02puppet] 07paladox synchronize pull request 03#2177: mediawiki: add new servers - 13https://git.io/JyiSs [13:37:51] [02puppet] 07paladox synchronize pull request 03#2177: mediawiki: add new servers - 13https://git.io/JyiSs [13:38:04] [02puppet] 07paladox synchronize pull request 03#2177: mediawiki: add new servers - 13https://git.io/JyiSs [13:40:57] [02puppet] 07paladox synchronize pull request 03#2177: mediawiki: add new servers - 13https://git.io/JyiSs [13:41:35] [02puppet] 07paladox closed pull request 03#2177: mediawiki: add new servers - 13https://git.io/JyiSs [13:41:36] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+8/-0/±2] 13https://git.io/JSU0f [13:41:37] lets begin! 
[13:41:38] [02miraheze/puppet] 07RhinosF1 03a1eb2cc - mediawiki: add new servers (#2177) [13:41:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [13:42:52] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.54, 5.87, 5.15 [13:44:29] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+1/-1/±0] 13https://git.io/JSU0d [13:44:30] [02miraheze/puppet] 07paladox 03c741048 - It's mwtask111 not mwtask101 [13:44:47] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.10, 5.55, 5.12 [13:47:26] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+0/-0/±1] 13https://git.io/JSUE6 [13:47:27] [02miraheze/puppet] 07paladox 0387f16bd - mwtask111 not mwtask101 part 2 [13:49:31] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.42, 19.37, 17.97 [13:51:30] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 20.11, 19.39, 18.13 [13:52:26] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.80, 5.38, 5.11 [13:53:18] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.78, 3.32, 2.97 [13:54:21] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 4.77, 5.33, 5.14 [13:55:17] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 3.28, 3.18, 2.96 [13:56:16] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.59, 4.78, 4.97 [13:59:59] paladox: don't worry about deploy tool errors [14:00:07] ok [14:00:11] As long as our team have access, I'm not bothered [14:00:45] paladox: do task last though [14:00:53] ok [14:01:03] * RhinosF1 is with grandma at moment [14:01:05] i've already got mwtask111 puppet running [14:01:19] Notice: /Stage[main]/Gluster::Apt/Exec[apt_update_gluster]/returns: W: Failed to fetch 
https://download.gluster.org/pub/gluster/glusterfs/9/LATEST/Debian/bullseye/amd64/apt/dists/bullseye/InRelease Could not resolve 'download.gluster.org' [14:01:26] i need to put that behind a proxy [14:03:19] paladox: ok, if mw* don't exist then it will 100% fail but it should skip deploy tool next run [14:03:27] ok [14:03:38] Ignore anything deploy related [14:05:43] paladox: please do memcache + jobchron too. I think John changed IPs but if you get time the stunnel/Prometheus/varnish PRs should be safe to merge once the updated IPs are there [14:11:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [14:15:27] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUrY [14:15:28] [02miraheze/puppet] 07paladox 03a6ef90f - gluster: Proxy download.gluster.org if http_proxy is set [14:15:30] [02puppet] 07paladox created branch 03paladox-patch-1 - 13https://git.io/vbiAS [14:15:31] [02puppet] 07paladox opened pull request 03#2215: gluster: Proxy download.gluster.org if http_proxy is set - 13https://git.io/JSUrO [14:17:03] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+1/-0/±0] 13https://git.io/JSUr1 [14:17:05] [02miraheze/puppet] 07paladox 038f4bff6 - Create 01gluster.erb [14:17:06] [02puppet] 07paladox synchronize pull request 03#2215: gluster: Proxy download.gluster.org if http_proxy is set - 13https://git.io/JSUrO [14:17:18] [02miraheze/puppet] 07paladox pushed 031 commit to 03master [+1/-0/±1] 13https://git.io/JSUrS [14:17:19] [02miraheze/puppet] 07paladox 038bb28e4 - gluster: Proxy download.gluster.org if http_proxy is set (#2215) [14:17:21] [02puppet] 07paladox deleted branch 03paladox-patch-1 - 13https://git.io/vbiAS [14:17:22] [02puppet] 07paladox closed pull request 03#2215: gluster: Proxy download.gluster.org if http_proxy is set - 13https://git.io/JSUrO [14:17:24] [02miraheze/puppet] 07paladox deleted branch 03paladox-patch-1 [14:18:02] 
Notice: /Stage[main]/Mediawiki/Exec[MediaWiki Config Sync]/returns: (UNKNOWN) [51.195.236.249] 5071 (?) : Network is unreachable [14:18:07] RhinosF1: ^ [14:18:30] it's trying to connect to mon2 [14:18:52] paladox: why [14:19:09] looks like your script? [14:19:32] https://www.irccloud.com/pastebin/7BaSo0K9/ [14:21:04] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUoF [14:21:05] [02miraheze/puppet] 07paladox 036f9d819 - Set gluster::only_ipv6: true for the new mw cluster [14:21:07] [02puppet] 07paladox created branch 03paladox-patch-1 - 13https://git.io/vbiAS [14:21:08] [02puppet] 07paladox opened pull request 03#2216: Set gluster::only_ipv6: true for the new mw cluster - 13https://git.io/JSUob [14:21:19] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUoh [14:21:21] [02miraheze/puppet] 07paladox 039f8788d - Update mw122.yaml [14:21:22] [02puppet] 07paladox synchronize pull request 03#2216: Set gluster::only_ipv6: true for the new mw cluster - 13https://git.io/JSUob [14:21:30] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUKf [14:21:31] [02miraheze/puppet] 07paladox 03be11898 - Update mw121.yaml [14:21:33] [02puppet] 07paladox synchronize pull request 03#2216: Set gluster::only_ipv6: true for the new mw cluster - 13https://git.io/JSUob [14:21:39] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUKU [14:21:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [14:21:40] [02miraheze/puppet] 07paladox 03ef66c63 - Update mw112.yaml [14:21:42] [02puppet] 07paladox synchronize pull request 03#2216: Set gluster::only_ipv6: true for the new mw cluster - 13https://git.io/JSUob [14:21:50] [02miraheze/puppet] 07paladox pushed 031 commit to 03paladox-patch-1 [+0/-0/±1] 13https://git.io/JSUKL [14:21:51] 
[miraheze/puppet] paladox 2aa3cdc - Update mw111.yaml [14:21:53] [puppet] paladox synchronize pull request #2216: Set gluster::only_ipv6: true for the new mw cluster - https://git.io/JSUob [14:22:01] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUKO [14:22:02] [miraheze/puppet] paladox 55469c4 - Update mw102.yaml [14:22:04] [puppet] paladox synchronize pull request #2216: Set gluster::only_ipv6: true for the new mw cluster - https://git.io/JSUob [14:22:07] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.88, 5.08, 4.65 [14:22:09] paladox: it shouldn't connect to monitoring [14:22:10] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUKG [14:22:12] [miraheze/puppet] paladox 4e9f2b6 - Update mwtask111.yaml [14:22:13] [puppet] paladox synchronize pull request #2216: Set gluster::only_ipv6: true for the new mw cluster - https://git.io/JSUob [14:22:21] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUKW [14:22:22] [miraheze/puppet] paladox fdf628b - Update test111.yaml [14:22:24] [puppet] paladox synchronize pull request #2216: Set gluster::only_ipv6: true for the new mw cluster - https://git.io/JSUob [14:22:33] [puppet] paladox closed pull request #2216: Set gluster::only_ipv6: true for the new mw cluster - https://git.io/JSUob [14:22:35] [puppet] paladox deleted branch paladox-patch-1 - https://git.io/vbiAS [14:22:36] The sudoers error is an issue though [14:22:36] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±8] https://git.io/JSUKB [14:22:37] [miraheze/puppet] paladox 6c47466 - Set gluster::only_ipv6: true for the new mw cluster (#2216) [14:22:39] [miraheze/puppet] paladox deleted branch paladox-patch-1 [14:23:11] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load
average: 3.77, 3.31, 2.96 [14:23:40] RhinosF1: https://github.com/miraheze/puppet/blob/3b0191964ddb0b311491911fa73c3d514d24dffb/modules/base/files/logsalmsg#L10 [14:23:41] [url] puppet/logsalmsg at 3b0191964ddb0b311491911fa73c3d514d24dffb · miraheze/puppet · GitHub | github.com [14:24:07] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 4.29, 4.92, 4.65 [14:24:27] RhinosF1: can we use the hostname? [14:25:12] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.00, 2.93, 2.87 [14:26:58] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU6M [14:27:00] [miraheze/puppet] paladox cea7cae - Use new gluster server on new mw cluster [14:27:01] [puppet] paladox created branch paladox-patch-1 - https://git.io/vbiAS [14:27:03] [puppet] paladox opened pull request #2217: Use new gluster server on new mw cluster - https://git.io/JSU6S [14:27:20] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU6d [14:27:21] [miraheze/puppet] paladox 0b8e0f5 - Update mw102.yaml [14:27:22] [puppet] paladox synchronize pull request #2217: Use new gluster server on new mw cluster - https://git.io/JSU6S [14:27:29] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU6N [14:27:30] [miraheze/puppet] paladox 35cd38d - Update mw111.yaml [14:27:32] [puppet] paladox synchronize pull request #2217: Use new gluster server on new mw cluster - https://git.io/JSU6S [14:27:41] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU6j [14:27:43] [miraheze/puppet] paladox 12cdea3 - Update mw112.yaml [14:27:44] [puppet] paladox synchronize pull request #2217: Use new gluster server on new mw cluster - https://git.io/JSU6S [14:27:55] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1]
https://git.io/JSUiJ [14:27:57] [miraheze/puppet] paladox 510364d - Update mw121.yaml [14:27:58] [puppet] paladox synchronize pull request #2217: Use new gluster server on new mw cluster - https://git.io/JSU6S [14:28:10] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUit [14:28:12] [miraheze/puppet] paladox 4a52d67 - Update mw122.yaml [14:28:13] [puppet] paladox synchronize pull request #2217: Use new gluster server on new mw cluster - https://git.io/JSU6S [14:28:25] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUiO [14:28:26] [miraheze/puppet] paladox a13014d - Update mwtask111.yaml [14:28:28] [puppet] paladox synchronize pull request #2217: Use new gluster server on new mw cluster - https://git.io/JSU6S [14:28:35] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUiG [14:28:37] [miraheze/puppet] paladox 37095fd - Update test111.yaml [14:28:38] [puppet] paladox synchronize pull request #2217: Use new gluster server on new mw cluster - https://git.io/JSU6S [14:28:42] [puppet] paladox closed pull request #2217: Use new gluster server on new mw cluster - https://git.io/JSU6S [14:28:44] [puppet] paladox deleted branch paladox-patch-1 - https://git.io/vbiAS [14:28:45] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±8] https://git.io/JSUic [14:28:47] [miraheze/puppet] paladox 0a3d365 - Use new gluster server on new mw cluster (#2217) [14:28:48] [miraheze/puppet] paladox deleted branch paladox-patch-1 [14:31:01] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUiN [14:31:03] [miraheze/puppet] paladox 94e370d - Support mon111 in logsalmsg [14:31:04] [puppet] paladox created branch paladox-patch-1 - https://git.io/vbiAS [14:31:06] [puppet] paladox
opened pull request #2218: Support mon111 in logsalmsg - https://git.io/JSUiA [14:31:10] RhinosF1: look good to you ^? [14:31:31] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSUPJ [14:31:33] [miraheze/puppet] paladox efcb7b3 - Fix path [14:31:34] [puppet] paladox created branch paladox-patch-4 - https://git.io/vbiAS [14:31:36] [puppet] paladox opened pull request #2219: Fix path - https://git.io/JSUPU [14:31:50] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSUPI [14:31:52] [miraheze/puppet] paladox 82d3f7d - Update mw102.yaml [14:31:53] [puppet] paladox synchronize pull request #2219: Fix path - https://git.io/JSUPU [14:32:03] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSUPY [14:32:05] [miraheze/puppet] paladox 9aff7db - Update mw111.yaml [14:32:06] [puppet] paladox synchronize pull request #2219: Fix path - https://git.io/JSUPU [14:32:13] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSUP3 [14:32:15] [miraheze/puppet] paladox 701c49b - Update mw112.yaml [14:32:16] [puppet] paladox synchronize pull request #2219: Fix path - https://git.io/JSUPU [14:32:31] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSUPc [14:32:32] [miraheze/puppet] paladox 34c05db - Update mw121.yaml [14:32:34] [puppet] paladox synchronize pull request #2219: Fix path - https://git.io/JSUPU [14:32:40] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSUPl [14:32:41] [miraheze/puppet] paladox 8b4f6bf - Update mw122.yaml [14:32:43] [puppet] paladox synchronize pull request #2219: Fix path - https://git.io/JSUPU [14:32:54] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1]
https://git.io/JSUPR [14:32:55] [miraheze/puppet] paladox a765fa3 - Update test111.yaml [14:32:57] [puppet] paladox synchronize pull request #2219: Fix path - https://git.io/JSUPU [14:33:05] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSUPz [14:33:07] [miraheze/puppet] paladox d095235 - Update mwtask111.yaml [14:33:08] [puppet] paladox synchronize pull request #2219: Fix path - https://git.io/JSUPU [14:33:15] [puppet] paladox closed pull request #2219: Fix path - https://git.io/JSUPU [14:33:16] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±8] https://git.io/JSUPV [14:33:18] [miraheze/puppet] paladox 891246f - Fix path (#2219) [14:33:19] [puppet] paladox deleted branch paladox-patch-4 - https://git.io/vbiAS [14:33:21] [miraheze/puppet] paladox deleted branch paladox-patch-4 [14:34:00] [puppet] JohnFLewis reviewed pull request #2218 commit - https://git.io/JSUPy [14:34:28] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSUP5 [14:34:29] [miraheze/puppet] paladox 87fab4a - gluster: Make sure proxy file is installed before installing apt repo [14:35:24] [puppet] paladox reviewed pull request #2218 commit - https://git.io/JSUXT [14:35:56] [puppet] paladox closed pull request #2218: Support mon111 in logsalmsg - https://git.io/JSUiA [14:36:02] [puppet] paladox deleted branch paladox-patch-1 - https://git.io/vbiAS [14:36:04] [miraheze/puppet] paladox deleted branch paladox-patch-1 [14:36:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [14:38:13] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUXx [14:38:15] [miraheze/puppet] paladox adf4278 - Support changing logsalmsg ip per host [14:38:16] [puppet] paladox created branch
paladox-patch-1 - https://git.io/vbiAS [14:38:18] [puppet] paladox opened pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:38:46] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+1/-1/±0] https://git.io/JSU1t [14:38:47] [miraheze/puppet] paladox 70a9884 - Update and rename modules/base/files/logsalmsg to modules/base/templates/logsalmsg.erb [14:38:49] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:39:49] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU1z [14:39:51] [miraheze/puppet] paladox 79d979a - Update mw101.yaml [14:39:52] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:39:59] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU1w [14:40:00] [miraheze/puppet] paladox b5f26ab - Update mw102.yaml [14:40:02] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:40:09] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU1i [14:40:11] [miraheze/puppet] paladox 1ebf18e - Update mw111.yaml [14:40:12] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:40:20] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU1M [14:40:22] [miraheze/puppet] paladox 3d8559f - Update mw121.yaml [14:40:24] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:40:30] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU1S [14:40:31] [miraheze/puppet] paladox 2210de7 - Update mw112.yaml
[14:40:33] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:40:39] Paladox: Im wondering if we can name it better [14:40:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [14:40:40] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU17 [14:40:42] [miraheze/puppet] paladox 17c769a - Update mw122.yaml [14:40:43] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:41:08] JohnLewis: hmm, maybe. Though i haven't come up with any other names apart from logsalmsg_ip [14:41:28] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUMk [14:41:29] [miraheze/puppet] paladox 7489fea - Update mwtask111.yaml [14:41:31] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:41:31] It’s the IP of the monitoring host, is there a way we can just get it automatically? [14:41:38] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUMm [14:41:39] [miraheze/puppet] paladox b11dbd4 - Update test111.yaml [14:41:41] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:41:54] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUMn [14:41:55] [miraheze/puppet] paladox 068f1c8 - Update mon111.yaml [14:41:57] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:42:47] JohnLewis: Maybe hostname?
[14:42:52] Worst case, we could just do an ip6 resolve on the hostname for the monitoring server [14:43:21] paladox: thanks [14:44:04] !log [paladox@test3] [14:44:08] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [14:44:34] !log [paladox@test3] [14:44:37] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [14:44:39] hostname works [14:44:55] JohnLewis: ^? So should i just create a hiera value of "use_new_mon"? [14:45:25] Can do [14:45:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [14:45:40] We’re going to have fun cleaning up puppet after all this migration though [14:46:20] A lot of fun [14:46:21] :P [14:49:00] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUy0 [14:49:02] [miraheze/puppet] paladox 7f53ef3 - Update logsalmsg.erb [14:49:03] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:49:41] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUy1 [14:49:42] [miraheze/puppet] paladox 4f6ac42 - Update init.pp [14:49:44] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:49:59] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUyH [14:50:01] [miraheze/puppet] paladox a75dc22 - Update mon111.yaml [14:50:02] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:50:08] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUyF [14:50:10] [miraheze/puppet] paladox 52ca974 - Update mw101.yaml [14:50:11] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:50:17]
[miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUyA [14:50:19] [miraheze/puppet] paladox 7da33ce - Update mw102.yaml [14:50:20] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:50:28] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUSe [14:50:29] [miraheze/puppet] paladox 07604e5 - Update mw111.yaml [14:50:31] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:53:37] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUSp [14:53:38] [miraheze/puppet] paladox 82a98b4 - Update mw112.yaml [14:53:40] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:53:48] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU9f [14:53:50] [miraheze/puppet] paladox 62fadc7 - Update mw121.yaml [14:53:51] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:53:58] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU9I [14:53:59] [miraheze/puppet] paladox 3df801d - Update mw122.yaml [14:54:01] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:54:13] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU9m [14:54:15] [miraheze/puppet] paladox 83113da - Update mwtask111.yaml [14:54:16] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:54:26] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSU9G
[14:54:27] [miraheze/puppet] paladox 953fb35 - Update test111.yaml [14:54:29] [puppet] paladox synchronize pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:55:18] [puppet] paladox closed pull request #2220: Support changing logsalmsg ip per host - https://git.io/JSUXp [14:55:20] [miraheze/puppet] paladox pushed 1 commit to master [+1/-1/±10] https://git.io/JSU9w [14:55:21] [miraheze/puppet] paladox a08921b - Support changing logsalmsg ip per host (#2220) [14:55:23] [puppet] paladox deleted branch paladox-patch-1 - https://git.io/vbiAS [14:55:24] [miraheze/puppet] paladox deleted branch paladox-patch-1 [15:00:21] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSUQm [15:00:23] [miraheze/puppet] paladox 107f7c8 - mediawiki::extensionsetup: Make sure nodejs is installed [15:05:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [15:12:58] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.50, 3.70, 3.09 [15:13:18] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSUFZ [15:13:19] [miraheze/puppet] paladox d8faa70 - Force registry.npmjs.org to use ipv6 [15:13:21] [puppet] paladox created branch paladox-patch-1 - https://git.io/vbiAS [15:13:22] [puppet] paladox opened pull request #2221: Force registry.npmjs.org to use ipv6 - https://git.io/JSUFn [15:14:11] JohnLewis: ^ [15:14:58] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.25, 3.15, 2.96 [15:18:07] nodejs 17 changes its default dns lookup [15:18:12] (from ipv4 to ipv6) [15:18:31] [puppet] paladox closed pull request #2221: Force registry.npmjs.org to use ipv6 - https://git.io/JSUFn [15:18:33] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSUbj [15:18:34]
[miraheze/puppet] paladox 140514c - Force registry.npmjs.org to use ipv6 (#2221) [15:18:36] [miraheze/puppet] paladox deleted branch paladox-patch-1 [15:18:37] [puppet] paladox deleted branch paladox-patch-1 - https://git.io/vbiAS [15:18:58] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.74, 5.50, 4.95 [15:20:25] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:20:52] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.66, 5.83, 5.13 [15:21:00] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:21:55] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:22:05] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::1b80/cpweb, 2607:5300:201:3100::5ebc/cpweb [15:22:47] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.69, 5.76, 5.18 [15:22:53] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [15:23:29] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:23:30] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:24:43] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.71, 4.95, 4.95 [15:25:03] PROBLEM - cp30 Stunnel Http for mw11 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds.
[15:25:05] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.007 second response time [15:25:16] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:25:20] PROBLEM - mw13 MediaWiki Rendering on mw13 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:25:27] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.657 second response time [15:25:33] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 4.471 second response time [15:25:40] PROBLEM - cp31 Stunnel Http for mw8 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:25:42] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:25:45] PROBLEM - mw8 MediaWiki Rendering on mw8 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:25:51] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:25:59] PROBLEM - cp21 Stunnel Http for mw8 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:26:06] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:26:09] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 6.016 second response time [15:26:10] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.259 second response time [15:26:26] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:26:27] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[15:26:38] PROBLEM - cp30 Stunnel Http for mw8 on cp30 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 328 bytes in 0.246 second response time [15:27:04] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.007 second response time [15:27:16] PROBLEM - cp20 Stunnel Http for mw8 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:27:16] RhinosF1: mw101 and mwtask111 up [15:27:19] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:27:37] but puppet failing on mwtask111 because it's trying to install a npm package and using ipv4 which it carn't... [15:28:03] RECOVERY - cp21 Stunnel Http for mw8 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 5.453 second response time [15:28:11] PROBLEM - cp20 Varnish Backends on cp20 is CRITICAL: 5 backends are down. mw8 mw9 mw11 mw12 mw13 [15:28:33] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 6.426 second response time [15:28:38] RECOVERY - cp30 Stunnel Http for mw8 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 1.627 second response time [15:28:38] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.006 second response time [15:28:54] RECOVERY - cp30 Stunnel Http for mw11 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.303 second response time [15:29:04] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.016 second response time [15:29:15] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 0.508 second response time [15:29:18] RECOVERY - cp20 Stunnel Http for mw8 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 7.637 second response time [15:29:18] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 
seconds. [15:29:19] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 0.469 second response time [15:29:20] RECOVERY - mw13 MediaWiki Rendering on mw13 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 0.679 second response time [15:29:46] RECOVERY - mw8 MediaWiki Rendering on mw8 is OK: HTTP OK: HTTP/1.1 200 OK - 20514 bytes in 0.252 second response time [15:29:49] RECOVERY - cp31 Stunnel Http for mw8 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14554 bytes in 0.307 second response time [15:29:50] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 20524 bytes in 0.625 second response time [15:29:50] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.012 second response time [15:30:08] paladox: ok [15:30:10] RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.321 second response time [15:30:11] RECOVERY - cp20 Varnish Backends on cp20 is OK: All 9 backends are healthy [15:30:12] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.031 second response time [15:30:25] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.392 second response time [15:30:44] paladox: can you also do mem/jobchron [15:30:52] And there is the stunnel PRs [15:31:24] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 7.869 second response time [15:31:29] icinga-new won't resolve on mobile data [15:32:05] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [15:32:53] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [15:34:22] hmm github over ssh won't work then [15:35:35] !sre I'm getting persistent 502s and now 503s on `metawiki` for a continuous, uninterrupted past 10 minutes [15:35:36] PROBLEM - mw10 MediaWiki Rendering on mw10 is CRITICAL: CRITICAL - 
Socket timeout after 10 seconds [15:35:53] PROBLEM - mw8 MediaWiki Rendering on mw8 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 328 bytes in 0.041 second response time [15:35:56] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:35:56] Same here. [15:35:57] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.45, 6.70, 5.51 [15:36:05] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:36:05] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [15:36:08] PROBLEM - cp31 Stunnel Http for mw8 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:36:11] PROBLEM - cp20 Varnish Backends on cp20 is CRITICAL: 3 backends are down. mw9 mw10 mw13 [15:36:11] And some pages just won't load properly. [15:36:17] yep [15:36:23] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.01, 5.94, 5.31 [15:36:25] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:36:27] I can't even get them to load at all though :P [15:36:38] RhinosF1: you'll need to pre-install the npm package me thinks. [15:36:48] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:36:48] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[15:36:52] paladox: ok [15:36:53] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [15:36:56] PROBLEM - cp21 Stunnel Http for mw13 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:36:57] dmehus: I can see [15:36:57] also... i think that we're going to have to use gerrithub for the ssl certs [15:36:59] Something obviously needs to be fixed as soon as possible. [15:37:05] PROBLEM - cp30 Stunnel Http for mw11 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:37:06] PROBLEM - cp21 Stunnel Http for mw11 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:37:09] RhinosF1: okay, that's good [15:37:11] PROBLEM - cp31 Varnish Backends on cp31 is CRITICAL: 3 backends are down. mw8 mw12 mw13 [15:37:14] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:37:15] paladox: can you restart php-fpm [15:37:15] PROBLEM - cp31 Stunnel Http for mw11 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:37:19] ok [15:37:22] PROBLEM - cp30 Varnish Backends on cp30 is CRITICAL: 2 backends are down. mw8 mw11 [15:37:22] Everywhere [15:37:27] dmehus: I'm mobile [15:37:35] PROBLEM - cp21 Varnish Backends on cp21 is CRITICAL: 1 backends are down. 
mw12 [15:37:53] RECOVERY - mw8 MediaWiki Rendering on mw8 is OK: HTTP OK: HTTP/1.1 200 OK - 20514 bytes in 4.710 second response time [15:37:57] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 6.70, 6.22, 5.44 [15:37:59] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 8.511 second response time [15:38:07] RhinosF1, okay, I gather you won't be able to pre-install the npm package either then? [15:38:08] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 20524 bytes in 4.162 second response time [15:38:11] !log restart php7.3/7.4 on mw* [15:38:11] RECOVERY - cp20 Varnish Backends on cp20 is OK: All 9 backends are healthy [15:38:14] RECOVERY - cp31 Stunnel Http for mw8 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 6.196 second response time [15:38:17] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.49, 5.88, 5.37 [15:38:23] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 0.344 second response time [15:38:40] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:38:45] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.700 second response time [15:38:48] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.297 second response time [15:38:53] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [15:38:53] dmehus: thats new infra work [15:38:54] RECOVERY - cp21 Stunnel Http for mw13 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.009 second response time [15:39:01] RECOVERY - cp30 Stunnel Http for mw11 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.346 second response time [15:39:04] RECOVERY - cp21 Stunnel Http for mw11 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.071 second response time [15:39:07] RECOVERY - cp31 Varnish Backends on cp31 
is OK: All 9 backends are healthy [15:39:09] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 0.414 second response time [15:39:11] paladox is imaging the servers and attempting to get puppet running [15:39:13] RECOVERY - cp31 Stunnel Http for mw11 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.446 second response time [15:39:20] RECOVERY - cp30 Varnish Backends on cp30 is OK: All 9 backends are healthy [15:39:26] i'll do the other mw* then do mem* [15:39:33] RhinosF1, yeah, unrelated to my report, just making a comment that if you're mobile and unable to attend to one thing, you won't be able to attend to the other [15:39:35] RECOVERY - cp21 Varnish Backends on cp21 is OK: All 9 backends are healthy [15:39:45] RECOVERY - mw10 MediaWiki Rendering on mw10 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 0.522 second response time [15:39:58] dmehus: yes but drive by comments are pointless [15:40:02] paladox: thanks! [15:40:05] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [15:40:28] RhinosF1: strange that you can’t load icinga-new.miraheze.org (it loads fine for me) [15:40:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [15:40:41] MacFan4000: dns [15:41:13] RhinosF1, hrm? isn't that a bit of the pot calling the kettle black? :) [15:41:18] MacFan4000: I changed my dns servers to cloudflare from o2 [15:41:27] all the misc services are going behind cache proxies [15:41:34] at some point [15:41:36] Ah [15:41:42] it's ipv6 only [15:47:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [15:49:04] PROBLEM - cp20 Stunnel Http for mw8 on cp20 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 328 bytes in 0.027 second response time [15:49:11] PROBLEM - cp30 Varnish Backends on cp30 is CRITICAL: 5 backends are down. 
mw8 mw9 mw11 mw12 mw13 [15:49:16] PROBLEM - cp30 Stunnel Http for mw8 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:49:28] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: HTTP CRITICAL - No data received from host [15:49:29] PROBLEM - cp21 Stunnel Http for mw8 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:49:45] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.07, 6.40, 5.67 [15:49:49] paladox: RhinosF1 503s are here [15:49:52] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:49:53] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:49:56] PROBLEM - mw8 MediaWiki Rendering on mw8 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:49:59] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:50:06] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.88, 6.20, 5.86 [15:50:07] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [15:50:09] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.016 second response time [15:50:23] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:50:33] PROBLEM - cp31 Stunnel Http for mw8 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:50:34] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[15:50:50] PROBLEM - cp31 Stunnel Http for mw10 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:50:53] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [15:50:59] RECOVERY - cp20 Stunnel Http for mw8 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14554 bytes in 0.006 second response time [15:51:14] PROBLEM - mw11 MediaWiki Rendering on mw11 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [15:51:15] RECOVERY - cp30 Stunnel Http for mw8 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.321 second response time [15:51:19] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.81, 6.48, 5.39 [15:51:29] RECOVERY - cp21 Stunnel Http for mw8 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14554 bytes in 0.094 second response time [15:51:39] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 4.10, 5.57, 5.46 [15:51:53] RECOVERY - mw8 MediaWiki Rendering on mw8 is OK: HTTP OK: HTTP/1.1 200 OK - 20514 bytes in 2.009 second response time [15:52:02] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.09, 6.63, 6.04 [15:52:06] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [15:52:07] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.309 second response time [15:52:23] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.009 second response time [15:52:29] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.308 second response time [15:52:33] RECOVERY - cp31 Stunnel Http for mw8 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.340 second response time [15:52:50] 
RECOVERY - cp31 Stunnel Http for mw10 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.364 second response time [15:52:53] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [15:53:06] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [15:53:08] RECOVERY - cp30 Varnish Backends on cp30 is OK: All 9 backends are healthy [15:53:08] RECOVERY - mw11 MediaWiki Rendering on mw11 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 0.456 second response time [15:53:19] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.05, 5.75, 5.25 [15:54:02] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.78, 6.90, 6.21 [15:54:43] [miraheze/puppet] paladox pushed 1 commit to master [+1/-1/±0] https://git.io/JSTk2 [15:54:43] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.75, 3.18, 2.69 [15:54:45] [miraheze/puppet] paladox d95fdd3 - test101 not test111 [15:54:56] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSTkK [15:54:57] [miraheze/puppet] paladox a48d2bf - test101 not test111 [15:55:05] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.088 second response time [15:55:38] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.307 second response time [15:56:02] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 4.31, 6.02, 5.97 [15:56:05] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.016 second response time [15:56:12] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 0.230 second response time [15:56:15] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.335 second response time [15:56:42] RECOVERY - mon2 Current Load 
on mon2 is OK: OK - load average: 3.09, 2.89, 2.64 [15:57:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [15:58:29] think the 503s are because of high i/o on gluster [15:59:40] moving to the new data centre will hopefully fix a lot of issues as mw*are on ssd disks [16:01:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [16:02:40] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.82, 3.11, 2.77 [16:06:38] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 3.11, 3.26, 2.92 [16:11:41] paladox, ty for the update :) [16:14:02] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.64, 7.13, 6.31 [16:16:02] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.47, 6.81, 6.30 [16:18:02] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 4.92, 6.14, 6.12 [16:20:18] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.67, 5.57, 4.77 [16:20:50] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [16:20:53] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb [16:22:13] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.36, 4.90, 4.63 [16:22:49] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [16:22:53] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [16:24:31] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.49, 3.27, 2.97 [16:26:31] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 1.84, 2.90, 2.88 [16:31:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) 
https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [16:32:37] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±8] https://git.io/JSTCe [16:32:39] [miraheze/puppet] paladox 657fa1b - Revert "Fix path (#2219)" [16:42:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [16:43:23] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.46, 3.66, 3.34 [16:45:22] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 3.17, 3.34, 3.25 [16:47:00] RhinosF1: all mw* up now. [16:47:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [16:48:07] test101 is up [16:50:25] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTRo [16:50:26] [miraheze/puppet] paladox 8c9ec6c - Install mem101 and 121 [16:50:28] [puppet] paladox created branch paladox-patch-1 - https://git.io/vbiAS [16:50:29] [puppet] paladox opened pull request #2222: Install mem101 and 121 - https://git.io/JSTRK [16:50:37] paladox: ok cool [16:50:46] paladox: there should already be a PR [16:50:52] Bundled with jobchron [16:50:56] Feel free to close [16:50:59] Oh [16:51:43] [puppet] paladox synchronize pull request #2179: site: add memcache + jobchron in new DC - https://git.io/JyXIS [16:51:51] [puppet] paladox synchronize pull request #2179: site: add memcache + jobchron in new DC - https://git.io/JyXIS [16:51:58] [puppet] paladox synchronize pull request #2179: site: add memcache + jobchron in new DC - https://git.io/JyXIS [16:52:09] [puppet] paladox closed pull request #2179: site: add memcache + jobchron in new DC - https://git.io/JyXIS [16:52:10] [miraheze/puppet] paladox pushed 1 commit to master [+3/-0/±1] https://git.io/JST03 [16:52:12] [miraheze/puppet] RhinosF1 cace148 - site: add memcache + jobchron in new DC (#2179) 
[16:52:16] [puppet] paladox closed pull request #2222: Install mem101 and 121 - https://git.io/JSTRK [16:52:20] [miraheze/puppet] paladox deleted branch paladox-patch-1 [16:52:22] [puppet] paladox deleted branch paladox-patch-1 - https://git.io/vbiAS [16:53:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [16:54:09] RhinosF1: mem setup now [16:54:24] paladox: thanks [16:56:42] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTuv [16:56:43] [miraheze/puppet] paladox 2d33348 - grafana: allow setting ldap server [16:56:44] [puppet] paladox created branch paladox-patch-1 - https://git.io/vbiAS [16:56:46] [puppet] paladox opened pull request #2223: grafana: allow setting ldap server - https://git.io/JSTuJ [16:57:22] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTuW [16:57:24] [miraheze/puppet] paladox 4d0056a - Update init.pp [16:57:25] [puppet] paladox synchronize pull request #2223: grafana: allow setting ldap server - https://git.io/JSTuJ [16:57:37] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTu0 [16:57:38] [miraheze/puppet] paladox 3f7ae9d - Update mon111.yaml [16:57:40] [puppet] paladox synchronize pull request #2223: grafana: allow setting ldap server - https://git.io/JSTuJ [16:57:45] [puppet] paladox closed pull request #2223: grafana: allow setting ldap server - https://git.io/JSTuJ [16:57:47] [miraheze/puppet] paladox deleted branch paladox-patch-1 [16:57:48] [puppet] paladox deleted branch paladox-patch-1 - https://git.io/vbiAS [16:57:50] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±3] https://git.io/JSTuz [16:57:52] [miraheze/puppet] paladox c181d1e - grafana: allow setting ldap server (#2223) [17:03:17] 
[miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+1/-0/±0] https://git.io/JSTg2 [17:03:18] [miraheze/puppet] paladox b8f9cfd - Install db111 [17:03:19] [puppet] paladox created branch paladox-patch-1 - https://git.io/vbiAS [17:03:21] [puppet] paladox opened pull request #2224: Install db111 - https://git.io/JSTgw [17:03:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [17:03:52] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTgS [17:03:54] [miraheze/puppet] paladox a482ca9 - Update site.pp [17:03:55] [puppet] paladox synchronize pull request #2224: Install db111 - https://git.io/JSTgw [17:04:03] [puppet] paladox closed pull request #2224: Install db111 - https://git.io/JSTgw [17:04:04] [miraheze/puppet] paladox deleted branch paladox-patch-1 [17:04:06] [puppet] paladox deleted branch paladox-patch-1 - https://git.io/vbiAS [17:04:07] [miraheze/puppet] paladox pushed 1 commit to master [+1/-0/±1] https://git.io/JSTgd [17:04:09] [miraheze/puppet] paladox 4caa5b2 - Install db111 (#2224) [17:07:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [17:12:31] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTws [17:12:33] [miraheze/puppet] paladox 56c1639 - Install db121 [17:12:34] [puppet] paladox created branch paladox-patch-1 - https://git.io/vbiAS [17:12:36] [puppet] paladox opened pull request #2225: Install db121 - https://git.io/JSTwn [17:13:01] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+1/-0/±0] https://git.io/JSTwE [17:13:02] [miraheze/puppet] paladox c77d7dd - Create db121.yaml [17:13:04] [puppet] paladox synchronize pull request #2225: Install db121 - https://git.io/JSTwn 
[17:13:10] [puppet] paladox closed pull request #2225: Install db121 - https://git.io/JSTwn [17:13:12] [puppet] paladox deleted branch paladox-patch-1 - https://git.io/vbiAS [17:13:13] [miraheze/puppet] paladox pushed 1 commit to master [+1/-0/±1] https://git.io/JSTwa [17:13:15] [miraheze/puppet] paladox ffe94d2 - Install db121 (#2225) [17:13:16] [miraheze/puppet] paladox deleted branch paladox-patch-1 [17:14:20] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 149.56.141.75/cpweb [17:16:19] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [17:16:55] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSTrN [17:16:57] [miraheze/puppet] paladox 6642684 - Update jobchron121.yaml [17:20:47] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSTK8 [17:20:48] [miraheze/puppet] paladox b3d1531 - mediawiki: Make sure php is install in jobchron* [17:21:55] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSTK7 [17:21:56] [miraheze/puppet] paladox d9d45c7 - Update jobchron1.yaml [17:22:07] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSTKA [17:22:08] [miraheze/puppet] paladox 73b2744 - Update jobchron121.yaml [17:25:42] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.26, 7.44, 6.57 [17:26:10] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSTig [17:26:11] [miraheze/puppet] paladox 8ec2077 - Update jobchron121.yaml [17:26:42] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 2001:41d0:801:2000::1b80/cpweb [17:26:44] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[17:27:40] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.66, 6.79, 6.15 [17:28:02] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.26, 6.48, 5.94 [17:28:06] [puppet] Universal-Omega synchronize pull request #2211: check_reverse_dns: use `dns.resolver.Resolver.resolve()` - https://git.io/JSvEh [17:28:46] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.373 second response time [17:29:09] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+1/-0/±0] https://git.io/JSTP6 [17:29:11] [miraheze/puppet] paladox 0941982 - Install phab121 [17:29:12] [puppet] paladox created branch paladox-patch-1 - https://git.io/vbiAS [17:29:13] [puppet] paladox opened pull request #2226: Install phab121 - https://git.io/JSTPi [17:29:32] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.42, 6.75, 6.50 [17:29:35] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 4.83, 6.16, 6.00 [17:29:39] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTPF [17:29:41] [miraheze/puppet] paladox d2a064d - Update site.pp [17:29:42] [puppet] paladox synchronize pull request #2226: Install phab121 - https://git.io/JSTPi [17:29:50] [puppet] paladox closed pull request #2226: Install phab121 - https://git.io/JSTPi [17:29:52] [puppet] paladox deleted branch paladox-patch-1 - https://git.io/vbiAS [17:29:53] [miraheze/puppet] paladox deleted branch paladox-patch-1 [17:29:55] [miraheze/puppet] paladox pushed 1 commit to master [+1/-0/±1] https://git.io/JSTPh [17:29:56] [miraheze/puppet] paladox b602768 - Install phab121 (#2226) [17:30:40] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [17:31:03] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.76, 3.60, 3.24 [17:32:02] RECOVERY - mw12 
Current Load on mw12 is OK: OK - load average: 6.34, 6.39, 6.04 [17:32:19] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSTXA [17:32:21] [miraheze/puppet] paladox 270c909 - We only need to install python3-pygments [17:32:46] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JST1U [17:32:48] [miraheze/puppet] paladox 5ca5eba - Update phab121.yaml [17:33:02] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.04, 3.80, 3.36 [17:37:07] RhinosF1: https://github.com/miraheze/puppet/pull/2184#pullrequestreview-842349893 [17:37:07] [url] varnish: add new mw servers by RhinosF1 · Pull Request #2184 · miraheze/puppet · GitHub | github.com [17:37:39] ok : [RESOLVED] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [17:37:58] CosmicAlpha: no idea [17:39:00] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.85, 3.54, 3.41 [17:40:20] paladox: https://github.com/miraheze/puppet/pull/2184#pullrequestreview-842349893 does that need to be `::1` or is the 127.0.0.1 correct, as they are ipv6, I thought they may need to be `::1` [17:40:21] [url] varnish: add new mw servers by RhinosF1 · Pull Request #2184 · miraheze/puppet · GitHub | github.com [17:41:00] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.26, 4.01, 3.61 [17:41:05] 127.0.0.1 will work [17:41:16] because all servers have an ipv4 localhost address [17:41:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [17:42:03] paladox: alright, just thought I'd verify thanks! 
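Annotation on the `::1` vs 127.0.0.1 exchange above: the point made in the channel is that a host with only a public IPv6 address still has an IPv4 loopback address, so a local backend address of 127.0.0.1 keeps working. A minimal Python sketch of how one might check that on a given host follows. It is not taken from the miraheze/puppet repository, and the helper name `can_bind` is hypothetical.

```python
import socket


def can_bind(address: str) -> bool:
    """Return True if a TCP socket can bind to `address` on an OS-chosen port.

    Hypothetical helper for illustration only; it assumes that any address
    containing a colon is an IPv6 literal.
    """
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.bind((address, 0))  # port 0 asks the OS for any free port
        return True
    except OSError:
        # Raised when the address family or the address itself is unavailable.
        return False


if __name__ == "__main__":
    for addr in ("127.0.0.1", "::1"):
        print(addr, "bindable" if can_bind(addr) else "not bindable")
```

On a typical Linux host both loopback addresses are bindable. The sketch only checks local loopback availability and says nothing about whether a given service actually listens there.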
[17:42:11] paladox: why is like every icinga check failing [17:42:59] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.23, 3.43, 3.45 [17:45:55] because nrpe is only binding to ipv4 [17:46:57] Ah [17:46:58] PROBLEM - mon2 Current Load on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:47:30] PROBLEM - mon2 IRC Log Bot on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:47:36] PROBLEM - mon2 conntrack_table_size on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:47:40] PROBLEM - mon2 Disk Space on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:48:09] paladox: can you make sure it works so it's easy to see what's broken [17:48:24] PROBLEM - mon2 IRC Log Server Bot on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:48:39] PROBLEM - mon2 PowerDNS Recursor on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:48:39] PROBLEM - mon2 IRC RC Bot on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:48:40] PROBLEM - mon2 NTP time on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:48:43] PROBLEM - mon2 ferm_active on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:48:48] PROBLEM - mon2 Check 
correctness of the icinga configuration on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:49:04] PROBLEM - mon2 APT on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:49:06] PROBLEM - mon2 Puppet on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:49:10] that's me [17:49:23] PROBLEM - mon2 IRCEcho on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [17:49:30] RECOVERY - mon2 IRC Log Bot on mon2 is OK: PROCS OK: 1 process with args 'adminlogbot.py' [17:49:36] RECOVERY - mon2 conntrack_table_size on mon2 is OK: OK: nf_conntrack is 2 % full [17:49:40] RECOVERY - mon2 Disk Space on mon2 is OK: DISK OK - free space: / 18567 MB (19% inode=97%); [17:50:24] RECOVERY - mon2 IRC Log Server Bot on mon2 is OK: PROCS OK: 1 process with args 'irclogserverbot.py' [17:50:39] RECOVERY - mon2 IRC RC Bot on mon2 is OK: PROCS OK: 1 process with args 'ircrcbot.py' [17:50:39] RECOVERY - mon2 PowerDNS Recursor on mon2 is OK: DNS OK: 0.030 seconds response time. miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80,2001:41d0:801:2000::4c25,51.195.220.68 [17:50:40] RECOVERY - mon2 NTP time on mon2 is OK: NTP OK: Offset -0.0002231895924 secs [17:50:43] RECOVERY - mon2 ferm_active on mon2 is OK: OK ferm input default policy is set [17:50:50] RECOVERY - mon2 Check correctness of the icinga configuration on mon2 is OK: Icinga configuration is correct [17:50:56] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.68, 3.72, 3.61 [17:51:06] RECOVERY - mon2 APT on mon2 is OK: APT OK: 7 packages available for upgrade (0 critical updates). 
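Annotation on "because nrpe is only binding to ipv4" above: a listener bound to one address family is unreachable over the other, which is the general mechanism behind checks failing with "Connection refused" when the monitoring host and the agent disagree on IPv4 vs IPv6. The sketch below is a generic socket illustration under that assumption. It does not reproduce NRPE or its configuration, and the function names are hypothetical.

```python
import socket


def try_connect(family: int, address: str, port: int) -> bool:
    """Attempt a TCP connection and report whether it succeeded."""
    try:
        with socket.socket(family, socket.SOCK_STREAM) as client:
            client.settimeout(1.0)
            client.connect((address, port))
        return True
    except OSError:
        # Covers connection refused, timeouts, and an unavailable address family.
        return False


def ipv4_only_listener_demo() -> tuple:
    """Listen on IPv4 loopback only, then probe the same port over IPv4 and IPv6."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # IPv4-only, OS-chosen port
        server.listen(1)
        port = server.getsockname()[1]
        over_v4 = try_connect(socket.AF_INET, "127.0.0.1", port)
        over_v6 = try_connect(socket.AF_INET6, "::1", port)
    return over_v4, over_v6


if __name__ == "__main__":
    v4_ok, v6_ok = ipv4_only_listener_demo()
    print("reachable over IPv4:", v4_ok)
    print("reachable over IPv6:", v6_ok)
```

The IPv4 probe connects while the IPv6 probe is refused, because nothing is listening on `::1` at that port. The same reasoning applies in reverse to a daemon bound only to an IPv6 address that is then polled on its IPv4 address.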
[17:51:06] RECOVERY - mon2 Puppet on mon2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [17:51:23] RECOVERY - mon2 IRCEcho on mon2 is OK: PROCS OK: 1 process with args '/usr/local/bin/ircecho' [17:54:55] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.43, 3.87, 3.68 [17:56:48] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JST5l [17:56:49] [miraheze/puppet] paladox d3dac01 - nrpe: Allow ipv6 only hosts to end to ipv6 [17:56:51] [puppet] paladox created branch paladox-patch-1 - https://git.io/vbiAS [17:56:52] [puppet] paladox opened pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:57:36] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JST51 [17:57:37] [miraheze/puppet] paladox 37eb3ce - Update monitoring.pp [17:57:39] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:57:57] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JST5F [17:57:59] [miraheze/puppet] paladox 98c1c02 - Update bast101.yaml [17:58:00] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:58:05] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JST5x [17:58:07] [miraheze/puppet] paladox dcfb210 - Update bast121.yaml [17:58:08] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:58:15] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTdf [17:58:17] [miraheze/puppet] paladox cc548fd - Update cloud10.yaml [17:58:18] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only 
hosts to end to ipv6 - https://git.io/JST58 [17:58:24] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTdk [17:58:25] [miraheze/puppet] paladox f39e6b8 - Update cloud11.yaml [17:58:27] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:58:32] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTdm [17:58:34] [miraheze/puppet] paladox f328339 - Update cloud12.yaml [17:58:35] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:58:45] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTdZ [17:58:46] [miraheze/puppet] paladox 96babc6 - Update db101.yaml [17:58:48] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:58:53] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.72, 3.95, 3.77 [17:58:56] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTdl [17:58:57] [miraheze/puppet] paladox 1cd6c76 - Update db111.yaml [17:58:59] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:59:06] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTd0 [17:59:08] [miraheze/puppet] paladox 0a69689 - Update db121.yaml [17:59:09] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:59:16] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTd2 [17:59:18] [miraheze/puppet] paladox 2351cf8 - Update gluster101.yaml [17:59:19] [puppet] paladox synchronize pull request 
#2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:59:26] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTdr [17:59:27] [miraheze/puppet] paladox 3092254 - Update gluster111.yaml [17:59:29] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:59:34] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTdP [17:59:36] [miraheze/puppet] paladox 2c71799 - Update gluster121.yaml [17:59:37] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:59:47] paladox: probably could have massively simplified that change [17:59:50] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-1 [+0/-0/±1] https://git.io/JSTdS [17:59:51] [miraheze/puppet] paladox dfee180 - Update jobchron121.yaml [17:59:53] [puppet] paladox synchronize pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [17:59:55] Why can't everything just use v6? [18:00:21] mon2 was failing when i made it ipv6 only [18:00:53] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.50, 4.07, 3.83 [18:01:03] it shouldn't be, it has a v6 address [18:01:13] It might just not be committed in DNS potentially [18:02:40] https://www.irccloud.com/pastebin/89GjdH4P/ [18:02:42] JohnLewis: it's in dns [18:02:51] JohnLewis: i guess i should remove monitoring the ipv4 address? [18:03:24] paladox: https://github.com/miraheze/puppet/blob/5ca5eba852b8e34237e9ffbdeccc49775cfc7103/hieradata/hosts/gluster101.yaml#L8 – I think that probably shouldn't be bastion.miraheze.org, as bastion.miraheze.org is an existing wiki, which now doesn't load. 
[18:03:24] [url] puppet/gluster101.yaml at 5ca5eba852b8e34237e9ffbdeccc49775cfc7103 · miraheze/puppet · GitHub | github.com [18:04:02] CosmicAlpha: john setup bastion.m.org [18:04:17] also JohnLewis should i remove https://github.com/miraheze/puppet/blob/master/modules/monitoring/manifests/hosts.pp#L19 [18:04:17] [url] puppet/hosts.pp at master · miraheze/puppet · GitHub | github.com [18:04:24] PROBLEM - mon2 IRC Log Server Bot on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:04:39] PROBLEM - mon2 PowerDNS Recursor on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:04:39] PROBLEM - mon2 IRC RC Bot on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:04:40] paladox: yeah, might as well [18:04:40] PROBLEM - mon2 NTP time on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:04:43] PROBLEM - mon2 ferm_active on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:04:44] thanks! 
[18:04:48] PROBLEM - mon2 Check correctness of the icinga configuration on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:04:58] JohnLewis: also see CosmicAlpha comment above [18:05:04] PROBLEM - mon2 APT on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:05:06] PROBLEM - mon2 Puppet on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:05:23] PROBLEM - mon2 IRCEcho on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:05:30] PROBLEM - mon2 IRC Log Bot on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:05:36] PROBLEM - mon2 conntrack_table_size on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:05:39] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSTNO [18:05:41] PROBLEM - mon2 Disk Space on mon2 is CRITICAL: connect to address 51.195.236.249 port 5666: Connection refusedconnect to host 51.195.236.249 port 5666: Connection refused [18:05:41] [miraheze/puppet] paladox 6dbeae5 - monitoring: Remove ipv4 address from host [18:06:11] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSTNE [18:06:12] [miraheze/puppet] paladox f2ce17a - nrpe: Bind to ipv6 address [18:06:56] [miraheze/dns] JohnFLewis pushed 1 commit to master [+0/-0/±1] https://git.io/JSTNQ [18:06:58] [miraheze/dns] JohnFLewis fd77cff - bastion.mh -> bast.mh [18:09:49] stopping ircecho temporarily because it's about to spam...
[18:09:59] [miraheze/mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://git.io/JSTAx [18:10:01] [miraheze/mw-config] Universal-Omega 4822df8 - Add bast.miraheze.org to CreateWiki subdomain blacklist [18:11:04] miraheze/mw-config - Universal-Omega the build passed. [18:11:09] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.46, 3.78, 3.78 [18:11:11] RECOVERY - mon2 Disk Space on mon2 is OK: DISK OK - free space: / 18496 MB (19% inode=97%); [18:11:11] RECOVERY - mon2 APT on mon2 is OK: APT OK: 7 packages available for upgrade (0 critical updates). [18:11:14] RECOVERY - mon2 IRC Log Bot on mon2 is OK: PROCS OK: 1 process with args 'adminlogbot.py' [18:11:19] PROBLEM - db12 Disk Space on db12 is WARNING: DISK WARNING - free space: / 27936 MB (6% inode=98%); [18:11:20] RECOVERY - mon2 conntrack_table_size on mon2 is OK: OK: nf_conntrack is 1 % full [18:11:20] RECOVERY - mon2 Puppet on mon2 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures [18:11:24] [miraheze/puppet] JohnFLewis pushed 1 commit to master [+0/-0/±18] https://git.io/JSTxu [18:11:26] [miraheze/puppet] JohnFLewis 06f8641 - bastion.mh -> bast.mh [18:11:26] PROBLEM - cloud5 Current Load on cloud5 is WARNING: WARNING - load average: 21.09, 19.64, 17.75 [18:11:28] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [18:11:30] RECOVERY - mon2 IRCEcho on mon2 is OK: PROCS OK: 1 process with args '/usr/local/bin/ircecho' [18:11:33] CosmicAlpha: thank you for the spot there :) [18:11:40] !log [@mw11] starting deploy of {'config': True} to ovlon [18:11:41] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10
seconds. [18:11:50] This could be fun [18:12:02] PROBLEM - gluster3 glusterd on gluster3 is CRITICAL: connect to address 51.195.236.217 port 5666: Connection refusedconnect to host 51.195.236.217 port 5666: Connection refused [18:12:03] PROBLEM - mw13 PowerDNS Recursor on mw13 is CRITICAL: connect to address 51.195.236.251 port 5666: Connection refusedconnect to host 51.195.236.251 port 5666: Connection refused [18:12:05] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: connect to address 51.195.236.217 port 5666: Connection refusedconnect to host 51.195.236.217 port 5666: Connection refused [18:12:07] PROBLEM - mw10 PowerDNS Recursor on mw10 is CRITICAL: connect to address 51.195.236.254 port 5666: Connection refusedconnect to host 51.195.236.254 port 5666: Connection refused [18:12:09] PROBLEM - mw13 JobRunner Service on mw13 is CRITICAL: connect to address 51.195.236.251 port 5666: Connection refusedconnect to host 51.195.236.251 port 5666: Connection refused [18:12:09] PROBLEM - mwtask1 APT on mwtask1 is CRITICAL: connect to address 198.244.181.23 port 5666: Connection refusedconnect to host 198.244.181.23 port 5666: Connection refused [18:12:09] PROBLEM - phab2 conntrack_table_size on phab2 is CRITICAL: connect to address 51.195.236.244 port 5666: Connection refusedconnect to host 51.195.236.244 port 5666: Connection refused [18:12:11] PROBLEM - cloud4 Current Load on cloud4 is CRITICAL: connect to address 51.68.206.138 port 5666: Connection refusedconnect to host 51.68.206.138 port 5666: Connection refused [18:12:12] PROBLEM - cloud4 Disk Space on cloud4 is CRITICAL: connect to address 51.68.206.138 port 5666: Connection refusedconnect to host 51.68.206.138 port 5666: Connection refused [18:12:12] PROBLEM - gluster3 Check Gluster Clients on gluster3 is CRITICAL: connect to address 51.195.236.217 port 5666: Connection refusedconnect to host 51.195.236.217 port 5666: Connection refused [18:12:13] PROBLEM - mw9 ferm_active on mw9 is CRITICAL: connect to 
address 51.195.236.222 port 5666: Connection refusedconnect to host 51.195.236.222 port 5666: Connection refused [18:12:16] PROBLEM - mw12 Puppet on mw12 is CRITICAL: connect to address 51.195.236.220 port 5666: Connection refusedconnect to host 51.195.236.220 port 5666: Connection refused [18:12:16] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [18:12:18] PROBLEM - mw13 Puppet on mw13 is CRITICAL: connect to address 51.195.236.251 port 5666: Connection refusedconnect to host 51.195.236.251 port 5666: Connection refused [18:12:18] PROBLEM - mw11 Current Load on mw11 is CRITICAL: connect to address 51.195.236.255 port 5666: Connection refusedconnect to host 51.195.236.255 port 5666: Connection refused [18:12:21] !log [@mw11] finished deploy of {'config': True} to ovlon - SUCCESS in 41s [18:12:23] PROBLEM - puppet3 ferm_active on puppet3 is CRITICAL: connect to address 51.195.236.216 port 5666: Connection refusedconnect to host 51.195.236.216 port 5666: Connection refused [18:12:24] RECOVERY - mon2 IRC Log Server Bot on mon2 is OK: PROCS OK: 1 process with args 'irclogserverbot.py' [18:12:26] PROBLEM - phab2 NTP time on phab2 is CRITICAL: connect to address 51.195.236.244 port 5666: Connection refusedconnect to host 51.195.236.244 port 5666: Connection refused [18:12:28] PROBLEM - mw13 NTP time on mw13 is CRITICAL: connect to address 51.195.236.251 port 5666: Connection refusedconnect to host 51.195.236.251 port 5666: Connection refused [18:12:33] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.47, 7.03, 6.39 [18:12:33] JohnLewis: no problem, thanks for fixing! 
[18:12:37] RECOVERY - cloud5 Current Load on cloud5 is OK: OK - load average: 18.95, 19.36, 17.80 [18:12:39] I wonder what will happen when scsvg triggers [18:12:41] RECOVERY - mon2 Check correctness of the icinga configuration on mon2 is OK: Icinga configuration is correct [18:12:43] RECOVERY - mon2 PowerDNS Recursor on mon2 is OK: DNS OK: 0.308 seconds response time. miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80,2001:41d0:801:2000::4c25,51.195.220.68 [18:12:47] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [18:12:50] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [18:12:55] PROBLEM - mem1 memcached on mem1 is CRITICAL: connect to address 2001:41d0:800:178a::9 and port 11211: Connection refused [18:13:07] RECOVERY - mon2 ferm_active on mon2 is OK: OK ferm input default policy is set [18:13:09] RECOVERY - mon2 IRC RC Bot on mon2 is OK: PROCS OK: 1 process with args 'ircrcbot.py' [18:13:09] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.83, 4.03, 3.86 [18:13:11] RECOVERY - mon2 NTP time on mon2 is OK: NTP OK: Offset -0.003092557192 secs [18:13:15] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [18:13:16] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:13:27] PROBLEM - ns2 Auth DNS on ns2 is UNKNOWN: check_dns: Invalid hostname/address [18:13:35] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[18:13:43] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.24, 6.29, 5.78 [18:13:53] PROBLEM - mon2 grafana.miraheze.org HTTPS on mon2 is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket [18:13:54] PROBLEM - mon2 icinga.miraheze.org HTTPS on mon2 is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket [18:13:58] RECOVERY - gluster3 glusterd on gluster3 is OK: PROCS OK: 1 process with args '/usr/sbin/glusterd' [18:14:00] RECOVERY - mw13 PowerDNS Recursor on mw13 is OK: DNS OK: 0.351 seconds response time. miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80,2001:41d0:801:2000::4c25,51.195.220.68 [18:14:01] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.52, 3.68, 3.60 [18:14:04] RECOVERY - mw10 PowerDNS Recursor on mw10 is OK: DNS OK: 0.401 seconds response time. miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80,2001:41d0:801:2000::4c25,51.195.220.68 [18:14:04] RECOVERY - phab2 conntrack_table_size on phab2 is OK: OK: nf_conntrack is 0 % full [18:14:05] RECOVERY - mw13 JobRunner Service on mw13 is OK: PROCS OK: 1 process with args 'redisJobRunnerService' [18:14:05] PROBLEM - ns1 Auth DNS on ns1 is UNKNOWN: check_dns: Invalid hostname/address [18:14:06] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 18.95, 20.41, 19.11 [18:14:09] RECOVERY - mwtask1 APT on mwtask1 is OK: APT OK: 19 packages available for upgrade (0 critical updates). 
[18:14:10] RECOVERY - cloud4 Disk Space on cloud4 is OK: DISK OK - free space: / 1528195 MB (42% inode=99%); [18:14:11] RECOVERY - gluster3 Check Gluster Clients on gluster3 is OK: PROCS OK: 2 processes with args '/usr/sbin/glusterfs' [18:14:12] RECOVERY - mw12 Puppet on mw12 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [18:14:13] RECOVERY - mw9 ferm_active on mw9 is OK: OK ferm input default policy is set [18:14:14] RECOVERY - mw13 Puppet on mw13 is OK: OK: Puppet is currently enabled, last run 2 minutes ago with 0 failures [18:14:14] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 6.05, 6.18, 5.88 [18:14:17] RECOVERY - puppet3 ferm_active on puppet3 is OK: OK ferm input default policy is set [18:14:22] RECOVERY - mw13 NTP time on mw13 is OK: NTP OK: Offset -0.002220243216 secs [18:14:24] RECOVERY - phab2 NTP time on phab2 is OK: NTP OK: Offset 6.094574928e-05 secs [18:14:26] PROBLEM - mem2 memcached on mem2 is CRITICAL: connect to address 2001:41d0:800:1bbd::12 and port 11211: Connection refused [18:14:28] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 4.97, 6.39, 6.23 [18:14:44] PROBLEM - mail2 webmail.miraheze.org HTTPS on mail2 is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket [18:14:48] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 20524 bytes in 0.173 second response time [18:15:09] PROBLEM - phab2 phab.miraheze.wiki HTTPS on phab2 is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket [18:15:17] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 3.545 second response time [18:15:22] PROBLEM - mwtask1 MirahezeRenewSsl on mwtask1 is CRITICAL: connect to address 2001:41d0:800:1bbd::15 and port 5000: Connection refused [18:15:23] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [18:15:27] PROBLEM - phab2 phabricator.miraheze.org HTTPS on 
phab2 is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket [18:15:38] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 3.624 second response time [18:15:43] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.97, 6.19, 5.81 [18:15:46] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 0.978 second response time [18:16:00] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 16.31, 18.92, 18.72 [18:17:06] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSThl [18:17:08] [miraheze/puppet] paladox e59ff4b - monitoring: Use ipv6 for address [18:18:30] RECOVERY - mail2 webmail.miraheze.org HTTPS on mail2 is OK: HTTP OK: HTTP/1.1 200 OK - 6162 bytes in 0.104 second response time [18:19:12] RECOVERY - phab2 phab.miraheze.wiki HTTPS on phab2 is OK: HTTP OK: Status line output matched "HTTP/1.1 200" - 17692 bytes in 0.062 second response time [18:19:15] RECOVERY - ns2 Auth DNS on ns2 is OK: DNS OK: 0.263 seconds response time. miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80,2001:41d0:801:2000::4c25,51.195.220.68 [18:19:22] RECOVERY - phab2 phabricator.miraheze.org HTTPS on phab2 is OK: HTTP OK: HTTP/1.1 200 OK - 18999 bytes in 0.091 second response time [18:19:45] [miraheze/mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://git.io/JSkef [18:19:46] [miraheze/mw-config] Universal-Omega c60b078 - Add phab.miraheze.org to CreateWiki subdomain blacklist [18:19:48] RECOVERY - mon2 icinga.miraheze.org HTTPS on mon2 is OK: HTTP OK: HTTP/1.1 302 Found - 308 bytes in 0.010 second response time [18:19:53] RECOVERY - mon2 grafana.miraheze.org HTTPS on mon2 is OK: HTTP OK: HTTP/1.1 200 OK - 35408 bytes in 0.214 second response time [18:19:55] RECOVERY - ns1 Auth DNS on ns1 is OK: DNS OK: 0.233 seconds response time.
miraheze.org returns 198.244.148.90,2001:41d0:801:2000::4c25,51.195.220.68 [18:20:49] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.62, 3.86, 3.91 [18:20:52] miraheze/mw-config - Universal-Omega the build passed. [18:21:14] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.46, 7.97, 7.28 [18:21:42] PROBLEM - cp30 Varnish Backends on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:21:47] PROBLEM - cp30 ferm_active on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:21:58] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:22:10] PROBLEM - cp30 conntrack_table_size on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:22:19] PROBLEM - cp30 Stunnel Http for mon2 on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:22:24] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:22:28] PROBLEM - cp30 HTTP 4xx/5xx ERROR Rate on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:22:40] PROBLEM - cp30 Puppet on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:22:49] PROBLEM - cp30 Disk Space on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: 
Connection refused [18:22:49] PROBLEM - cp30 APT on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:22:57] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:22:59] PROBLEM - cp30 NTP time on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:23:00] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:23:00] PROBLEM - cp30 PowerDNS Recursor on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:23:08] PROBLEM - cp30 Stunnel Http for mw8 on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:23:14] PROBLEM - cp30 Stunnel Http for mw11 on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:23:29] PROBLEM - cp30 Current Load on cp30 is CRITICAL: connect to address 149.56.140.43 port 5666: Connection refusedconnect to host 149.56.140.43 port 5666: Connection refused [18:24:29] !log [@test3] starting deploy of {'config': True} to skip [18:24:30] !log [@test3] finished deploy of {'config': True} to skip - SUCCESS in 1s [18:24:51] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [18:25:04] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [18:25:57] PROBLEM - test3 php-fpm on test3 is CRITICAL: connect to address 51.195.236.247 port 5666: Connection refusedconnect to host 51.195.236.247 port 5666: 
Connection refused [18:25:59] PROBLEM - cp20 Stunnel Http for mw11 on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:26:04] PROBLEM - cp20 NTP time on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:26:09] PROBLEM - cp20 Stunnel Http for mw8 on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:26:11] PROBLEM - test3 Disk Space on test3 is CRITICAL: connect to address 51.195.236.247 port 5666: Connection refusedconnect to host 51.195.236.247 port 5666: Connection refused [18:26:16] PROBLEM - cp20 Disk Space on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:26:16] PROBLEM - cp20 Varnish Backends on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:26:30] PROBLEM - test3 PowerDNS Recursor on test3 is CRITICAL: connect to address 51.195.236.247 port 5666: Connection refusedconnect to host 51.195.236.247 port 5666: Connection refused [18:26:48] PROBLEM - cp20 PowerDNS Recursor on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:26:51] PROBLEM - test3 conntrack_table_size on test3 is CRITICAL: connect to address 51.195.236.247 port 5666: Connection refusedconnect to host 51.195.236.247 port 5666: Connection refused [18:26:52] PROBLEM - cp20 conntrack_table_size on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:26:58] PROBLEM - test3 NTP time on test3 is CRITICAL: connect to address 51.195.236.247 port 5666: 
Connection refusedconnect to host 51.195.236.247 port 5666: Connection refused [18:27:00] PROBLEM - cp20 Stunnel Http for mw10 on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:27:01] PROBLEM - cp20 APT on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:27:04] PROBLEM - test3 Puppet on test3 is CRITICAL: connect to address 51.195.236.247 port 5666: Connection refusedconnect to host 51.195.236.247 port 5666: Connection refused [18:27:11] PROBLEM - test3 Check Gluster Clients on test3 is CRITICAL: connect to address 51.195.236.247 port 5666: Connection refusedconnect to host 51.195.236.247 port 5666: Connection refused [18:27:14] PROBLEM - cp20 Puppet on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:27:15] PROBLEM - test3 APT on test3 is CRITICAL: connect to address 51.195.236.247 port 5666: Connection refusedconnect to host 51.195.236.247 port 5666: Connection refused [18:27:16] PROBLEM - test3 Current Load on test3 is CRITICAL: connect to address 51.195.236.247 port 5666: Connection refusedconnect to host 51.195.236.247 port 5666: Connection refused [18:27:18] PROBLEM - cp20 HTTP 4xx/5xx ERROR Rate on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:27:27] PROBLEM - test3 ferm_active on test3 is CRITICAL: connect to address 51.195.236.247 port 5666: Connection refusedconnect to host 51.195.236.247 port 5666: Connection refused [18:27:27] PROBLEM - cp21 Stunnel Http for mon2 on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:27:30] PROBLEM - cp21 ferm_active on cp21 is CRITICAL: connect to 
address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:27:36] PROBLEM - cp20 ferm_active on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:27:41] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:27:42] PROBLEM - cp21 Current Load on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:27:42] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:27:44] PROBLEM - cp20 Current Load on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:27:53] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:27:56] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:27:57] PROBLEM - cp20 Stunnel Http for mon2 on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:27:57] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: connect to address 51.195.220.68 port 5666: Connection refusedconnect to host 51.195.220.68 port 5666: Connection refused [18:28:04] PROBLEM - cp21 Stunnel Http for mw11 on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection 
refused [18:28:05] PROBLEM - cp21 PowerDNS Recursor on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:28:10] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSkUa [18:28:12] [miraheze/puppet] paladox 8ab9041 - nginx: Use ipv6 address for https monitoring [18:28:14] PROBLEM - cp21 conntrack_table_size on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:28:21] PROBLEM - cp21 Puppet on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:28:21] PROBLEM - cp21 HTTP 4xx/5xx ERROR Rate on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:28:24] PROBLEM - cp21 Stunnel Http for mw8 on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:28:39] PROBLEM - cp21 Disk Space on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:28:45] PROBLEM - cp21 Stunnel Http for mw13 on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:28:48] PROBLEM - cp21 Varnish Backends on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:28:51] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:28:54] PROBLEM - cp21 NTP time on cp21 is CRITICAL: connect to address
198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:29:02] PROBLEM - cp21 APT on cp21 is CRITICAL: connect to address 198.244.148.90 port 5666: Connection refusedconnect to host 198.244.148.90 port 5666: Connection refused [18:30:25] RECOVERY - test3 Check Gluster Clients on test3 is OK: PROCS OK: 1 process with args '/usr/sbin/glusterfs' [18:30:27] RECOVERY - test3 conntrack_table_size on test3 is OK: OK: nf_conntrack is 0 % full [18:30:29] RECOVERY - cp20 PowerDNS Recursor on cp20 is OK: DNS OK: 0.230 seconds response time. miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80,2001:41d0:801:2000::4c25,51.195.220.68 [18:30:33] RECOVERY - test3 NTP time on test3 is OK: NTP OK: Offset 0.000230550766 secs [18:30:33] RECOVERY - cp20 Puppet on cp20 is OK: OK: Puppet is currently enabled, last run 5 minutes ago with 0 failures [18:30:40] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.355 second response time [18:30:41] RECOVERY - test3 Puppet on test3 is OK: OK: Puppet is currently enabled, last run 6 minutes ago with 0 failures [18:30:42] RECOVERY - cp20 conntrack_table_size on cp20 is OK: OK: nf_conntrack is 4 % full [18:30:43] RECOVERY - cp20 Stunnel Http for mw10 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.817 second response time [18:30:49] RECOVERY - cp21 NTP time on cp21 is OK: NTP OK: Offset -0.0002264976501 secs [18:30:52] RECOVERY - cp21 Varnish Backends on cp21 is OK: All 9 backends are healthy [18:30:54] RECOVERY - cp30 PowerDNS Recursor on cp30 is OK: DNS OK: 1.814 second response time. 
miraheze.org returns 149.56.140.43,149.56.141.75,2607:5300:201:3100::5ebc,2607:5300:201:3100::929a [18:30:55] RECOVERY - test3 ferm_active on test3 is OK: OK ferm input default policy is set [18:30:56] RECOVERY - cp21 Disk Space on cp21 is OK: DISK OK - free space: / 15301 MB (39% inode=97%); [18:30:57] RECOVERY - cp21 APT on cp21 is OK: APT OK: 1 packages available for upgrade (0 critical updates). [18:30:58] RECOVERY - cp20 APT on cp20 is OK: APT OK: 1 packages available for upgrade (0 critical updates). [18:30:59] RECOVERY - test3 Current Load on test3 is OK: OK - load average: 0.21, 0.19, 0.18 [18:31:00] RECOVERY - cp30 NTP time on cp30 is OK: NTP OK: Offset -0.005682379007 secs [18:31:01] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.006 second response time [18:31:05] RECOVERY - cp30 Puppet on cp30 is OK: OK: Puppet is currently enabled, last run 9 minutes ago with 0 failures [18:31:05] RECOVERY - cp30 Disk Space on cp30 is OK: DISK OK - free space: / 13233 MB (33% inode=97%); [18:31:06] RECOVERY - cp30 Stunnel Http for mw11 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.315 second response time [18:31:08] [puppet] Universal-Omega opened pull request #2228: phab.miraheze.wiki: use mwtask cname - https://git.io/JSkT5 [18:31:09] RECOVERY - test3 APT on test3 is OK: APT OK: 19 packages available for upgrade (0 critical updates). [18:31:10] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 4.66, 6.08, 6.70 [18:31:12] RECOVERY - cp20 HTTP 4xx/5xx ERROR Rate on cp20 is OK: OK - NGINX Error Rate is 3% [18:31:13] RECOVERY - cp30 Current Load on cp30 is OK: OK - load average: 0.69, 0.72, 0.65 [18:31:14] RECOVERY - test3 PowerDNS Recursor on test3 is OK: DNS OK: 0.142 seconds response time.
miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80,2001:41d0:801:2000::4c25,51.195.220.68 [18:31:15] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.315 second response time [18:31:15] RECOVERY - cp21 Stunnel Http for mw13 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.552 second response time [18:31:18] RECOVERY - cp30 Stunnel Http for mw8 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.303 second response time [18:31:22] RECOVERY - cp30 APT on cp30 is OK: APT OK: 17 packages available for upgrade (0 critical updates). [18:31:23] RECOVERY - cp21 Stunnel Http for mw8 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14554 bytes in 0.019 second response time [18:31:26] RECOVERY - cp21 ferm_active on cp21 is OK: OK ferm input default policy is set [18:31:26] RECOVERY - cp21 Stunnel Http for mon2 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 35370 bytes in 0.027 second response time [18:31:29] RECOVERY - cp30 Varnish Backends on cp30 is OK: All 9 backends are healthy [18:31:29] RECOVERY - cp30 ferm_active on cp30 is OK: OK ferm input default policy is set [18:31:34] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.612 second response time [18:31:36] RECOVERY - cp20 Current Load on cp20 is OK: OK - load average: 1.17, 0.88, 0.79 [18:31:36] RECOVERY - cp20 ferm_active on cp20 is OK: OK ferm input default policy is set [18:31:40] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.315 second response time [18:31:41] RECOVERY - cp21 Current Load on cp21 is OK: OK - load average: 0.44, 0.56, 0.72 [18:31:41] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 0.007 second response time [18:31:41] RECOVERY - cp20 Stunnel Http for mw11 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.006 second response time [18:31:44] RECOVERY - test3 php-fpm on test3 is OK: PROCS 
OK: 13 processes with command name 'php-fpm7.4' [18:31:46] RECOVERY - cp20 NTP time on cp20 is OK: NTP OK: Offset -0.004351526499 secs [18:31:47] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.010 second response time [18:31:53] RECOVERY - cp21 Stunnel Http for mw11 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.008 second response time [18:31:55] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.014 second response time [18:31:55] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.012 second response time [18:31:55] RECOVERY - cp21 PowerDNS Recursor on cp21 is OK: DNS OK: 0.203 seconds response time. miraheze.org returns 198.244.148.90,2001:41d0:801:2000::1b80,2001:41d0:801:2000::4c25,51.195.220.68 [18:31:57] RECOVERY - cp20 Stunnel Http for mon2 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 35370 bytes in 0.041 second response time [18:32:01] RECOVERY - cp20 Varnish Backends on cp20 is OK: All 9 backends are healthy [18:32:03] RECOVERY - cp20 Disk Space on cp20 is OK: DISK OK - free space: / 17001 MB (43% inode=97%); [18:32:04] RECOVERY - cp30 conntrack_table_size on cp30 is OK: OK: nf_conntrack is 4 % full [18:32:05] RECOVERY - test3 Disk Space on test3 is OK: DISK OK - free space: / 7810 MB (43% inode=66%); [18:32:07] RECOVERY - cp20 Stunnel Http for mw8 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14554 bytes in 0.007 second response time [18:32:09] RECOVERY - cp21 conntrack_table_size on cp21 is OK: OK: nf_conntrack is 7 % full [18:32:13] RECOVERY - cp21 Puppet on cp21 is OK: OK: Puppet is currently enabled, last run 6 minutes ago with 0 failures [18:32:13] RECOVERY - cp21 HTTP 4xx/5xx ERROR Rate on cp21 is OK: OK - NGINX Error Rate is 3% [18:32:17] RECOVERY - cp30 Stunnel Http for mon2 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 35370 bytes in 0.424 second response time [18:32:18] RECOVERY - cp30 HTTP 4xx/5xx ERROR 
Rate on cp30 is OK: OK - NGINX Error Rate is 4% [18:32:21] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.348 second response time [18:34:41] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSkkd [18:34:42] [miraheze/puppet] paladox 079655e - Fix [18:35:03] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.06, 3.47, 3.50 [18:35:25] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSkIJ [18:35:26] [miraheze/puppet] paladox eb1f1a3 - Revert "Fix" [18:37:13] !log starting full deploy-mediawiki to all SCSVG servers [18:37:20] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [18:39:19] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.99, 6.46, 5.67 [18:41:50] !log [@mw11] starting deploy of {'config': True} to ovlon [18:42:00] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [18:42:03] !log [@mw11] finished deploy of {'config': True} to ovlon - SUCCESS in 13s [18:42:10] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [18:42:47] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.16, 6.88, 6.34 [18:43:14] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.74, 3.70, 3.61 [18:43:16] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 6.61, 6.72, 5.96 [18:44:42] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.14, 7.14, 6.49 [18:46:37] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.55, 6.90, 6.48 [18:48:37] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.56, 6.40, 6.34 [18:50:43] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 22.18, 20.93, 19.09 [18:51:13] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.76, 3.15, 3.35 [18:51:42] PROBLEM - cp20 
Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [18:52:41] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 17.00, 19.57, 18.82 [18:53:41] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.426 second response time [18:56:33] [miraheze/puppet] paladox pushed 1 commit to paladox-patch-4 [+0/-0/±1] https://git.io/JSkmo [18:56:35] [miraheze/puppet] paladox ba2ce9b - memcached: Allow to listen on ipv6 [18:56:36] [puppet] paladox created branch paladox-patch-4 - https://git.io/vbiAS [18:56:38] [puppet] paladox opened pull request #2229: memcached: Allow to listen on ipv6 - https://git.io/JSkmK [18:59:03] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.27, 3.82, 3.51 [19:00:50] Thanks paladox [19:01:08] I noticed that mirahezerenewssl is crashing on new infra too [19:01:14] I can't see logs though [19:01:20] I'm not fussed yet [19:01:32] well i cannot restart memcached so that'll be for the new memcached hosts [19:01:46] icinga will just have to fail for memcached for the old cluster (as it's ipv4 only) [19:01:56] Ok [19:02:07] That's no issue [19:02:17] Just add a comment or downtime [19:02:22] [puppet] paladox closed pull request #2229: memcached: Allow to listen on ipv6 - https://git.io/JSkmK [19:02:23] [puppet] paladox deleted branch paladox-patch-4 - https://git.io/vbiAS [19:02:25] [miraheze/puppet] paladox deleted branch paladox-patch-4 [19:02:26] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSkOt [19:02:28] [miraheze/puppet] paladox 67a22ca - memcached: Allow to listen on ipv6 (#2229) [19:03:00] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.36, 3.47, 3.48 [19:03:04] Otherwise I'll forget [19:05:12] RhinosF1: mirahezerenewssl might be because 
https://github.com/miraheze/puppet/blob/67a22ca455b9987acca03341bd7f60ba2ce24fcc/modules/letsencrypt/files/mirahezerenewssl.py#L40 needs to be `::` instead of `0.0.0.0`? Though I'm not certain, `0.0.0.0` might work like `127.0.0.1` does for the others I guess. I'm not certain just an idea. [19:05:13] [url] puppet/mirahezerenewssl.py at 67a22ca455b9987acca03341bd7f60ba2ce24fcc · miraheze/puppet · GitHub | github.com [19:06:38] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSk3n [19:06:40] [miraheze/puppet] paladox 3eb0abd - letsencrypt::web: Use ipv6 address for tcp monitoring check [19:06:56] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.49, 4.18, 3.75 [19:07:22] Wait no issue wouldn't be python script [19:07:42] CosmicAlpha: the script isn't starting so it's possible [19:08:26] https://www.irccloud.com/pastebin/rwpNMRob/ [19:10:52] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.18, 3.86, 3.72 [19:11:11] paladox: can you run mkdir [19:11:18] i have [19:11:21] and it's running now [19:11:36] i haven't synced the letsencrypt directory tho [19:11:43] think we should do some of that last [19:12:03] Yeah let's leave that until last [19:14:22] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.40, 6.56, 6.08 [19:14:52] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.51, 4.18, 3.89 [19:15:42] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.95, 6.39, 6.13 [19:16:15] [puppet] Universal-Omega opened pull request #2230: mirahezerenewssl: use force=True for logging - https://git.io/JSks3 [19:16:20] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.42, 6.01, 5.96 [19:16:52] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.07, 3.97, 3.86 [19:17:43] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.33, 5.96, 6.00 [19:18:00] [puppet] 
Universal-Omega edited pull request #2230: mirahezerenewssl: use force=True for logging - https://git.io/JSks3 [19:18:10] RhinosF1: ^ [19:18:52] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.26, 4.41, 4.03 [19:20:12] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.43, 18.34, 17.58 [19:21:16] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 9.63, 7.35, 6.51 [19:22:45] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.21, 3.68, 3.86 [19:24:06] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 20.03, 19.59, 18.29 [19:25:09] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 5.89, 6.98, 6.57 [19:26:37] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.14, 3.83, 3.87 [19:27:07] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 2607:5300:201:3100::929a/cpweb [19:27:18] PROBLEM - cp21 Stunnel Http for mw13 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:27:34] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.007 second response time [19:27:36] PROBLEM - mw13 MediaWiki Rendering on mw13 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:27:54] Cool [19:28:04] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:28:33] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.68, 3.88, 3.89 [19:28:40] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:28:49] there's a mwtask cname? CosmicAlpha RhinosF1 ? 
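For context on pull request #2230 ("use force=True for logging"): the log does not show the actual change, so the following is only a minimal sketch, assuming the script configures logging through Python's standard `logging.basicConfig` (Python 3.8 or later). The function name `run_demo` and the in-memory streams are hypothetical and exist only for illustration. It shows that a second `basicConfig` call is silently ignored once the root logger has a handler, unless `force=True` is passed.

```python
import io
import logging


def run_demo():
    """Hypothetical demo: show how force=True changes logging.basicConfig."""
    first, second = io.StringIO(), io.StringIO()

    # Start from a known state: force=True removes any handlers that are
    # already attached to the root logger before installing this one.
    logging.basicConfig(stream=first, level=logging.INFO,
                        format="%(message)s", force=True)

    # Without force, this call does nothing, because the root logger
    # already has a handler. Output keeps going to `first`.
    logging.basicConfig(stream=second, level=logging.INFO,
                        format="%(message)s")
    logging.info("before force")

    # With force=True the existing handler is removed and the new
    # configuration takes effect. Output now goes to `second`.
    logging.basicConfig(stream=second, level=logging.INFO,
                        format="%(message)s", force=True)
    logging.info("after force")

    return first.getvalue(), second.getvalue()


if __name__ == "__main__":
    print(run_demo())
```

If something imported earlier has already attached a handler to the root logger, a plain `basicConfig` call in the script would not take effect, which is one plausible reason for not being able to see logs; whether that was the cause here is not confirmed by the log.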
[19:28:54] Yes [19:29:03] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.39, 6.39, 6.45 [19:29:04] Yes [19:29:16] RECOVERY - cp21 Stunnel Http for mw13 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.636 second response time [19:29:30] RECOVERY - mw13 MediaWiki Rendering on mw13 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 0.281 second response time [19:29:31] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.033 second response time [19:30:03] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.318 second response time [19:30:23] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:30:24] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:30:39] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.670 second response time [19:30:48] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.62, 6.75, 6.39 [19:30:49] [puppet] paladox closed pull request #2228: phab.miraheze.wiki: use mwtask cname - https://git.io/JSkT5 [19:30:50] [miraheze/puppet] paladox pushed 1 commit to master [+0/-0/±1] https://git.io/JSknJ [19:30:52] [miraheze/puppet] Universal-Omega ae6459f - phab.miraheze.wiki: use mwtask cname (#2228) [19:31:07] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:31:08] [puppet] paladox closed pull request #2227: nrpe: Allow ipv6 only hosts to end to ipv6 - https://git.io/JST58 [19:31:10] [puppet] paladox deleted branch paladox-patch-1 - https://git.io/vbiAS [19:31:11] [miraheze/puppet] paladox deleted branch paladox-patch-1 [19:31:12] Thanks paladox! 
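On the `::` versus `0.0.0.0` question raised at 19:05:12, the three addresses mentioned play different roles: `0.0.0.0` is the IPv4 wildcard, `::` is the IPv6 wildcard, and `127.0.0.1` is the IPv4 loopback. The sketch below uses only Python's standard `ipaddress` module to show that distinction; the helper name `describe` is hypothetical, and nothing here is taken from mirahezerenewssl.py itself.

```python
import ipaddress


def describe(text):
    """Hypothetical helper: classify an address literal by family and role."""
    addr = ipaddress.ip_address(text)
    if addr.is_unspecified:
        # 0.0.0.0 and :: mean "any local address" when used for binding.
        kind = "wildcard"
    elif addr.is_loopback:
        # 127.0.0.1 and ::1 are reachable only from the local machine.
        kind = "loopback"
    else:
        kind = "specific"
    return f"IPv{addr.version} {kind}"


if __name__ == "__main__":
    for literal in ("0.0.0.0", "::", "127.0.0.1", "::1"):
        print(literal, "->", describe(literal))
```

A listener bound to `0.0.0.0` accepts IPv4 connections only, so clients on an IPv6-only network cannot reach it. A listener bound to `::` accepts IPv6 and, depending on the socket's `IPV6_V6ONLY` setting, may also accept IPv4. Whether that applies to the line linked in the log is not established there.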
[19:31:37] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:32:26] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.18, 3.52, 3.71 [19:32:48] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 6.66, 6.62, 6.38 [19:33:05] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [19:33:13] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 8.572 second response time [19:33:39] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 8.110 second response time [19:34:22] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.26, 3.35, 3.63 [19:34:35] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 8.635 second response time [19:34:41] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 9.587 second response time [19:36:04] paladox: did you restart new memcache [19:40:56] we're still facing a strange issue with the results on Google [19:41:15] the main page is showing a dash and then the name of the wiki [19:41:23] like if it's a subpage [19:41:42] another page* [19:41:53] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 51.195.220.68/cpweb [19:42:00] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 4 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 2607:5300:201:3100::5ebc/cpweb [19:42:05] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.69, 7.17, 6.27 [19:42:06] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 1.58, 2.49, 3.17 [19:43:53] PROBLEM - cp21 Stunnel Http for mw10 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[19:44:05] PROBLEM - cp30 Stunnel Http for mw10 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:44:37] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 7.32, 6.50, 5.84 [19:45:53] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [19:45:53] RECOVERY - cp21 Stunnel Http for mw10 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 1.499 second response time [19:45:58] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [19:46:03] RECOVERY - cp30 Stunnel Http for mw10 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 2.805 second response time [19:46:03] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.72, 7.38, 6.57 [19:46:34] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 6.16, 6.36, 5.87 [19:47:42] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.63, 6.98, 6.65 [19:48:03] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.06, 6.21, 6.25 [19:49:56] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb [19:49:56] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.50, 3.38, 3.30 [19:51:55] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [19:52:54] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:53:00] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:53:14] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:53:33] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[19:53:38] PROBLEM - cp21 Stunnel Http for mw12 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [19:54:44] ssh-agent: ^ [19:55:39] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.50, 7.42, 6.98 [19:55:52] RhinosF1: ah, I see. Thanks for letting me know [19:55:53] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 4 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [19:55:57] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.89, 3.35, 3.35 [19:55:59] Yeah, someone reported issues on #general already [19:56:13] ssh-agent: can you kick google [19:56:17] You under SEO [19:56:42] oh, that too :P [19:56:58] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 5.398 second response time [19:57:05] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 1.838 second response time [19:57:21] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 4.137 second response time [19:57:38] That's a known issue upstream, I know that Wikipedia was affected by it a few months ago and still is I think. 
I'll see what they did to fix it though but perhaps that's something they fixed in 1.38 [19:57:41] Yeah, I wasn't sure if the ping was to the more recent icinga alerts, or the Google SEO issue [19:57:48] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 8.278 second response time [19:57:52] RECOVERY - cp21 Stunnel Http for mw12 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 5.516 second response time [19:57:52] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [19:58:14] PROBLEM - mw10 Current Load on mw10 is WARNING: WARNING - load average: 6.82, 6.35, 5.96 [20:00:11] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.20, 5.89, 5.83 [20:01:00] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.27, 6.98, 6.49 [20:01:37] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.85, 7.10, 7.03 [20:03:00] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.79, 6.58, 6.41 [20:03:37] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.52, 7.76, 7.29 [20:07:35] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.26, 7.41, 7.28 [20:09:33] Agent can't help with icinga alerts much [20:10:16] .in 90mins set npm proxy [20:10:16] RhinosF1: Okay, will remind at 2022-01-01 - 21:40:16GMT [20:10:44] s/much/at all/ [20:11:00] All I can do is yell at the bot [20:11:16] ssh-agent: it might help [20:11:50] They are quite a few catch 22s in setting up a new DC [20:12:06] We can't set mediawiki up without mediawiki [20:12:26] So it will always error [20:13:02] New DCs are fun because it means no DB [20:13:33] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.64, 7.42, 7.20 [20:13:36] paladox: mem121 is still alerting [20:13:45] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 5 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 
2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb [20:13:51] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 7 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [20:14:09] It needs restart [20:15:33] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.32, 7.18, 7.13 [20:15:58] [miraheze/mw-config] Universal-Omega pushed 1 commit to Universal-Omega-patch-1 [+0/-0/±1] https://git.io/JSkz8 [20:15:59] [miraheze/mw-config] Universal-Omega 42a8396 - Setup CreateWiki on betaheze [20:16:01] [mw-config] Universal-Omega created branch Universal-Omega-patch-1 - https://git.io/vbvb3 [20:16:10] [mw-config] Universal-Omega opened pull request #4326: Setup CreateWiki on betaheze - https://git.io/JSkzR [20:16:35] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.25, 3.74, 3.49 [20:17:07] miraheze/mw-config - Universal-Omega the build passed. [20:17:43] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [20:17:51] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [20:18:31] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.89, 3.38, 3.39 [20:21:41] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 5 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::5ebc/cpweb [20:21:51] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::1b80/cpweb, 2607:5300:201:3100::5ebc/cpweb [20:22:35] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[20:23:45] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.23, 7.21, 6.65 [20:24:37] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.519 second response time [20:27:29] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.65, 7.59, 7.25 [20:29:18] RhinosF1: works now [20:29:28] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.98, 7.46, 7.23 [20:29:51] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [20:31:50] paladox: ty [20:32:28] [miraheze/puppet] paladox pushed 1 commit to master [+1/-0/±0] https://git.io/JSkVQ [20:32:29] [miraheze/puppet] paladox a95afac - Create es101.yaml [20:32:44] [miraheze/puppet] paladox pushed 1 commit to master [+1/-0/±0] https://git.io/JSkVF [20:32:45] [miraheze/puppet] paladox 030b51a - Create es111.yaml [20:32:48] Can we have the irc bots [20:32:52] [miraheze/puppet] paladox pushed 1 commit to master [+1/-0/±0] https://git.io/JSkVN [20:32:54] [miraheze/puppet] paladox 5e09ed2 - Create es121.yaml [20:33:28] PROBLEM - mw9 Current Load on mw9 is CRITICAL: CRITICAL - load average: 8.80, 7.73, 7.07 [20:33:51] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [20:35:25] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.02, 7.17, 6.96 [20:35:51] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [20:37:21] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.24, 6.65, 6.80 [20:37:34] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [20:43:24] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.18, 7.41, 7.12 [20:45:23] PROBLEM - mw12 Current Load on mw12 is 
WARNING: WARNING - load average: 5.91, 6.95, 6.99 [20:50:41] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.07, 7.97, 6.96 [20:51:02] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.70, 6.87, 6.83 [20:51:17] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:51:28] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 149.56.141.75/cpweb [20:51:43] PROBLEM - cp20 Stunnel Http for mw12 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:51:48] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:51:50] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::5ebc/cpweb [20:51:51] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:51:55] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:52:10] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [20:52:26] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [20:52:29] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[20:54:29] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 0.319 second response time [20:54:56] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.94, 6.61, 6.74 [20:55:26] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [20:55:50] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [20:58:39] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.22, 7.54, 7.25 [20:59:24] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 1 datacenter is down: 149.56.141.75/cpweb [20:59:50] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 3 datacenters are down: 51.195.220.68/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb [21:00:47] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.994 second response time [21:01:04] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.69, 5.17, 4.48 [21:01:17] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 5.34, 3.95, 3.42 [21:01:17] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.72, 7.10, 6.93 [21:01:50] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [21:01:56] RECOVERY - cp20 Stunnel Http for mw12 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 3.264 second response time [21:02:34] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.88, 6.67, 6.04 [21:03:00] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.52, 5.18, 4.56 [21:03:22] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [21:04:30] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 6.00, 6.25, 5.95 [21:04:56] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.81, 4.78, 4.50 [21:05:15] RECOVERY - mw12 Current Load on mw12 is OK: OK 
- load average: 6.38, 6.59, 6.76 [21:08:36] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.17, 5.85, 6.65 [21:12:53] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.14, 3.77, 3.70 [21:13:14] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.31, 7.59, 6.69 [21:14:33] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.27, 7.31, 7.00 [21:15:09] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.42, 7.09, 6.62 [21:16:32] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.55, 6.83, 6.85 [21:19:00] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.15, 6.16, 6.35 [21:19:48] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:19:50] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.257 second response time [21:19:54] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:20:53] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: HTTP CRITICAL: HTTP/1.1 502 Bad Gateway - 344 bytes in 0.933 second response time [21:20:54] PROBLEM - mw12 MediaWiki Rendering on mw12 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:20:59] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:21:10] PROBLEM - cp31 Stunnel Http for mw12 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[21:21:13] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 8 datacenters are down: 51.195.220.68/cpweb, 198.244.148.90/cpweb, 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [21:22:11] PROBLEM - cp31 Stunnel Http for mw8 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:22:22] PROBLEM - cp30 Stunnel Http for mw8 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [21:22:34] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.75, 2.92, 3.32 [21:22:49] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::1b80/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [21:22:51] RECOVERY - mw12 MediaWiki Rendering on mw12 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 0.387 second response time [21:22:53] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 4.065 second response time [21:23:00] PROBLEM - mw8 MediaWiki Rendering on mw8 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [21:23:02] PROBLEM - cp21 Stunnel Http for mw8 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[21:23:03] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 4.145 second response time [21:23:08] RECOVERY - cp31 Stunnel Http for mw12 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.334 second response time [21:23:13] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [21:23:48] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 1.572 second response time [21:23:56] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 1.190 second response time [21:23:59] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 20524 bytes in 0.889 second response time [21:24:11] RECOVERY - cp31 Stunnel Http for mw8 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.331 second response time [21:24:20] RECOVERY - cp30 Stunnel Http for mw8 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.331 second response time [21:24:54] RECOVERY - mw8 MediaWiki Rendering on mw8 is OK: HTTP OK: HTTP/1.1 200 OK - 20514 bytes in 0.450 second response time [21:24:57] RECOVERY - cp21 Stunnel Http for mw8 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14546 bytes in 0.013 second response time [21:25:08] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.93, 6.34, 5.99 [21:26:29] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.19, 7.49, 7.15 [21:26:43] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 7.61, 7.03, 6.65 [21:26:49] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [21:27:05] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 5.07, 5.92, 5.89 [21:28:28] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 5.03, 6.60, 6.87 [21:28:38] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.81, 6.77, 6.62 [21:29:23] PROBLEM - mon2 Current Load on mon2 is 
WARNING: WARNING - load average: 3.95, 3.38, 3.37 [21:30:28] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.25, 7.05, 6.99 [21:30:59] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.37, 6.83, 6.50 [21:32:27] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.84, 7.30, 7.09 [21:32:56] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.90, 6.53, 6.43 [21:37:08] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.17, 3.76, 3.55 [21:39:04] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.64, 3.62, 3.51 [21:40:16] RhinosF1: set npm proxy [21:40:24] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.12, 6.16, 6.67 [21:41:39] alerting : [FIRING:1] (PHP-FPM Worker Usage High mediawiki) https://grafana.miraheze.org/d/dsHv5-4nz/mediawiki [21:44:22] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.18, 7.53, 7.13 [21:46:22] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 4.98, 6.50, 6.80 [21:54:32] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.53, 3.12, 3.37 [21:56:17] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.20, 6.85, 6.66 [21:58:17] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.93, 6.44, 6.53 [21:58:27] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.73, 3.97, 3.66 [22:00:23] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.52, 3.77, 3.62 [22:02:14] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.16, 7.08, 6.79 [22:04:14] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.68, 6.61, 6.66 [22:04:15] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.02, 3.61, 3.58 [22:10:52] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.83, 6.79, 6.29 [22:11:02] PROBLEM - mw9 Current Load on mw9 is CRITICAL: 
CRITICAL - load average: 8.38, 6.81, 6.33 [22:12:59] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.56, 6.22, 6.17 [22:13:56] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.22, 3.93, 3.86 [22:15:56] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.34, 4.00, 3.89 [22:16:50] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.68, 6.61, 6.43 [22:19:56] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.94, 3.77, 3.85 [22:20:07] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.80, 7.13, 6.79 [22:20:48] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 7.89, 7.17, 6.69 [22:21:56] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.25, 3.97, 3.91 [22:22:44] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 7.37, 6.58, 6.24 [22:22:48] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.07, 7.34, 6.80 [22:23:56] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.55, 3.66, 3.82 [22:24:23] [puppet] Universal-Omega opened pull request #2231: ircrcbot: bind to IPV6 - https://git.io/JSkbY [22:24:30] paladox: ^ [22:24:47] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 6.90, 7.04, 6.75 [22:24:51] Not 100% sure it'll work but might. [22:25:07] yeh though i think it'll break the old cluster as that's using ipv4 [22:25:13] though that's not the cause for the failure [22:25:15] i found why [22:25:19] Oh. [22:25:23] some how it's trying to use cp20 ip???
[22:25:32] connect(7, {sa_family=AF_INET, sin_port=htons(6697), sin_addr=inet_addr("51.195.220.68")}, 16) = -1 ENETUNREACH (Network is unreachable) [22:26:46] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.26, 7.62, 7.01 [22:28:04] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.89, 6.70, 6.76 [22:28:34] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.49, 6.47, 6.38 [22:28:46] PROBLEM - mw12 Current Load on mw12 is WARNING: WARNING - load average: 5.88, 6.98, 6.85 [22:29:50] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [22:30:43] paladox: where would it get that from [22:30:45] RECOVERY - mw12 Current Load on mw12 is OK: OK - load average: 5.85, 6.54, 6.70 [22:30:52] i have no idea [22:31:27] i mean i had an issue where by curlling google tried to use miraheze cp... but i fixed that by switching the ipv4 address to ipv6 address in resolv.conf [22:31:39] so i don't see how this is still happening. [22:31:49] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 4.151 second response time [22:31:53] (i mean curling libera.chat works and without issue) [22:31:56] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 3.11, 2.93, 3.39 [22:32:35] Hang on [22:33:02] paladox: is something pointing at bastion. and not bast. [22:33:14] hmm [22:33:18] i can check [22:33:32] Everything should have been https://github.com/miraheze/puppet/commit/06f8641dae3b663b182679c303d4b6698ed179dd [22:33:33] [url] bastion.mh -> bast.mh · miraheze/puppet@06f8641 · GitHub | github.com [22:33:34] it shouldn't be using bast generally [22:33:43] For the proxy [22:33:48] https://github.com/miraheze/puppet/search?q=bastion.miraheze.org [22:33:48] [url] Search · bastion.miraheze.org · GitHub | github.com [22:33:49] nope [22:33:58] paladox: something manual? 
[22:34:08] Is there anything you've changed live [22:34:12] A cache? [22:34:29] oh [22:34:32] now it's doing: [22:34:32] connect(7, {sa_family=AF_INET, sin_port=htons(6697), sin_addr=inet_addr("82.96.96.60")}, 16) = -1 ENETUNREACH (Network is unreachable) [22:35:35] oh maybe it falls back to miraheze when it cannot resolve? [22:35:39] or something [22:35:45] paladox: I thought I told you to use ipv6 [22:35:50] That's a libera ip [22:36:04] Change the destination to the v6 rotation [22:36:13] I did... it's just resetted it's self to use irc.libera.org... [22:36:40] Ok [22:38:19] https://www.irccloud.com/pastebin/eWhMw0fX/ [22:38:26] RhinosF1: ^ [22:39:35] https://www.irccloud.com/pastebin/yE2oJo3T/ [22:40:42] paladox: why [22:40:43] why [22:40:44] why [22:40:48] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.90, 3.80, 3.56 [22:40:59] i don't know [22:41:20] paladox: please will make /var/www www-data owned on mw* [22:42:18] is it not already? Because i checked two and it's owned by www-data? [22:42:28] https://www.irccloud.com/pastebin/GfjE6oKQ/ [22:42:39] paladox: task [22:42:51] ok now that's helpful... [22:43:23] also test paladox [22:43:25] RhinosF1: done [22:43:57] done [22:44:39] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 3 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 2607:5300:201:3100::929a/cpweb [22:44:53] PROBLEM - cp30 Stunnel Http for mw12 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
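The two connect() traces above both show an AF_INET socket failing with ENETUNREACH, which is what one would expect on a host without an IPv4 route. As a rough illustration only (this is not the ircecho or ircrcbot code, and the function names are made up for this sketch), a Python client can avoid that by discarding the IPv4 results from getaddrinfo() before it connects:

```python
import socket


def ipv6_only(addrinfos):
    """Keep only the AF_INET6 entries from a getaddrinfo()-style list."""
    return [info for info in addrinfos if info[0] == socket.AF_INET6]


def connect_ipv6(host, port, timeout=10.0):
    """Hypothetical helper: connect to host over IPv6 only.

    Resolves the name, drops every IPv4 result, and tries the remaining
    IPv6 addresses in order until one accepts the connection.
    """
    infos = ipv6_only(socket.getaddrinfo(host, port, type=socket.SOCK_STREAM))
    if not infos:
        raise OSError(f"{host} has no IPv6 address")
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in infos:
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # Remember the failure and try the next address.
            last_error = exc
            sock.close()
    raise last_error
```

With a helper like this, a name that only resolves to IPv4 addresses fails early with a clear message instead of reaching connect() and returning "Network is unreachable".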
[22:45:04] $log set npm proxy to bast.mh.o on test & mwtask in SCSVG [22:45:29] paladox: hopefully that'll fix npm [22:45:37] ok [22:46:38] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [22:46:52] RECOVERY - cp30 Stunnel Http for mw12 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.332 second response time [22:46:59] screw npm [22:48:22] yeh it won't work because github is ipv4 only [22:48:27] and it wants to try and use ssh [22:48:35] you'll have to pre-package it [22:51:17] paladox: maybe not [22:52:20] oh... [22:52:23] i got it working [22:52:29] when i used the ipv6 address [22:52:31] directly [22:53:05] we can use environment variables to set http(s)_proxy i think [22:53:20] paladox: i used it directly and set the env on task [22:54:21] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.46, 3.86, 3.84 [22:55:30] paladox: re your second pastebin above that is what my PR should do, make it use AF_INET6 like that, if I did it right anyways, so it would get the IPV6 address not IPV4 one. [22:55:31] RhinosF1: oh?? That worked??? [22:55:49] CosmicAlpha: yeh but the problem is we carn't change the packages :( [22:56:17] PROBLEM - mon2 Current Load on mon2 is CRITICAL: CRITICAL - load average: 4.86, 4.17, 3.95 [22:56:47] paladox: no but i got new errors [22:56:50] what is 31.24.105.137 [22:58:09] oh that looks like our cloud ips [22:58:44] go die in a hole npm [22:59:36] i set bast101's ipv6 [22:59:37] paladox: what do you mean can't change packages? If you mean will break current, I think I have a solution for that. 
[22:59:52] CosmicAlpha: npm is the world's most evil software [22:59:58] i think i fixed i [23:00:02] !log [@test3] starting deploy of {'l10nupdate': True} to skip [23:00:03] i mean change what ever the package does [23:00:08] !log [@mw11] starting deploy of {'l10nupdate': True} to ovlon [23:00:09] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 2.61, 3.73, 3.85 [23:00:10] RhinosF1: how? [23:00:14] also is that bastions ip [23:00:21] paladox: yes [23:00:31] paladox: give me a minute [23:00:35] ok [23:00:44] i'm going to use ipv6 directly lol [23:01:01] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [23:01:15] paladox: yeah it cached my env variable [23:01:23] oh [23:01:27] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [23:01:51] oh wait i know [23:02:14] paladox: I still don't know what you mean. We aren't changing any packages, except the ircrcbot python script in puppet unless that's what you mean? [23:02:51] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.40, 6.62, 6.45 [23:03:25] take 50 [23:04:34] grr seems python3 doesn't like resolving ipv6 addresses [23:04:47] paladox: Invalid URL: http://:8080/[2604:180:f3::382].miraheze.org [23:04:51] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.12, 6.03, 6.25 [23:04:56] npm is messed up software [23:04:58] huh [23:05:15] where is it doing that [23:05:29] paladox: in my error log [23:05:34] from take 50 of this [23:07:18] ffs [23:07:28] i inputed ipv6 manually for ircecho... 
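The "Invalid URL: http://:8080/[2604:180:f3::382].miraheze.org" error above looks like an IPv6 literal spliced into a template that expected a hostname. In a URL, an IPv6 literal has to sit in square brackets in the host position, before the port. A minimal Python sketch of building such a proxy URL follows; the helper names are hypothetical, and the assumption that the consuming tool reads the http_proxy and https_proxy environment variables comes from the chat, not from any tested setup:

```python
import ipaddress
from urllib.parse import urlsplit


def proxy_url(host, port, scheme="http"):
    """Build scheme://host:port, bracketing the host if it is an IPv6 literal."""
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        # Not an IP literal at all, so treat it as a hostname.
        is_v6 = False
    if is_v6:
        host = f"[{host}]"
    return f"{scheme}://{host}:{port}"


def proxy_env(url):
    """Hypothetical helper: environment variables pointing a tool at the proxy."""
    return {"http_proxy": url, "https_proxy": url}


url = proxy_url("2604:180:f3::382", 8080)
parts = urlsplit(url)
print(url, parts.hostname, parts.port)
```

Parsing the result back with urlsplit() is a cheap check that the host and port ended up where a URL parser expects them.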
[23:07:51] paladox: i'm getting there [23:07:55] error by error [23:08:26] PROBLEM - mw13 Current Load on mw13 is CRITICAL: CRITICAL - load average: 8.46, 7.10, 6.27 [23:09:53] paladox: is npmjs allowed by the proxy [23:10:21] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.41, 6.83, 6.28 [23:10:25] I'm not sure [23:10:31] isn't it just using port 80/443 [23:10:48] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.39, 6.71, 6.47 [23:10:49] https://github.com/miraheze/puppet/blob/master/modules/squid/files/squid.conf#L9 are the ports we allow [23:10:50] [url] puppet/squid.conf at master · miraheze/puppet · GitHub | github.com [23:10:53] we can add more if needed [23:12:47] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 9.66, 8.02, 7.00 [23:14:01] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.66, 5.24, 4.46 [23:14:46] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 4 datacenters are down: 51.195.220.68/cpweb, 2001:41d0:801:2000::4c25/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::5ebc/cpweb [23:14:46] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.78, 7.96, 7.10 [23:15:23] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 2 datacenters are down: 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb [23:16:45] paladox: i'm now onto request to https://registry.npmjs.org/@prettier%2fplugin-xml failed, reason: getaddrinfo ENOTFOUND [2604:180:f3::382] [23:17:12] why is it using bacula2 ip [23:17:22] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [23:17:28] paladox: no idea [23:17:57] oh i dumb [23:18:14] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 5.52, 6.63, 6.47 [23:19:28] paladox: now request to https://codeload.github.com/femiwiki/OOUIFemiwikiTheme/tar.gz/master failed, reason: getaddrinfo ENOTFOUND [2a10:6740::6:101] [23:19:33] PROBLEM - mw10 Current Load on 
mw10 is WARNING: WARNING - load average: 6.85, 6.20, 5.80 [23:19:39] yeh github is ipv4 only [23:19:49] paladox: it's using the proxy [23:19:56] 2a10:6740::6:101 is bast [23:19:56] RECOVERY - mon2 Current Load on mon2 is OK: OK - load average: 2.95, 3.07, 3.37 [23:19:58] does the proxy work with port 22 [23:20:26] no [23:20:44] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.22, 6.62, 6.79 [23:21:20] PROBLEM - ns2 GDNSD Datacenters on ns2 is CRITICAL: CRITICAL - 4 datacenters are down: 51.195.220.68/cpweb, 149.56.140.43/cpweb, 149.56.141.75/cpweb, 2607:5300:201:3100::5ebc/cpweb [23:21:25] PROBLEM - cp21 Stunnel Http for mw9 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [23:21:26] PROBLEM - cp31 Stunnel Http for mw9 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [23:21:32] RECOVERY - mw10 Current Load on mw10 is OK: OK - load average: 5.70, 5.87, 5.72 [23:21:47] PROBLEM - mw9 MediaWiki Rendering on mw9 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:21:49] PROBLEM - cp20 Stunnel Http for mw9 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [23:22:23] PROBLEM - cp30 Stunnel Http for mw9 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[23:23:05] paladox: it's http https://github.com/femiwiki/FemiwikiSkin/blob/main/package.json#L13 [23:23:05] [url] FemiwikiSkin/package.json at main · femiwiki/FemiwikiSkin · GitHub | github.com [23:23:11] yeh [23:23:12] why is it forcing ssh [23:23:19] i have no idea, it shouldn't [23:23:26] RECOVERY - cp21 Stunnel Http for mw9 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 4.504 second response time [23:23:27] there is no reason it would fail [23:23:28] RECOVERY - cp31 Stunnel Http for mw9 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 4.388 second response time [23:23:34] it's using the proxy [23:23:46] RECOVERY - cp20 Stunnel Http for mw9 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14564 bytes in 3.185 second response time [23:23:48] RECOVERY - mw9 MediaWiki Rendering on mw9 is OK: HTTP OK: HTTP/1.1 200 OK - 20524 bytes in 3.693 second response time [23:24:29] RECOVERY - cp30 Stunnel Http for mw9 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14556 bytes in 5.748 second response time [23:24:52] PROBLEM - mw11 Current Load on mw11 is WARNING: WARNING - load average: 6.88, 6.70, 6.29 [23:25:18] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [23:25:31] PROBLEM - test3 APT on test3 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [23:25:38] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.14, 5.27, 4.92 [23:26:37] PROBLEM - mw9 Current Load on mw9 is WARNING: WARNING - load average: 6.98, 6.75, 6.39 [23:26:43] PROBLEM - test3 Current Load on test3 is CRITICAL: CRITICAL - load average: 6.85, 3.58, 1.78 [23:26:48] RhinosF1: we don't support port 22 in the proxy [23:27:12] PROBLEM - cp31 Stunnel Http for mw13 on cp31 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. 
[23:27:18] PROBLEM - mw13 MediaWiki Rendering on mw13 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [23:27:20] PROBLEM - cp20 Stunnel Http for mw13 on cp20 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [23:27:24] PROBLEM - cloud5 Current Load on cloud5 is WARNING: WARNING - load average: 22.83, 21.42, 19.06 [23:27:25] paladox: why would it use port 22 [23:27:30] PROBLEM - cp21 Stunnel Http for mw13 on cp21 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [23:27:34] RECOVERY - gluster3 Current Load on gluster3 is OK: OK - load average: 3.96, 4.82, 4.80 [23:27:38] because it's ssh.. [23:27:43] paladox: why [23:27:49] there is no reason [23:27:51] i don't know ask the user [23:27:57] they put it to use ssh [23:27:59] PROBLEM - mw13 Current Load on mw13 is WARNING: WARNING - load average: 6.91, 7.46, 6.94 [23:28:32] even if you did get it to work, you would need to login in [23:28:33] RECOVERY - mw9 Current Load on mw9 is OK: OK - load average: 5.56, 6.48, 6.35 [23:28:34] paladox: the errors don't talk about ssh [23:28:36] you carn't use it anonymously [23:28:45] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [23:28:57] the error is it cannot connect because github uses ipv4. [23:28:58] PROBLEM - cp30 Stunnel Http for mw13 on cp30 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 10 seconds. [23:29:13] it's using our proxy [23:29:22] PROBLEM - cloud5 Current Load on cloud5 is CRITICAL: CRITICAL - load average: 26.45, 23.26, 20.02 [23:29:55] RECOVERY - mw13 Current Load on mw13 is OK: OK - load average: 4.82, 6.68, 6.73 [23:29:56] JohnLewis: around? 
[23:30:44] RECOVERY - mw11 Current Load on mw11 is OK: OK - load average: 6.53, 6.80, 6.47 [23:31:20] PROBLEM - cloud5 Current Load on cloud5 is WARNING: WARNING - load average: 21.85, 22.98, 20.32 [23:31:43] oh hold on nvm [23:31:47] RECOVERY - test3 APT on test3 is OK: APT OK: 19 packages available for upgrade (0 critical updates). [23:32:43] RECOVERY - test3 Current Load on test3 is OK: OK - load average: 1.75, 3.21, 2.28 [23:32:45] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 5 datacenters are down: 2001:41d0:801:2000::4c25/cpweb, 2001:41d0:801:2000::1b80/cpweb, 149.56.140.43/cpweb, 2607:5300:201:3100::929a/cpweb, 2607:5300:201:3100::5ebc/cpweb [23:33:14] RECOVERY - cp30 Stunnel Http for mw13 on cp30 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 9.639 second response time [23:33:27] the bot won't work for fuck sake [23:33:30] but the other would [23:33:33] !log [@test3] finished deploy of {'l10nupdate': True} to skip - SUCCESS in 2010s [23:33:33] *one [23:33:53] RECOVERY - cp21 Stunnel Http for mw13 on cp21 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 8.469 second response time [23:33:58] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [23:34:04] !log [@mw11] finished deploy of {'l10nupdate': True} to ovlon - SUCCESS in 2035s [23:34:22] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [23:34:24] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 5.73, 5.54, 5.14 [23:35:26] RECOVERY - mw13 MediaWiki Rendering on mw13 is OK: HTTP OK: HTTP/1.1 200 OK - 20526 bytes in 2.299 second response time [23:35:33] RECOVERY - cp31 Stunnel Http for mw13 on cp31 is OK: HTTP OK: HTTP/1.1 200 OK - 14565 bytes in 0.318 second response time [23:35:36] RECOVERY - cp20 Stunnel Http for mw13 on cp20 is OK: HTTP OK: HTTP/1.1 200 OK - 14557 bytes in 0.017 second response time [23:36:20] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.94, 6.01, 
5.35 [23:36:45] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [23:37:15] RECOVERY - cloud5 Current Load on cloud5 is OK: OK - load average: 17.29, 19.57, 19.77 [23:38:45] paladox: what bot? [23:38:51] ircecho [23:39:12] paladox: what about the proxy for npm [23:39:26] you said hold on then nothing after [23:39:40] i then said nvm [23:39:54] because i realised i was wrong [23:39:59] that it was http not using ssh [23:40:09] yes [23:40:12] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 4.55, 5.28, 5.21 [23:42:08] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 7.85, 6.34, 5.60 [23:44:45] PROBLEM - ns1 GDNSD Datacenters on ns1 is CRITICAL: CRITICAL - 1 datacenter is down: 2607:5300:201:3100::5ebc/cpweb [23:46:00] PROBLEM - gluster3 Current Load on gluster3 is WARNING: WARNING - load average: 4.67, 5.77, 5.57 [23:46:34] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 7.58, 6.52, 6.25 [23:46:45] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [23:50:33] PROBLEM - mw8 Current Load on mw8 is CRITICAL: CRITICAL - load average: 8.19, 7.30, 6.61 [23:51:18] PROBLEM - cloud4 Current Load on cloud4 is WARNING: WARNING - load average: 20.83, 20.92, 19.92 [23:52:32] PROBLEM - mw8 Current Load on mw8 is WARNING: WARNING - load average: 6.24, 6.94, 6.57 [23:53:13] RECOVERY - cloud4 Current Load on cloud4 is OK: OK - load average: 17.61, 20.14, 19.79 [23:54:29] PROBLEM - mon2 Current Load on mon2 is WARNING: WARNING - load average: 3.55, 3.53, 3.32 [23:54:31] RECOVERY - mw8 Current Load on mw8 is OK: OK - load average: 5.45, 6.47, 6.44 [23:54:38] [puppet] Universal-Omega synchronize pull request #2231: ircrcbot: bind to IPV6 - https://git.io/JSkbY [23:54:45] [puppet] Universal-Omega synchronize pull request #2231: ircrcbot: bind to IPV6 - https://git.io/JSkbY [23:55:04] bind(5, {sa_family=AF_INET, sin_port=htons(0), 
sin_addr=inet_addr("0.0.0.0")}, 16) = 0 [23:55:10] [puppet] Universal-Omega edited pull request #2231: ircrcbot: support binding to IPV6 - https://git.io/JSkbY [23:55:40] PROBLEM - gluster3 Current Load on gluster3 is CRITICAL: CRITICAL - load average: 6.47, 5.66, 5.55 [23:56:14] PROBLEM - mw12 Current Load on mw12 is CRITICAL: CRITICAL - load average: 8.04, 7.22, 6.78 [23:57:06] [puppet] Universal-Omega synchronize pull request #2231: ircrcbot: support binding to IPV6 - https://git.io/JSkbY [23:59:18] paladox: if that PR will work at all it should support both infrastructures now, provided I got that right. If not feel free to close I guess. [23:59:32] thanks! [23:59:45] I'm trying to figure out why ircecho won't work with the ipv6 address [23:59:50] would you have any ideas?
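The bind() trace above shows the socket being created as AF_INET and bound to 0.0.0.0 before any connection attempt, so it can never reach an IPv6 destination whatever address is configured. One way to support both the IPv4 and IPv6 infrastructures is to derive the socket family from the destination address. The sketch below is an assumption about how that could look in Python; it is not taken from PR #2231 or from ircecho, and all function names are hypothetical:

```python
import ipaddress
import socket


def family_for(address):
    """Return AF_INET6 for an IPv6 literal and AF_INET for an IPv4 literal."""
    version = ipaddress.ip_address(address).version
    return socket.AF_INET6 if version == 6 else socket.AF_INET


def wildcard_for(family):
    """Return the any-address bind tuple that matches the socket family."""
    return ("::", 0) if family == socket.AF_INET6 else ("0.0.0.0", 0)


def make_socket(address):
    """Hypothetical helper: create and bind a TCP socket suited to address."""
    family = family_for(address)
    sock = socket.socket(family, socket.SOCK_STREAM)
    # Bind to the wildcard of the same family so a later connect() can succeed.
    sock.bind(wildcard_for(family))
    return sock
```

Called with the bast address quoted earlier (2a10:6740::6:101), family_for() selects AF_INET6, while the old cluster's IPv4 addresses keep using AF_INET.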