[00:00:17] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [00:00:29] PROBLEM - cp41 Varnish Backends on cp41 is CRITICAL: 1 backends are down. mw181 [00:04:25] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 5.307 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [00:06:40] PROBLEM - cp26 Varnish Backends on cp26 is CRITICAL: 1 backends are down. mw181 [00:07:26] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [00:08:15] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [00:08:31] PROBLEM - cp51 Varnish Backends on cp51 is CRITICAL: 1 backends are down. mw181 [00:08:40] RECOVERY - cp26 Varnish Backends on cp26 is OK: All 19 backends are healthy [00:10:20] RECOVERY - cp41 Varnish Backends on cp41 is OK: All 19 backends are healthy [00:11:20] PROBLEM - cp37 Varnish Backends on cp37 is CRITICAL: 3 backends are down. mw151 mw161 mw162 [00:12:08] PROBLEM - cp36 Varnish Backends on cp36 is CRITICAL: 2 backends are down. mw151 mw161 [00:12:26] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [00:13:22] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 22 minutes ago with 0 failures [00:13:35] PROBLEM - cp27 Varnish Backends on cp27 is CRITICAL: 1 backends are down. 
mw181 [00:13:56] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [00:14:15] PROBLEM - cp41 Varnish Backends on cp41 is CRITICAL: 2 backends are down. mw171 mw182 [00:14:27] RECOVERY - cp51 Varnish Backends on cp51 is OK: All 19 backends are healthy [00:15:16] RECOVERY - cp37 Varnish Backends on cp37 is OK: All 19 backends are healthy [00:15:34] RECOVERY - cp27 Varnish Backends on cp27 is OK: All 19 backends are healthy [00:16:00] RECOVERY - cp36 Varnish Backends on cp36 is OK: All 19 backends are healthy [00:16:14] RECOVERY - cp41 Varnish Backends on cp41 is OK: All 19 backends are healthy [00:20:14] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [00:22:12] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.156 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [00:22:17] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 31 minutes ago with 0 failures [00:22:26] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [00:27:01] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [00:27:26] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [00:29:00] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [00:32:59] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 22.05, 22.34, 23.87 [00:39:50] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [00:41:47] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.221 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [00:50:59] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 15.29, 17.45, 20.06 [00:56:46] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [01:00:49] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.070 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [01:07:10] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [01:07:43] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:09:07] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.074 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [01:12:26] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:13:32] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [01:14:02] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [01:14:59] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 22.61, 20.98, 19.42 [01:16:59] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 19.80, 20.18, 19.30 [01:17:26] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:17:35] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.081 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [01:22:59] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 25.14, 22.80, 20.65 [01:24:59] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 20.90, 21.84, 20.55 [01:26:59] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 24.16, 21.93, 20.68 [01:27:03] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [01:28:59] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.073 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [01:28:59] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 22.90, 21.91, 20.80 [01:32:59] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 17.87, 19.99, 20.30 [01:34:02] [puppet] The-Voidwalker closed pull request #3900: port some cloudflare rules to varnish nginx - https://github.com/miraheze/puppet/pull/3900 [01:34:04] [puppet] The-Voidwalker pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/puppet/compare/25f7486bbe1d...5ee29f5218ec [01:34:05] [puppet] The-Voidwalker 5ee29f5 - port some cloudflare rules to varnish nginx (#3900) [01:34:07] [puppet] The-Voidwalker deleted branch The-Voidwalker-patch-1 - https://github.com/miraheze/puppet [01:34:09] [puppet] The-Voidwalker deleted branch The-Voidwalker-patch-1 [01:37:41] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [01:38:28] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [01:39:37] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.085 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [01:40:10] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 14.48, 18.64, 23.90 [01:40:13] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 10.90, 17.84, 23.08 [01:40:13] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 11.44, 18.90, 23.66 [01:40:22] PROBLEM - mw161 Current Load on mw161 is WARNING: LOAD WARNING - total load average: 12.06, 17.27, 22.90 [01:41:10] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 12.43, 17.91, 23.35 [01:41:35] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 11.80, 16.89, 22.67 [01:42:35] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [01:44:22] RECOVERY - mw161 Current Load on mw161 is OK: LOAD OK - total load average: 11.53, 13.54, 20.04 [01:46:13] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 14.63, 15.08, 20.00 [01:47:08] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [01:47:35] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 14.52, 14.90, 19.81 [01:48:10] RECOVERY - mw172 Current Load on mw172 is OK: LOAD OK - total load average: 15.31, 15.65, 20.12 [01:48:13] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 11.09, 13.98, 19.27 [01:49:05] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.064 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [01:51:10] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 15.61, 16.52, 19.92 [01:52:26] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:04:10] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.04, 3.27, 3.93 [02:07:26] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:08:12] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.85, 3.85, 3.98 [02:09:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 14.31, 19.97, 23.45 [02:10:11] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.72, 3.15, 3.69 [02:13:35] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.90, 22.41, 23.63 [02:16:10] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.18, 3.51, 3.66 [02:18:38] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [02:20:45] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 9.742 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [02:22:26] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:22:28] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [02:25:10] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [02:27:07] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.076 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [02:27:24] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 49 seconds ago with 0 failures [02:27:26] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:30:10] PROBLEM - mw172 Current Load on mw172 is CRITICAL: LOAD CRITICAL - total load average: 27.25, 19.79, 16.11 [02:30:13] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 27.96, 19.85, 15.76 [02:30:13] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 27.53, 19.94, 16.24 [02:30:59] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 20.50, 18.62, 14.19 [02:32:04] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 29.31, 23.75, 19.38 [02:32:13] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 23.47, 21.11, 16.74 [02:32:22] PROBLEM - mw161 Current Load on mw161 is CRITICAL: LOAD CRITICAL - total load average: 24.69, 
20.57, 16.19 [02:32:30] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 27.90, 22.53, 17.60 [02:32:59] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 14.12, 16.94, 14.12 [02:34:13] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 18.14, 19.72, 16.76 [02:34:13] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 20.77, 20.29, 17.20 [02:34:22] PROBLEM - mw161 Current Load on mw161 is WARNING: LOAD WARNING - total load average: 21.03, 19.83, 16.41 [02:36:13] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 25.74, 22.04, 18.20 [02:36:22] RECOVERY - mw161 Current Load on mw161 is OK: LOAD OK - total load average: 16.58, 18.90, 16.52 [02:36:27] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 17.26, 21.17, 18.23 [02:38:10] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 19.05, 21.57, 18.82 [02:38:13] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 23.01, 22.67, 18.93 [02:40:13] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 23.60, 21.84, 18.57 [02:40:24] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 25.11, 22.95, 19.58 [02:42:22] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 20.17, 22.23, 19.74 [02:42:26] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:44:10] PROBLEM - mw172 Current Load on mw172 is CRITICAL: LOAD CRITICAL - total load average: 24.71, 22.47, 20.05 [02:44:13] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 24.35, 23.15, 20.35 [02:44:21] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 24.10, 23.39, 20.50 [02:45:15] PROBLEM - mw161 Current Load on mw161 is WARNING: LOAD WARNING - total load average: 23.37, 21.54, 18.76 [02:46:10] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 21.26, 22.01, 20.18 [02:46:13] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 24.11, 22.39, 19.80 [02:46:19] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 22.07, 22.53, 20.51 [02:48:10] RECOVERY - mw172 Current Load on mw172 is OK: LOAD OK - total load average: 17.97, 20.20, 19.72 [02:48:13] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 18.96, 20.90, 19.56 [02:48:13] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 22.34, 23.04, 20.99 [02:49:11] RECOVERY - mw161 Current Load on mw161 is OK: LOAD OK - total load average: 17.68, 19.95, 18.75 [02:50:13] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 24.76, 22.48, 20.30 [02:52:10] PROBLEM - mw172 Current Load on mw172 is WARNING: LOAD WARNING - total load average: 17.70, 21.03, 20.36 [02:52:13] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 21.37, 22.46, 20.59 [02:52:14] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 15.69, 20.38, 20.37 [02:52:26] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:54:10] RECOVERY - mw172 Current Load on mw172 is OK: LOAD OK - total load average: 20.13, 20.34, 20.16 [02:54:13] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 17.66, 20.34, 20.02 [02:55:39] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.64, 22.42, 23.39 [02:56:13] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 17.58, 18.95, 20.02 [03:07:27] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 11.18, 14.92, 19.25 [03:08:10] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.59, 3.01, 3.82 [03:09:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 17.76, 20.52, 23.90 [03:12:11] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.67, 3.51, 3.83 [03:12:26] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:14:10] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.96, 3.00, 3.59 [03:18:11] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.98, 2.65, 3.31 [03:21:35] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.76, 21.71, 21.94 [03:28:11] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.78, 4.90, 4.01 [03:31:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.10, 23.44, 23.33 [03:32:11] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.63, 3.56, 3.71 [03:34:10] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD 
CRITICAL - total load average: 4.04, 4.05, 3.88 [03:38:34] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [03:40:30] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.073 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [03:46:23] PROBLEM - prometheus151 APT on prometheus151 is WARNING: APT WARNING: 0 packages available for upgrade (0 critical updates). warnings detected, errors detected. [03:49:19] PROBLEM - prometheus151 APT on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [03:49:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:49:35] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 18.49, 18.62, 20.12 [03:51:04] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:53:02] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [03:53:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.26, 20.32, 20.59 [03:54:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:59:35] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 13.91, 17.73, 19.59 [04:04:10] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.69, 3.11, 3.81 [04:08:10] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 3.35, 2.58, 3.38 [04:12:11] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 7.17, 4.72, 4.06 [04:12:30] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.40, 20.93, 19.77 [04:12:50] 
PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [04:13:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:14:29] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.39, 22.51, 20.48 [04:14:49] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 22 minutes ago with 0 failures [04:18:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:18:35] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [04:20:25] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 17.97, 23.15, 21.73 [04:20:39] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 8.227 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [04:27:45] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:30:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:31:52] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [04:36:16] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 15.73, 18.62, 20.10 [04:36:31] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [04:38:27] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.062 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [04:40:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:42:25] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:42:33] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [04:44:10] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.92, 20.66, 20.05 [04:46:24] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [04:46:40] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [04:47:25] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:48:07] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.90, 22.21, 20.95 [04:52:25] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:54:04] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 20.17, 20.12, 20.35 [04:59:59] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.84, 22.24, 21.07 [05:02:11] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.01, 3.29, 3.96 [05:04:11] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 7.35, 4.67, 4.37 [05:05:25] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is 
CRITICAL: CRITICAL - Plugin timed out while executing system call [05:05:55] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.68, 22.73, 21.98 [05:07:22] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.098 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [05:07:25] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:07:54] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.51, 23.74, 22.45 [05:11:51] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.47, 23.21, 22.56 [05:13:50] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.49, 24.91, 23.29 [05:14:49] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [05:15:49] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.87, 23.58, 23.01 [05:17:48] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.25, 24.01, 23.20 [05:23:20] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [05:23:45] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.90, 23.14, 23.39 [05:25:16] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.070 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [05:27:42] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.40, 23.98, 23.65 [05:29:41] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.07, 23.48, 23.53 [05:31:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:31:40] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 23.88, 24.25, 23.84 [05:33:38] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.36, 23.00, 23.45 [05:41:00] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:42:11] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.31, 3.17, 3.78 [05:46:12] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.43, 3.54, 3.76 [05:47:25] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:48:54] PROBLEM - archive.a2b2.org - LetsEncrypt on sslhost is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket [05:51:00] PROBLEM - archive.a2b2.org - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for archive.a2b2.org could not be found [05:51:35] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 16.05, 18.40, 20.09 [05:52:25] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:55:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 23.32, 21.56, 20.99 [05:57:38] PROBLEM - 
prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [06:02:34] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 6 minutes ago with 0 failures [06:05:35] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.53, 22.42, 21.38 [06:06:16] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:07:25] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:08:13] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.069 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:12:25] [Grafana] RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:25:10] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.34, 19.93, 17.65 [06:27:10] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 16.21, 18.19, 17.28 [06:28:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:30:13] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:31:58] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [06:33:57] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 12 minutes ago with 0 failures [06:34:15] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.110 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:35:29] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - franchise.franchising.org.ua All nameservers failed to answer the query. [06:38:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:39:04] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:49:04] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:54:04] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:54:09] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:54:35] [mw-config] BlankEclair opened pull request #5646: T12515: Reduce file upload limit to 2 MB for irisstationwiki - https://github.com/miraheze/mw-config/pull/5646 [06:55:37] miraheze/mw-config - BlankEclair the build passed. [06:58:12] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.233 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:59:04] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:59:46] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:02:37] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [07:03:53] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [07:04:04] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:04:33] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.091 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [07:05:07] PROBLEM - franchise.franchising.org.ua - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for franchise.franchising.org.ua could not be found [07:12:52] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [07:13:25] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:14:48] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.062 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [07:15:30] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [07:16:47] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[07:19:04] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:21:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:23:38] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [07:25:34] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.073 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [07:26:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:32:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:33:11] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [07:35:19] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [07:35:50] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:37:14] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.077 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [07:37:19] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 22 minutes ago with 0 failures [07:37:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:41:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.18, 21.96, 23.54 [07:42:05] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [07:43:35] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.62, 23.10, 23.74 [07:45:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.26, 22.28, 23.38 [07:45:55] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [07:49:35] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.72, 23.77, 23.77 [07:50:27] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [07:51:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.14, 22.47, 23.32 [07:53:35] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.15, 24.15, 23.85 [07:53:42] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:55:10] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. 
[07:56:33] [mw-config] Universal-Omega closed pull request #5646: T12515: Reduce file upload limit to 2 MB for irisstationwiki - https://github.com/miraheze/mw-config/pull/5646 [07:56:36] [mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/mw-config/compare/f7a3816f003e...baa4202c3c7d [07:56:37] [mw-config] BlankEclair baa4202 - T12515: Reduce file upload limit to 2 MB for irisstationwiki (#5646) [07:56:42] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [07:57:09] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 12 minutes ago with 0 failures [07:57:30] miraheze/mw-config - Universal-Omega the build passed. [07:58:42] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:59:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.35, 23.87, 23.91 [08:00:54] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [08:01:35] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 27.21, 24.84, 24.24 [08:02:51] !log [@mwtask181] starting deploy of {'config': True} to all [08:02:57] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [08:03:04] !log [@mwtask181] finished deploy of {'config': True} to all - SUCCESS in 13s [08:03:12] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [08:03:20] [Grafana] FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:05:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 17.34, 23.28, 23.94 [08:07:03] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.124 seconds response time. 
wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [08:07:13] !log [@mwtask171] starting deploy of {'config': True} to all [08:07:18] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [08:07:22] !log [@mwtask171] finished deploy of {'config': True} to all - SUCCESS in 9s [08:07:27] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [08:15:37] PROBLEM - ping6 on cp26 is CRITICAL: PING CRITICAL - Packet loss = 60%, RTA = 258.83 ms [08:19:45] RECOVERY - ping6 on cp26 is OK: PING OK - Packet loss = 0%, RTA = 197.18 ms [08:20:59] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [08:21:35] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.27, 21.96, 21.37 [08:22:57] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.098 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [08:23:20] [Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:23:40] !log [@test151] starting deploy of {'config': True} to test151 [08:23:41] !log [@test151] finished deploy of {'config': True} to test151 - SUCCESS in 0s [08:23:46] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [08:23:54] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [08:25:32] PROBLEM - cp26 Puppet on cp26 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[08:28:02] PROBLEM - ping6 on cp26 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 183.26 ms [08:30:04] RECOVERY - ping6 on cp26 is OK: PING OK - Packet loss = 0%, RTA = 190.60 ms [08:32:25] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:33:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 17.51, 22.55, 22.83 [08:34:18] PROBLEM - ping6 on cp26 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 181.36 ms [08:36:20] RECOVERY - ping6 on cp26 is OK: PING OK - Packet loss = 0%, RTA = 181.11 ms [08:38:26] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [08:42:25] [Grafana] FIRING: The mediawiki job queue has more than 1000 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:42:30] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.373 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [08:45:35] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.35, 22.57, 22.12 [08:45:39] PROBLEM - ping6 on cp26 is CRITICAL: PING CRITICAL - Packet loss = 28%, RTA = 181.26 ms [08:45:46] PROBLEM - prometheus151 Puppet on prometheus151 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. 
[08:46:55] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [08:47:35] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.47, 22.72, 22.26 [08:47:41] RECOVERY - ping6 on cp26 is OK: PING OK - Packet loss = 0%, RTA = 179.53 ms [08:48:13] RECOVERY - prometheus151 Puppet on prometheus151 is OK: OK: Puppet is currently enabled, last run 25 minutes ago with 0 failures [08:48:51] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.093 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [08:52:50] PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [08:53:17] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [08:53:32] RECOVERY - cp26 Puppet on cp26 is OK: OK: Puppet is currently enabled, last run 47 seconds ago with 0 failures [08:57:23] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 3.840 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [08:58:04] PROBLEM - ping6 on cp26 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 179.56 ms [08:58:21] PROBLEM - ns2 NTP time on ns2 is UNKNOWN: check_ntp_time: Invalid hostname/address - time.cloudflare.comUsage: check_ntp_time -H [-4|-6] [-w ] [-c ] [-v verbose] [-o