[00:00:23] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.65, 21.04, 20.37 [00:01:08] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [Inlet Temp = Critical, 350 system event log (SEL) entries present] [00:01:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 27.44, 22.79, 23.76 [00:03:19] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 16.74, 20.09, 20.00 [00:03:33] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 21.27, 23.03, 20.98 [00:04:14] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 16.91, 19.61, 20.01 [00:06:12] [python-functions] dependabot[bot] created branch dependabot/pip/dot-github/pytest-8.3.2 - https://github.com/miraheze/python-functions [00:06:15] [python-functions] dependabot[bot] pushed 1 commit to dependabot/pip/dot-github/pytest-8.3.2 [+0/-0/±1] https://github.com/miraheze/python-functions/commit/59acaa3d99e2 [00:06:18] [python-functions] dependabot[bot] 59acaa3 - Bump pytest from 8.2.2 to 8.3.2 in /.github [00:06:21] [python-functions] dependabot[bot] labeled pull request #49: Bump pytest from 8.2.2 to 8.3.2 in /.github - https://github.com/miraheze/python-functions/pull/49 [00:06:23] [python-functions] dependabot[bot] labeled pull request #49: Bump pytest from 8.2.2 to 8.3.2 in /.github - https://github.com/miraheze/python-functions/pull/49 [00:06:25] [python-functions] dependabot[bot] opened pull request #49: Bump pytest from 8.2.2 to 8.3.2 in /.github - https://github.com/miraheze/python-functions/pull/49 [00:06:28] [python-functions] dependabot[bot] closed pull request #48: Bump pytest from 8.2.2 to 8.3.1 in /.github - https://github.com/miraheze/python-functions/pull/48 [00:06:29] [python-functions] dependabot[bot] deleted branch dependabot/pip/dot-github/pytest-8.3.1 - 
https://github.com/miraheze/python-functions [00:06:30] [python-functions] dependabot[bot] deleted branch dependabot/pip/dot-github/pytest-8.3.1 [00:06:33] [python-functions] dependabot[bot] commented on pull request #48: Bump pytest from 8.2.2 to 8.3.1 in /.github - https://github.com/miraheze/python-functions/pull/48#issuecomment-2251604198 [00:06:36] [python-functions] coderabbitai[bot] commented on pull request #49: Bump pytest from 8.2.2 to 8.3.2 in /.github - https://github.com/miraheze/python-functions/pull/49#issuecomment-2251604301 [00:08:07] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.57, 23.44, 21.47 [00:09:08] new alias [00:09:33] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 16.00, 19.07, 19.91 [00:10:02] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.06, 21.58, 21.00 [00:10:11] miraheze/python-functions - dependabot[bot] the build passed. 
[00:10:59] !log [void@mwtask181] starting deploy of {'versions': '1.42', 'upgrade_extensions': 'ManageWiki'} to all [00:11:04] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:11:18] !log [void@mwtask181] finished deploy of {'versions': '1.42', 'upgrade_extensions': 'ManageWiki'} to all - SUCCESS in 19s [00:11:23] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [00:11:29] Yes :) [00:16:04] good to see it [00:17:33] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 27.32, 22.93, 21.16 [00:17:44] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.82, 23.29, 22.03 [00:19:33] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 17.16, 20.53, 20.50 [00:19:40] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.13, 21.55, 21.53 [00:20:52] PROBLEM - steamdecklinux.wiki - reverse DNS on sslhost is WARNING: rDNS WARNING - reverse DNS entry for steamdecklinux.wiki could not be found [00:21:33] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 17.74, 20.03, 20.34 [00:23:31] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.54, 22.96, 22.01 [00:25:26] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.49, 23.05, 22.18 [00:27:22] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.94, 24.48, 22.80 [00:28:50] PROBLEM - steamdecklinux.wiki - LetsEncrypt on sslhost is CRITICAL: Name or service not knownHTTP CRITICAL - Unable to open TCP socket [00:31:13] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.37, 23.81, 22.91 [00:33:33] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.26, 20.94, 19.23 [00:35:30] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL 
- total load average: 24.67, 21.93, 19.77 [00:37:26] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.27, 21.67, 19.97 [00:38:55] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.02, 23.06, 22.73 [00:40:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.03, 22.88, 22.72 [00:41:19] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 18.66, 19.67, 19.53 [00:44:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.73, 23.51, 22.88 [00:47:11] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.81, 20.99, 20.15 [00:48:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.71, 22.44, 22.60 [00:49:08] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 17.44, 19.64, 19.75 [00:50:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.69, 23.07, 22.79 [00:53:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 23.84, 22.08, 20.75 [00:54:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.07, 23.74, 23.14 [00:55:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 28.64, 24.57, 21.83 [00:59:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 17.60, 22.65, 21.87 [01:00:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.88, 24.42, 23.55 [01:04:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.18, 23.53, 23.56 [01:05:03] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 14.83, 17.31, 19.75 [01:06:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.74, 23.35, 23.44 
[01:09:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.43, 20.06, 20.41 [01:11:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.54, 21.35, 20.79 [01:13:33] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 23.42, 22.11, 20.34 [01:19:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.93, 23.74, 22.56 [01:21:23] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all [01:21:33] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 19.58, 20.07, 20.12 [01:23:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 28.39, 25.12, 23.26 [01:23:25] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [Inlet Temp = Critical, 352 system event log (SEL) entries present] [01:24:30] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:28:57] PROBLEM - cp41 Varnish Backends on cp41 is CRITICAL: 1 backends are down. mw152 [01:29:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.90, 23.90, 23.57 [01:30:53] RECOVERY - cp41 Varnish Backends on cp41 is OK: All 19 backends are healthy [01:36:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 17.78, 21.41, 23.31 [01:37:04] PROBLEM - cp27 Varnish Backends on cp27 is CRITICAL: 1 backends are down. mw152 [01:37:31] PROBLEM - cp26 Varnish Backends on cp26 is CRITICAL: 1 backends are down. 
mw181 [01:39:03] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 9.90, 15.49, 19.93 [01:39:31] RECOVERY - cp26 Varnish Backends on cp26 is OK: All 19 backends are healthy [01:41:34] PROBLEM - cp41 Varnish Backends on cp41 is CRITICAL: 1 backends are down. mw152 [01:42:50] RECOVERY - cp27 Varnish Backends on cp27 is OK: All 19 backends are healthy [01:45:26] RECOVERY - cp41 Varnish Backends on cp41 is OK: All 19 backends are healthy [01:47:41] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 12.24, 16.95, 22.76 [01:48:59] PROBLEM - cp37 Varnish Backends on cp37 is CRITICAL: 1 backends are down. mw152 [01:50:55] RECOVERY - cp37 Varnish Backends on cp37 is OK: All 19 backends are healthy [01:52:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [01:53:41] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 28.62, 20.42, 21.89 [01:54:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.27, 21.63, 21.89 [01:58:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.74, 22.25, 22.18 [02:00:11] PROBLEM - db161 Current Load on db161 is CRITICAL: LOAD CRITICAL - total load average: 40.78, 17.33, 7.03 [02:02:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.64, 22.12, 22.01 [02:04:11] RECOVERY - db161 Current Load on db161 is OK: LOAD OK - total load average: 1.55, 8.47, 5.75 [02:10:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.27, 23.14, 22.97 [02:12:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.67, 23.50, 23.09 [02:16:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.23, 23.05, 23.14 [02:20:25] PROBLEM - mw181 Current Load on mw181 is 
CRITICAL: LOAD CRITICAL - total load average: 25.55, 22.59, 20.14 [02:20:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 29.03, 25.58, 24.08 [02:22:21] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 18.77, 21.54, 20.08 [02:22:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.90, 23.42, 23.44 [02:24:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.72, 24.46, 23.83 [02:28:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.70, 23.98, 23.80 [02:33:57] PROBLEM - db181 Current Load on db181 is CRITICAL: LOAD CRITICAL - total load average: 42.55, 20.32, 8.12 [02:34:04] PROBLEM - db181 PowerDNS Recursor on db181 is CRITICAL: CRITICAL - Plugin timed out while executing system call [02:36:46] PROBLEM - db181 Puppet on db181 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds. [02:37:53] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 16.08, 19.79, 20.17 [02:38:16] RECOVERY - db181 PowerDNS Recursor on db181 is OK: DNS OK: 0.252 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [02:38:46] RECOVERY - db181 Puppet on db181 is OK: OK: Puppet is currently enabled, last run 13 seconds ago with 0 failures [02:39:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 14.97, 19.65, 23.24 [02:41:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 27.36, 22.71, 23.91 [02:44:50] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 15.58, 17.37, 20.01 [02:47:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [02:53:48] PROBLEM - cp51 Varnish Backends on cp51 is CRITICAL: 1 backends are down. mw152 [02:53:57] PROBLEM - db181 Current Load on db181 is WARNING: LOAD WARNING - total load average: 0.28, 3.20, 10.66 [02:55:43] RECOVERY - cp51 Varnish Backends on cp51 is OK: All 19 backends are healthy [02:55:57] RECOVERY - db181 Current Load on db181 is OK: LOAD OK - total load average: 0.65, 2.36, 9.44 [03:00:19] RECOVERY - db151 Backups SQL on db151 is OK: FILE_AGE OK: /var/log/sql-backup.log is 18 seconds old and 0 bytes [03:00:57] PROBLEM - mw152 MediaWiki Rendering on mw152 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [03:05:14] RECOVERY - mw152 MediaWiki Rendering on mw152 is OK: HTTP OK: HTTP/1.1 200 OK - 8191 bytes in 8.148 second response time [03:14:55] PROBLEM - cp51 Varnish Backends on cp51 is CRITICAL: 1 backends are down. mw152 [03:16:50] RECOVERY - cp51 Varnish Backends on cp51 is OK: All 19 backends are healthy [03:17:54] PROBLEM - mason.sagan4.org - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'mason.sagan4.org' expires in 15 day(s) (Sun 11 Aug 2024 03:15:02 AM GMT +0000). [03:18:07] [ssl] WikiTideSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/8edb1971d3f4...83dc64b49f32 [03:18:08] [ssl] WikiTideSSLBot 83dc64b - Bot: Update SSL cert for mason.sagan4.org [03:19:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 16.06, 21.68, 23.51 [03:22:15] PROBLEM - meta.sagan4.org - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'meta.sagan4.org' expires in 15 day(s) (Sun 11 Aug 2024 03:17:27 AM GMT +0000). 
[03:22:25] [ssl] WikiTideSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/83dc64b49f32...2dcdfa59bb6f [03:22:26] [ssl] WikiTideSSLBot 2dcdfa5 - Bot: Update SSL cert for meta.sagan4.org [03:29:11] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.04, 20.00, 18.32 [03:33:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 29.42, 24.33, 22.88 [03:36:53] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.33, 22.16, 19.92 [03:38:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.19, 23.10, 20.57 [03:39:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 12.87, 22.04, 22.90 [03:39:49] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all [03:41:49] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [Inlet Temp = Critical, 354 system event log (SEL) entries present] [03:42:50] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 17.46, 19.51, 19.64 [03:43:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.70, 23.06, 22.95 [03:45:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 20.88, 21.95, 22.55 [03:47:19] RECOVERY - mason.sagan4.org - LetsEncrypt on sslhost is OK: OK - Certificate 'mason.sagan4.org' will expire on Thu 24 Oct 2024 02:17:59 AM GMT +0000. 
[03:51:30] RECOVERY - meta.sagan4.org - LetsEncrypt on sslhost is OK: OK - Certificate 'meta.sagan4.org' will expire on Thu 24 Oct 2024 02:22:19 AM GMT +0000. [04:03:18] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.02, 20.51, 20.15 [04:03:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 24.27, 22.05, 21.82 [04:05:14] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 18.06, 19.89, 19.99 [04:16:23] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [04:17:34] [CreateWiki] songnguxyz synchronize pull request #544: Translate special page alias to Vietnamese - https://github.com/miraheze/CreateWiki/pull/544 [04:17:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.72, 22.73, 21.13 [04:19:45] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.62, 21.84, 21.01 [04:25:32] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 17.78, 19.29, 20.21 [04:32:30] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [04:43:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 23.25, 22.99, 23.86 [04:44:38] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 48 seconds ago with 0 failures [04:45:57] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data 
--entity-sensor-names --sensor-types=all [04:47:57] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [Inlet Temp = Critical, 356 system event log (SEL) entries present] [04:53:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 27.12, 22.49, 22.17 [04:55:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 20.42, 22.44, 22.26 [05:02:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:05:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 25.81, 21.93, 21.53 [05:07:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 22.47, 21.41, 21.37 [05:09:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 25.79, 23.47, 22.16 [05:11:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 23.14, 23.01, 22.14 [05:11:55] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.87, 4.49, 2.25 [05:12:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:15:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 25.16, 23.85, 22.69 [05:15:55] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.07, 3.53, 2.35 [05:16:34] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [05:17:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 18.17, 22.07, 22.20 [05:17:55] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.11, 2.87, 2.25 [05:25:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.08, 24.01, 22.87 [05:27:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 20.59, 22.87, 22.62 [05:30:40] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.88, 3.91, 2.95 [05:31:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 25.65, 23.44, 22.84 [05:32:35] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.19, 3.34, 2.87 [05:33:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 20.68, 21.89, 22.33 [05:39:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 25.55, 22.20, 22.13 [05:40:16] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.72, 4.32, 3.35 [05:41:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 20.63, 21.18, 21.75 
[05:45:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 25.11, 22.49, 22.14 [05:49:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 18.55, 21.42, 21.86 [05:49:55] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.00, 3.52, 3.56 [05:51:56] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.66, 2.98, 3.37 [05:52:07] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all [05:54:08] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [Inlet Temp = Critical, 358 system event log (SEL) entries present] [05:55:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 24.68, 21.01, 21.06 [05:55:59] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 10.92, 6.59, 4.71 [05:57:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 19.80, 20.73, 20.98 [05:57:30] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [05:59:55] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 1.58, 3.73, 3.95 [06:01:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.71, 21.56, 21.05 [06:03:19] PROBLEM - mw151 Current Load on mw151 is 
WARNING: LOAD WARNING - total load average: 16.67, 19.80, 20.50 [06:05:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 28.06, 23.14, 21.64 [06:07:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 23.37, 22.89, 21.74 [06:07:55] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.75, 2.45, 3.24 [06:14:41] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 45 seconds ago with 0 failures [06:15:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 24.25, 22.02, 21.50 [06:17:03] [Grafana] !tech FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:19:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 20.86, 23.31, 22.28 [06:22:03] [Grafana] !tech RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:25:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 29.12, 24.51, 22.78 [06:26:22] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 23.41, 20.43, 18.52 [06:28:18] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 18.54, 19.79, 18.53 [06:28:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:29:20] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 27.99, 22.89, 19.71 [06:31:15] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.29, 21.77, 19.69 [06:33:11] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.26, 20.20, 19.31 [06:33:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:33:36] RECOVERY - kagaga.jp - reverse DNS on sslhost is OK: SSL OK - kagaga.jp reverse DNS resolves to cp37.wikitide.net - NS RECORDS OK [06:37:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 17.81, 22.66, 23.43 [06:48:16] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all [06:48:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:49:19] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 10.24, 16.52, 20.36 [06:50:16] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [359 system event log (SEL) entries present] [06:52:51] RECOVERY - kagaga.jp - LetsEncrypt on sslhost is OK: OK - Certificate 'kagaga.jp' will expire on Tue 22 Oct 2024 06:13:40 PM GMT +0000. [06:53:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 15.40, 18.34, 20.43 [06:53:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [06:55:19] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 18.16, 18.27, 20.14 [06:57:08] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [06:57:34] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 6.17, 4.05, 3.00 [06:59:04] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.113 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [06:59:29] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.01, 3.14, 2.79 [07:03:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1[Grafana] !tech RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:05:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 24.33, 22.54, 21.31 [07:05:45] [mw-config] BlankEclair opened pull request #5625: T12390: Support gemini:// for rainversewiki - https://github.com/miraheze/mw-config/pull/5625 [07:06:43] miraheze/mw-config - BlankEclair the build passed. 
[07:07:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 15.01, 20.04, 20.58 [07:09:19] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 13.17, 17.55, 19.59 [07:17:48] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [07:31:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 24.12, 21.39, 19.25 [07:35:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 20.29, 21.72, 19.97 [07:41:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.08, 23.59, 21.23 [07:42:50] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.55, 22.53, 19.24 [07:44:12] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 38 seconds ago with 0 failures [07:46:11] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 23.69, 18.51, 14.39 [07:46:28] PROBLEM - mw162 Current Load on mw162 is CRITICAL: LOAD CRITICAL - total load average: 24.36, 20.78, 16.57 [07:46:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.42, 19.55, 16.89 [07:48:11] RECOVERY - mw171 Current Load on mw171 is OK: LOAD OK - total load average: 14.76, 16.97, 14.33 [07:48:24] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 23.98, 21.84, 17.48 [07:48:40] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.69, 23.62, 20.96 [07:50:36] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 23.73, 24.21, 21.52 [07:51:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 18.44, 23.67, 22.96 [07:52:33] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.56, 22.91, 
21.35 [07:52:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.78, 23.13, 19.27 [07:54:29] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 27.27, 24.62, 22.15 [07:56:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.85, 23.01, 20.26 [07:58:25] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all [07:58:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.97, 24.40, 21.13 [07:59:19] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 13.56, 17.25, 20.38 [08:00:25] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [360 system event log (SEL) entries present] [08:00:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.49, 22.84, 20.98 [08:02:00] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 12.97, 18.02, 18.50 [08:02:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.67, 22.94, 21.19 [08:04:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.06, 21.83, 20.98 [08:05:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 28.47, 22.61, 21.54 [08:22:50] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 17.68, 19.66, 20.40 [08:26:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 18.16, 20.07, 20.50 [08:27:29] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load 
average: 20.64, 22.14, 23.97 [08:28:50] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 15.97, 19.15, 20.16 [08:33:30] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [08:43:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.00, 22.59, 22.75 [08:45:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 23.65, 23.40, 23.06 [08:47:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.99, 23.41, 23.07 [08:53:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 18.46, 22.69, 23.06 [08:55:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 27.64, 24.48, 23.65 [08:59:04] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 21.15, 20.69, 19.54 [09:01:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [09:04:50] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 19.89, 20.33, 19.82 [09:08:16] PROBLEM - mw162 Current Load on mw162 is WARNING: LOAD WARNING - total load average: 21.00, 20.00, 18.22 [09:10:13] RECOVERY - mw162 Current Load on mw162 is OK: LOAD OK - total load average: 19.54, 19.85, 18.39 [09:21:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 17.99, 21.79, 23.49 [09:23:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.11, 22.96, 23.71 [09:25:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 20.97, 22.18, 23.33 [09:43:03] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 19.08, 18.98, 20.38 [09:58:42] PROBLEM - ns2 NTP time on ns2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:00:41] RECOVERY - ns2 NTP time on ns2 is OK: NTP OK: Offset 0.0001399219036 secs [10:09:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 12.61, 19.48, 22.73 [10:11:30] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [10:15:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.68, 22.69, 22.90 [10:17:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 21.58, 21.89, 22.57 [10:19:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.26, 23.67, 23.15 [10:21:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 23.92, 23.45, 23.13 [10:23:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 27.37, 24.44, 23.50 [10:25:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 23.74, 23.55, 23.26 [10:31:19] PROBLEM - mw151 Current Load on mw151 is 
CRITICAL: LOAD CRITICAL - total load average: 26.95, 24.44, 23.54 [10:32:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.92, 19.20, 16.73 [10:34:50] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 16.73, 18.48, 16.79 [10:36:52] PROBLEM - ns2 NTP time on ns2 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [10:40:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [10:41:00] RECOVERY - ns2 NTP time on ns2 is OK: NTP OK: Offset 0.0001749396324 secs [10:48:16] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 4 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[git_pull_dns] [10:49:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 17.04, 21.52, 23.80 [10:55:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 27.75, 23.80, 23.88 [11:05:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] !tech FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:05:55] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.74, 3.15, 1.42 [11:07:55] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.44, 3.09, 1.63 [11:10:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] !tech RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:14:11] I forgot to give BeeBot's OAuth consumer edit rights, oops [11:14:28] It tried to archive one thread yesterday and couldn't because of that [11:14:31] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 44 seconds ago with 0 failures [11:15:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] !tech FIRING: There has been a rise in the MediaWiki exception rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:16:42] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.81, 3.70, 2.48 [11:16:56] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all [11:17:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 12.30, 18.99, 23.37 [11:18:38] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 4.49, 3.77, 2.63 [11:18:56] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [363 system event log (SEL) entries present] [11:20:30] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [Grafana] !tech RESOLVED: MediaWiki Exception Rate https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:20:33] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.01, 2.96, 2.46 [11:23:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 25.69, 21.83, 23.22 [11:23:34] PROBLEM - cp51 Varnish Backends on cp51 is CRITICAL: 1 backends are down. mw152 [11:23:56] PROBLEM - cp27 Varnish Backends on cp27 is CRITICAL: 1 backends are down. mw152 [11:25:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 23.19, 23.06, 23.55 [11:25:29] RECOVERY - cp51 Varnish Backends on cp51 is OK: All 19 backends are healthy [11:25:56] RECOVERY - cp27 Varnish Backends on cp27 is OK: All 19 backends are healthy [11:26:13] !log [alex@bots171] manually ran archivebot [11:26:19] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [11:27:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 24.06, 22.90, 23.40 [11:28:02] [1/2] https://meta.miraheze.org/wiki/Meta:Requests_for_permissions/Archive_15?diff=prev&oldid=411456 [11:28:02] [2/2] there we go [11:29:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 19.98, 21.81, 22.94 [11:30:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.86, 19.81, 18.08 [11:31:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 24.51, 23.15, 23.32 [11:32:06] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.49, 3.95, 3.13 [11:32:49] PROBLEM - prometheus151 PowerDNS Recursor on prometheus151 is CRITICAL: CRITICAL - Plugin timed out while executing system call [11:32:50] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 14.40, 17.86, 17.59 [11:32:59] 
PROBLEM - prometheus151 SSH on prometheus151 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [11:34:21] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.48, 20.34, 19.40 [11:34:46] RECOVERY - prometheus151 PowerDNS Recursor on prometheus151 is OK: DNS OK: 0.078 seconds response time. wikitide.net returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206 [11:34:54] RECOVERY - prometheus151 SSH on prometheus151 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u3 (protocol 2.0) [11:35:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 21.50, 23.43, 23.40 [11:35:55] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.59, 3.89, 3.33 [11:36:18] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.76, 21.70, 19.98 [11:37:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 25.51, 23.83, 23.53 [11:37:55] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.37, 4.43, 3.60 [11:38:14] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.67, 21.07, 19.98 [11:39:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 23.27, 23.56, 23.47 [11:39:55] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.04, 3.82, 3.50 [11:40:11] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 27.74, 23.46, 20.97 [11:41:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 24.43, 23.42, 23.40 [11:41:55] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.06, 3.05, 3.27 [11:42:07] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.90, 23.37, 21.25 [11:43:19] 
PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 23.75, 23.86, 23.59 [11:47:56] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 17.74, 19.05, 20.00 [11:49:56] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 2.63, 3.48, 3.37 [11:51:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 27.14, 23.24, 23.04 [11:53:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 17.42, 21.17, 22.33 [11:53:55] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 2.52, 3.14, 3.28 [11:59:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.04, 21.66, 21.94 [12:01:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 21.82, 22.40, 22.23 [12:05:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 28.53, 24.31, 22.92 [12:07:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 23.54, 23.39, 22.73 [12:13:21] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.08, 21.11, 19.85 [12:15:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.96, 22.92, 22.39 [12:16:26] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 22.07, 21.08, 19.86 [12:18:22] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 18.00, 19.79, 19.54 [12:21:07] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.22, 22.35, 21.27 [12:24:10] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.42, 22.10, 20.42 [12:26:06] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.28, 21.71, 20.43 [12:27:03] PROBLEM - mw181 
Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.86, 23.98, 22.23 [12:27:12] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all [12:27:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 14.76, 22.04, 23.34 [12:28:01] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 25.03, 22.72, 20.93 [12:29:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.14, 23.56, 22.32 [12:29:13] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [364 system event log (SEL) entries present] [12:29:57] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 19.48, 21.59, 20.74 [12:33:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 27.00, 24.92, 23.12 [12:35:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 27.47, 22.42, 22.42 [12:35:43] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 20.11, 20.11, 20.38 [12:37:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 21.57, 21.96, 22.25 [12:41:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 28.76, 24.14, 22.92 [12:43:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 21.13, 23.57, 23.41 [12:45:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 20.36, 23.76, 23.19 [12:45:24] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.97, 
21.68, 20.88 [12:45:55] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [12:47:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.52, 24.42, 23.46 [12:47:19] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.75, 21.59, 20.97 [12:51:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 28.97, 25.25, 24.02 [12:53:06] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.99, 21.49, 20.96 [12:54:33] PROBLEM - prometheus151 Current Load on prometheus151 is CRITICAL: LOAD CRITICAL - total load average: 5.63, 4.41, 3.28 [12:55:02] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.96, 21.45, 21.03 [12:56:29] PROBLEM - prometheus151 Current Load on prometheus151 is WARNING: LOAD WARNING - total load average: 3.80, 3.77, 3.16 [12:58:25] RECOVERY - prometheus151 Current Load on prometheus151 is OK: LOAD OK - total load average: 1.87, 2.99, 2.94 [12:58:53] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 15.70, 19.03, 20.14 [13:01:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 16.05, 21.89, 23.56 [13:03:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.35, 23.45, 23.89 [13:05:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.85, 21.95, 23.29 [13:07:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.82, 22.81, 23.42 [13:09:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 22.77, 22.85, 23.38 [13:11:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 25.64, 23.58, 23.57 [13:13:03] PROBLEM - mw181 Current Load on mw181 is 
WARNING: LOAD WARNING - total load average: 21.29, 22.35, 23.10 [13:14:35] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [13:15:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.25, 23.05, 23.26 [13:17:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 16.24, 20.25, 22.20 [13:17:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 21.77, 21.91, 23.84 [13:17:53] PROBLEM - cp51 Varnish Backends on cp51 is CRITICAL: 1 backends are down. mw152 [13:19:48] RECOVERY - cp51 Varnish Backends on cp51 is OK: All 19 backends are healthy [13:21:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 27.22, 23.53, 23.97 [13:23:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 20.60, 22.16, 23.41 [13:25:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 28.66, 22.91, 22.19 [13:27:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.79, 24.09, 23.88 [13:31:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.40, 23.15, 22.78 [13:36:50] PROBLEM - beta.sagan4.org - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'beta.sagan4.org' expires in 15 day(s) (Sun 11 Aug 2024 01:06:18 PM GMT +0000). 
[13:37:03] [02ssl] 07WikiTideSSLBot pushed 031 commit to 03master [+0/-0/±1] 13https://github.com/miraheze/ssl/compare/2dcdfa59bb6f...69a06583f783 [13:37:04] [02ssl] 07WikiTideSSLBot 0369a0658 - Bot: Update SSL cert for beta.sagan4.org [13:37:27] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all [13:39:27] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [365 system event log (SEL) entries present] [13:39:42] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 20.76, 20.71, 18.83 [13:40:30] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [13:41:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 24.31, 22.03, 22.18 [13:41:38] RECOVERY - mw182 Current Load on mw182 is OK: LOAD OK - total load average: 14.12, 18.39, 18.21 [13:43:01] PROBLEM - mw152 MediaWiki Rendering on mw152 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [13:43:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 19.61, 21.71, 22.09 [13:43:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 21.06, 22.61, 23.93 [13:46:23] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. 
[13:49:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 28.19, 24.20, 24.01 [13:49:23] RECOVERY - mw152 MediaWiki Rendering on mw152 is OK: HTTP OK: HTTP/1.1 200 OK - 8191 bytes in 3.423 second response time [13:51:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 22.71, 23.18, 23.65 [13:53:03] PROBLEM - mw181 Current Load on mw181 is CRITICAL: LOAD CRITICAL - total load average: 26.57, 23.48, 22.45 [13:55:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 28.31, 25.18, 24.32 [13:59:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 18.37, 22.79, 23.66 [14:01:03] PROBLEM - mw181 Current Load on mw181 is WARNING: LOAD WARNING - total load average: 17.59, 21.77, 22.46 [14:05:41] RECOVERY - beta.sagan4.org - LetsEncrypt on sslhost is OK: OK - Certificate 'beta.sagan4.org' will expire on Thu 24 Oct 2024 12:36:55 PM GMT +0000. [14:07:03] RECOVERY - mw181 Current Load on mw181 is OK: LOAD OK - total load average: 14.92, 16.38, 19.74 [14:09:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [14:11:19] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 13.03, 16.51, 20.09 [14:14:00] [Grafana] RESOLVED: PHP-FPM Worker Usage High https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [14:47:47] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all [14:49:48] PROBLEM - cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [366 system event log (SEL) entries present] [15:11:41] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 11.96, 18.49, 23.51 [15:15:31] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [15:17:41] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 17.78, 14.72, 19.96 [15:21:41] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 25.01, 20.54, 21.14 [15:33:41] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 21.00, 23.55, 23.22 [15:33:53] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 21.75, 19.93, 17.91 [15:35:47] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 10.50, 16.50, 16.90 [15:35:55] PROBLEM - cloud15 IPMI Sensors on cloud15 is UNKNOWN: ipmi_sdr_cache_open: /root/.freeipmi/sdr-cache/sdr-cache-cloud15.localhost: internal IPMI error-> Execution of /usr/sbin/ipmi-sel failed with return code 1.-> /usr/sbin/ipmi-sel was executed with the following parameters: sudo /usr/sbin/ipmi-sel --output-event-state --interpret-oem-data --entity-sensor-names --sensor-types=all [15:37:56] PROBLEM 
- cloud15 IPMI Sensors on cloud15 is CRITICAL: IPMI Status: Critical [367 system event log (SEL) entries present] [15:41:41] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 18.84, 18.55, 20.37 [15:45:39] PROBLEM - ns2 Puppet on ns2 is CRITICAL: CRITICAL: Failed to apply catalog, zero resources tracked by Puppet. It might be a dependency cycle. [15:46:21] !log [alex@mwtask181] sudo -u www-data php /srv/mediawiki/1.42/maintenance/run.php /srv/mediawiki/1.42/maintenance/rebuildtextindex.php --wiki=chinafakewiki (START) [15:46:26] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:47:10] !log [alex@mwtask181] sudo -u www-data php /srv/mediawiki/1.42/maintenance/run.php /srv/mediawiki/1.42/maintenance/rebuildtextindex.php --wiki=chinafakewiki (END - exit=256) [15:47:15] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [15:57:41] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 27.38, 22.11, 19.71 [15:59:41] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 23.88, 22.77, 20.26 [16:01:41] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 26.55, 23.98, 21.00 [16:03:41] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 14.70, 21.33, 20.47 [16:05:41] RECOVERY - mw152 Current Load on mw152 is OK: LOAD OK - total load average: 11.41, 17.78, 19.26 [16:06:07] PROBLEM - cp27 Varnish Backends on cp27 is CRITICAL: 1 backends are down. mw181 [16:07:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 25.45, 20.16, 17.24 [16:08:03] RECOVERY - cp27 Varnish Backends on cp27 is OK: All 19 backends are healthy [16:08:04] PROBLEM - cp37 Varnish Backends on cp37 is CRITICAL: 1 backends are down. 
mw181 [16:09:19] RECOVERY - mw151 Current Load on mw151 is OK: LOAD OK - total load average: 19.03, 19.77, 17.48 [16:09:34] PROBLEM - cp51 Varnish Backends on cp51 is CRITICAL: 1 backends are down. mw152 [16:10:00] RECOVERY - cp37 Varnish Backends on cp37 is OK: All 19 backends are healthy [16:11:29] RECOVERY - cp51 Varnish Backends on cp51 is OK: All 19 backends are healthy [16:13:41] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 29.59, 21.41, 19.50 [16:13:54] RECOVERY - ns2 Puppet on ns2 is OK: OK: Puppet is currently enabled, last run 46 seconds ago with 0 failures [16:15:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 25.06, 22.30, 19.16 [16:17:56] PROBLEM - cp27 Varnish Backends on cp27 is CRITICAL: 1 backends are down. mw181 [16:18:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 24.08, 19.87, 16.45 [16:20:50] PROBLEM - mw182 Current Load on mw182 is WARNING: LOAD WARNING - total load average: 23.97, 21.12, 17.31 [16:21:56] RECOVERY - cp27 Varnish Backends on cp27 is OK: All 19 backends are healthy [16:25:19] PROBLEM - mw151 Current Load on mw151 is WARNING: LOAD WARNING - total load average: 13.42, 22.12, 21.66 [16:25:41] PROBLEM - mw152 Current Load on mw152 is WARNING: LOAD WARNING - total load average: 16.45, 23.28, 22.83 [16:25:58] PROBLEM - mw161 Current Load on mw161 is WARNING: LOAD WARNING - total load average: 20.51, 18.66, 15.24 [16:26:11] PROBLEM - mw171 Current Load on mw171 is WARNING: LOAD WARNING - total load average: 22.46, 20.84, 15.96 [16:27:00] [Grafana] FIRING: Some MediaWiki Appservers are running out of PHP-FPM workers. 
https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [16:27:19] PROBLEM - mw151 Current Load on mw151 is CRITICAL: LOAD CRITICAL - total load average: 26.95, 24.13, 22.43 [16:27:41] PROBLEM - mw152 Current Load on mw152 is CRITICAL: LOAD CRITICAL - total load average: 28.06, 25.47, 23.69 [16:28:11] PROBLEM - mw171 Current Load on mw171 is CRITICAL: LOAD CRITICAL - total load average: 26.79, 22.58, 17.15 [16:28:50] PROBLEM - mw182 Current Load on mw182 is CRITICAL: LOAD CRITICAL - total load average: 26.46, 22.38, 19.30 [16:29:58] PROBLEM - mw161 Current Load on mw161 is CRITICAL: LOAD CRITICAL - total load average: 27.14, 22.35, 17.39 [16:31:58] PROBLEM - mw161 Current Load on mw161 is WARNING: LOAD WARNING - total load average: 20.19, 20.87, 17.43 [16:32:13] PROBLEM - ns2 NTP time on ns2 is UNKNOWN: check_ntp_time: Invalid hostname/address - time.cloudflare.com Usage: check_ntp_time -H [-4|-6] [-w ] [-c ] [-v verbose] [-o