[00:01:50] RECOVERY - cp51 Puppet on cp51 is OK: OK: Puppet is currently enabled, last run 1 minute ago with 0 failures [00:07:46] RECOVERY - wiki.sheepservermc.net - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.sheepservermc.net' will expire on Tue 23 Jul 2024 10:09:18 PM GMT +0000. [00:39:56] PROBLEM - db181 Current Load on db181 is WARNING: LOAD WARNING - total load average: 9.68, 11.82, 5.39 [00:41:55] RECOVERY - db181 Current Load on db181 is OK: LOAD OK - total load average: 1.64, 8.04, 4.79 [00:50:47] [miraheze/ssl] WikiTideSSLBot pushed 1 commit to master [+1/-0/±1] https://github.com/miraheze/ssl/compare/cb3993700ca1...034c7485b1d0 [00:50:50] [miraheze/ssl] WikiTideSSLBot 034c748 - Bot: Add SSL cert for irad.wiki [00:51:14] [miraheze/ssl] WikiTideSSLBot pushed 1 commit to master [+1/-0/±1] https://github.com/miraheze/ssl/compare/034c7485b1d0...586f9855f453 [00:51:17] [miraheze/ssl] WikiTideSSLBot 586f985 - Bot: Add SSL cert for wiki.wubbygame.com [00:51:48] PROBLEM - wiki.mahdiruiz.line.pm - LetsEncrypt on sslhost is CRITICAL: Temporary failure in name resolution HTTP CRITICAL - Unable to open TCP socket [01:20:27] RECOVERY - wiki.mahdiruiz.line.pm - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.mahdiruiz.line.pm' will expire on Fri 14 Jun 2024 04:28:50 PM GMT +0000. [01:31:52] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 164.70 ms [01:33:52] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 163.47 ms [02:05:08] PROBLEM - private.yahyabd.xyz - reverse DNS on sslhost is WARNING: LifetimeTimeout: The resolution lifetime expired after 5.402 seconds: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.
[02:16:40] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query line.pm. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL [02:34:39] PROBLEM - private.yahyabd.xyz - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - private.yahyabd.xyz All nameservers failed to answer the query. [03:00:56] RECOVERY - mon181 Backups Grafana on mon181 is OK: FILE_AGE OK: /var/log/grafana-backup.log is 36 seconds old and 93 bytes [03:12:20] [Grafana] !sre FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [03:42:34] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 168.07 ms [03:44:36] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 168.34 ms [03:45:43] RECOVERY - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is OK: SSL OK - wiki.mahdiruiz.line.pm reverse DNS resolves to cp37.wikitide.net - CNAME OK [04:33:31] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 170.78 ms [04:37:32] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 170.04 ms [04:52:56] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 28%, RTA = 171.03 ms [04:56:59] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 170.10 ms [05:39:40] RECOVERY - ns2 GDNSD Datacenters on ns2 is OK: OK - all datacenters are online [05:40:20] RECOVERY - cp41 PowerDNS Recursor on cp41 is OK: DNS OK: 0.678 seconds response time. 
wikitide.net returns 109.123.230.163,2400:d320:2161:9775::1 [06:09:54] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 170.24 ms [06:13:56] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 170.46 ms [06:19:25] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query line.pm. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL [06:49:08] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.mahdiruiz.line.pm All nameservers failed to answer the query. [06:59:38] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 28%, RTA = 170.96 ms [07:03:40] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 170.67 ms [07:12:20] [Grafana] !sre FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [07:18:53] RECOVERY - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is OK: SSL OK - wiki.mahdiruiz.line.pm reverse DNS resolves to cp37.wikitide.net - CNAME OK [07:23:03] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 170.32 ms [07:25:03] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 168.49 ms [07:42:19] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 172.21 ms [07:45:49] PROBLEM - wiki.macc.nyc - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.macc.nyc All nameservers failed to answer the query. 
[07:46:20] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 169.45 ms [08:09:21] !log [@mwtask171] starting deploy of {'config': True} to all [08:09:30] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [08:09:32] !log [@mwtask171] finished deploy of {'config': True} to all - SUCCESS in 11s [08:09:41] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [08:15:00] RECOVERY - wiki.macc.nyc - reverse DNS on sslhost is OK: SSL OK - wiki.macc.nyc reverse DNS resolves to cp37.wikitide.net - CNAME OK [08:20:48] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 176.68 ms [08:24:50] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 184.54 ms [08:29:57] PROBLEM - db162 MariaDB on db162 is UNKNOWN: check_mysql: Invalid hostname/address - db162.wikitide.net Usage: check_mysql [-d database] [-H host] [-P port] [-s socket] [-u user] [-p password] [-S] [-l] [-a cert] [-k key] [-C ca-cert] [-D ca-dir] [-L ciphers] [-f optfile] [-g group] [08:31:56] PROBLEM - db162 MariaDB on db162 is CRITICAL: Access denied for user 'icinga'@'2602:294:0:b12::110' (using password: YES) [08:52:11] PROBLEM - cp51 PowerDNS Recursor on cp51 is CRITICAL: Domain 'wikitide.net' was not found by the server [08:54:13] RECOVERY - cp51 PowerDNS Recursor on cp51 is OK: DNS OK: 2.475 seconds response time. wikitide.net returns 109.123.230.163,2400:d320:2161:9775::1 [09:16:02] PROBLEM - cloud17 Puppet on cloud17 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Service[ulogd2] [09:18:45] PROBLEM - wiki.kscucf.org - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.kscucf.org All nameservers failed to answer the query.
[09:20:31] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 170.43 ms [09:22:31] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 170.76 ms [09:44:02] RECOVERY - cloud17 Puppet on cloud17 is OK: OK: Puppet is currently enabled, last run 27 seconds ago with 0 failures [09:52:36] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query line.pm. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL [09:57:55] PROBLEM - rdb151 APT on rdb151 is WARNING: APT WARNING: 0 packages available for upgrade (0 critical updates). warnings detected, errors detected. [09:58:37] PROBLEM - os151 APT on os151 is WARNING: APT WARNING: 0 packages available for upgrade (0 critical updates). warnings detected, errors detected. [09:59:56] PROBLEM - rdb151 APT on rdb151 is CRITICAL: APT CRITICAL: 49 packages available for upgrade (22 critical updates). [10:00:38] PROBLEM - os151 APT on os151 is CRITICAL: APT CRITICAL: 54 packages available for upgrade (26 critical updates). [10:00:46] PROBLEM - swiftproxy161 APT on swiftproxy161 is WARNING: APT WARNING: 0 packages available for upgrade (0 critical updates). warnings detected, errors detected. [10:02:46] PROBLEM - swiftproxy161 APT on swiftproxy161 is CRITICAL: APT CRITICAL: 23 packages available for upgrade (22 critical updates). [10:11:56] PROBLEM - wiki.andreijiroh.uk.eu.org - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.andreijiroh.uk.eu.org All nameservers failed to answer the query. [10:14:18] PROBLEM - wiki.orivium.io - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.orivium.io All nameservers failed to answer the query. 
[10:41:22] RECOVERY - wiki.andreijiroh.uk.eu.org - reverse DNS on sslhost is OK: SSL OK - wiki.andreijiroh.uk.eu.org reverse DNS resolves to cp36.wikitide.net - CNAME OK [10:44:03] RECOVERY - wiki.orivium.io - reverse DNS on sslhost is OK: SSL OK - wiki.orivium.io reverse DNS resolves to cp37.wikitide.net - CNAME OK [10:46:46] RECOVERY - wiki.kscucf.org - reverse DNS on sslhost is OK: SSL OK - wiki.kscucf.org reverse DNS resolves to cp37.wikitide.net - CNAME OK [10:52:04] RECOVERY - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is OK: SSL OK - wiki.mahdiruiz.line.pm reverse DNS resolves to cp36.wikitide.net - CNAME OK [11:12:20] [Grafana] !sre FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [11:53:33] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query line.pm. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL [12:09:26] !log [@mwtask171] starting deploy of {'config': True} to all [12:09:35] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:09:37] !log [@mwtask171] finished deploy of {'config': True} to all - SUCCESS in 10s [12:09:46] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:13:25] [miraheze/IncidentReporting] translatewiki pushed 1 commit to master [+0/-0/±2] https://github.com/miraheze/IncidentReporting/compare/043799f4c643...68bdc01d29a6 [12:13:27] [miraheze/IncidentReporting] translatewiki 68bdc01 - Localisation updates from https://translatewiki.net. [12:13:29] [miraheze/CreateWiki] translatewiki pushed 1 commit to master [+0/-0/±2] https://github.com/miraheze/CreateWiki/compare/e1165acc7943...36ea9424155e [12:13:30] [miraheze/CreateWiki] translatewiki 36ea942 - Localisation updates from https://translatewiki.net.
[12:13:33] [miraheze/MirahezeMagic] translatewiki pushed 1 commit to master [+0/-0/±3] https://github.com/miraheze/MirahezeMagic/compare/5c1b2aa35f83...52ec12711f95 [12:13:36] [miraheze/MirahezeMagic] translatewiki 52ec127 - Localisation updates from https://translatewiki.net. [12:14:49] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 169.14 ms [12:16:49] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 172.08 ms [12:16:58] miraheze/MirahezeMagic - translatewiki the build has errored. [12:18:16] miraheze/IncidentReporting - translatewiki the build passed. [12:22:09] miraheze/CreateWiki - translatewiki the build passed. [12:23:13] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.mahdiruiz.line.pm All nameservers failed to answer the query. [12:23:19] !log [@test151] starting deploy of {'folders': '1.41/extensions/MirahezeMagic'} to test151 [12:23:20] !log [@test151] finished deploy of {'folders': '1.41/extensions/MirahezeMagic'} to test151 - SUCCESS in 0s [12:23:27] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:23:28] !log [@test151] starting deploy of {'folders': '1.42/extensions/MirahezeMagic'} to test151 [12:23:29] !log [@test151] finished deploy of {'folders': '1.42/extensions/MirahezeMagic'} to test151 - SUCCESS in 0s [12:23:36] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:23:39] !log [@test151] starting deploy of {'folders': '1.43/extensions/MirahezeMagic'} to test151 [12:23:40] !log [@test151] finished deploy of {'folders': '1.43/extensions/MirahezeMagic'} to test151 - SUCCESS in 0s [12:23:44] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:23:54] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:24:03] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:24:12] Logged the message at
https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:32:52] !log [@mwtask181] starting deploy of {'folders': '1.41/extensions/MirahezeMagic'} to all [12:33:01] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:37:31] !log [@mwtask181] starting deploy of {'folders': '1.42/extensions/MirahezeMagic'} to all [12:37:40] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:37:42] !log [@mwtask181] finished deploy of {'folders': '1.42/extensions/MirahezeMagic'} to all - SUCCESS in 11s [12:37:51] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:37:58] !log [@mwtask171] starting deploy of {'folders': '1.41/extensions/MirahezeMagic'} to all [12:38:07] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:38:08] !log [@mwtask171] finished deploy of {'folders': '1.41/extensions/MirahezeMagic'} to all - SUCCESS in 9s [12:38:17] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:38:18] !log [@mwtask171] starting deploy of {'folders': '1.42/extensions/MirahezeMagic'} to all [12:38:26] !log [@mwtask171] finished deploy of {'folders': '1.42/extensions/MirahezeMagic'} to all - SUCCESS in 8s [12:38:27] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:38:35] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [12:41:30] PROBLEM - mwtask181 Puppet on mwtask181 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): Exec[MediaWiki-REL1_41 MirahezeMagic Sync] [13:03:28] RECOVERY - mwtask181 Puppet on mwtask181 is OK: OK: Puppet is currently enabled, last run 37 seconds ago with 0 failures [13:22:40] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query wiki.mahdiruiz.line.pm. 
IN CNAME: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL [14:22:05] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.mahdiruiz.line.pm All nameservers failed to answer the query. [14:51:49] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query wiki.mahdiruiz.line.pm. IN CNAME: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL [15:09:57] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 28%, RTA = 171.19 ms [15:11:57] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 170.84 ms [15:12:20] [Grafana] !sre FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [15:26:16] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 28%, RTA = 167.57 ms [15:30:17] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 169.28 ms [15:32:20] [Grafana] !sre RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [15:41:30] [puppet] redbluegreenhat opened pull request #3843: Add backup SSH key for myself - https://github.com/miraheze/puppet/pull/3843 [15:41:53] [puppet] redbluegreenhat edited pull request #3843: Add backup SSH key for myself - https://github.com/miraheze/puppet/pull/3843 [15:42:16] [puppet] redbluegreenhat edited pull request #3843: Add backup SSH key for myself - https://github.com/miraheze/puppet/pull/3843 [15:45:06] @rhinosf1 finally saved up enough for a second yubikey [15:45:16] https://github.com/miraheze/puppet/pull/3843 [15:51:11] RECOVERY - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is OK: SSL OK - wiki.mahdiruiz.line.pm reverse DNS resolves to cp36.wikitide.net - CNAME OK [15:52:39] @bluemoon0332 cool
[15:52:44] yep [15:53:04] now to keep it locked and hope I never have to use it [16:09:05] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 176.04 ms [16:13:06] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 171.04 ms [16:42:46] [puppet] Universal-Omega closed pull request #3843: Add backup SSH key for myself - https://github.com/miraheze/puppet/pull/3843 [16:42:47] [miraheze/puppet] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/puppet/compare/958dabeb1abe...b661e5bde18d [16:42:49] [miraheze/puppet] redbluegreenhat b661e5b - Add backup SSH key for myself (#3843) [16:55:55] PROBLEM - wiki.mahdiruiz.line.pm - LetsEncrypt on sslhost is CRITICAL: Temporary failure in name resolution HTTP CRITICAL - Unable to open TCP socket [17:02:25] [Grafana] !sre FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [17:52:00] RECOVERY - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is OK: SSL OK - wiki.mahdiruiz.line.pm reverse DNS resolves to cp36.wikitide.net - CNAME OK [17:52:25] [Grafana] !sre RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [17:55:38] RECOVERY - wiki.mahdiruiz.line.pm - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.mahdiruiz.line.pm' will expire on Fri 14 Jun 2024 04:28:50 PM GMT +0000. [18:47:09] RECOVERY - wiki.gab.pt.eu.org - reverse DNS on sslhost is OK: SSL OK - wiki.gab.pt.eu.org reverse DNS resolves to cp36.wikitide.net - CNAME OK [18:54:11] PROBLEM - wiki.tulpa.info - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.tulpa.info All nameservers failed to answer the query.
[19:02:53] RECOVERY - wiki.tulpa.info - reverse DNS on sslhost is OK: SSL OK - wiki.tulpa.info reverse DNS resolves to cp36.wikitide.net - CNAME OK [19:05:26] [miraheze/dns] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/dns/compare/26d9a812003e...9f8c30e2b0dd [19:05:27] [miraheze/dns] Universal-Omega 9f8c30e - Remove unused cps [19:09:49] Hm could CloudFlare be behind #Incomplete page being shown logged out [19:12:49] PROBLEM - wiki.orivium.io - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.orivium.io All nameservers failed to answer the query. [19:18:44] PROBLEM - wiki.kscucf.org - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.kscucf.org All nameservers failed to answer the query. [19:23:32] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 169.29 ms [19:26:19] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query line.pm. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL [19:27:35] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 170.04 ms [19:32:46] PROBLEM - mw171 APT on mw171 is CRITICAL: APT CRITICAL: 28 packages available for upgrade (1 critical updates). [19:33:17] PROBLEM - mw151 APT on mw151 is CRITICAL: APT CRITICAL: 28 packages available for upgrade (1 critical updates). [19:34:08] PROBLEM - mwtask181 APT on mwtask181 is CRITICAL: APT CRITICAL: 2 packages available for upgrade (1 critical updates). [19:34:15] PROBLEM - mw172 APT on mw172 is CRITICAL: APT CRITICAL: 28 packages available for upgrade (1 critical updates). [19:34:29] PROBLEM - mwtask171 APT on mwtask171 is CRITICAL: APT CRITICAL: 2 packages available for upgrade (1 critical updates). [19:34:35] PROBLEM - mw152 APT on mw152 is CRITICAL: APT CRITICAL: 28 packages available for upgrade (1 critical updates).
[19:36:55] PROBLEM - mw181 APT on mw181 is CRITICAL: APT CRITICAL: 28 packages available for upgrade (1 critical updates). [19:38:41] PROBLEM - mw182 APT on mw182 is CRITICAL: APT CRITICAL: 28 packages available for upgrade (1 critical updates). [19:38:47] PROBLEM - mw161 APT on mw161 is CRITICAL: APT CRITICAL: 28 packages available for upgrade (1 critical updates). [19:39:00] PROBLEM - ns1 NTP time on ns1 is CRITICAL: CRITICAL - Socket timeout after 10 seconds [19:39:35] PROBLEM - test151 APT on test151 is CRITICAL: APT CRITICAL: 28 packages available for upgrade (1 critical updates). [19:39:57] PROBLEM - mw162 APT on mw162 is CRITICAL: APT CRITICAL: 28 packages available for upgrade (1 critical updates). [19:47:25] [Grafana] !sre FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [19:53:48] PROBLEM - ping on ns1 is CRITICAL: CRITICAL - Host Unreachable (10.0.17.136) [19:53:48] MacFan4000: hey are you available to try running a script? a discord support thread was opened about search being weird on an imported wiki, is it worth it to try and rebuild the search index? 
[19:53:54] (pocketdragonswiki) [19:54:43] RECOVERY - ns1 NTP time on ns1 is OK: NTP OK: Offset -0.0004840791225 secs [19:54:50] RECOVERY - ping on ns1 is OK: PING OK - Packet loss = 0%, RTA = 0.82 ms [19:55:42] RECOVERY - ns1 GDNSD Datacenters on ns1 is OK: OK - all datacenters are online [19:56:01] RECOVERY - ns1 Puppet on ns1 is OK: OK: Puppet is currently enabled, last run 21 seconds ago with 0 failures [20:02:25] [Grafana] !sre RESOLVED: High Job Queue Backlog https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:12:18] RECOVERY - wiki.orivium.io - reverse DNS on sslhost is OK: SSL OK - wiki.orivium.io reverse DNS resolves to cp36.wikitide.net - CNAME OK [20:16:46] RECOVERY - wiki.kscucf.org - reverse DNS on sslhost is OK: SSL OK - wiki.kscucf.org reverse DNS resolves to cp36.wikitide.net - CNAME OK [20:25:43] RECOVERY - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is OK: SSL OK - wiki.mahdiruiz.line.pm reverse DNS resolves to cp37.wikitide.net - CNAME OK [20:41:29] RECOVERY - os161 Disk Space on os161 is OK: DISK OK - free space: / 80294MiB (36% inode=99%); [20:41:37] RECOVERY - os151 Disk Space on os151 is OK: DISK OK - free space: / 80200MiB (36% inode=99%); [20:42:25] [Grafana] !sre FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.wikitide.net/d/GtxbP1Xnk?orgId=1 [20:45:25] PROBLEM - os151 Current Load on os151 is CRITICAL: LOAD CRITICAL - total load average: 5.01, 2.80, 1.17 [20:47:28] PROBLEM - os161 Current Load on os161 is WARNING: LOAD WARNING - total load average: 3.62, 2.57, 1.19 [20:47:33] Fixed graylog, looks like disk usage got too high and the indices got locked.
[20:47:56] 🙏 [20:48:02] ABSOLUTE VOID W [20:50:41] !log [void@phorge171] disable herald rule H70 [20:50:50] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [20:51:15] !log [void@graylog161] delete graylog index 76 and manually cycle active write index [20:51:24] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [20:52:06] ty [20:59:28] RECOVERY - os161 Current Load on os161 is OK: LOAD OK - total load average: 1.96, 2.94, 2.30 [21:01:25] PROBLEM - os151 Current Load on os151 is WARNING: LOAD WARNING - total load average: 2.07, 3.80, 3.35 [21:04:18] Hmm, we're getting about 80000 logs per 10 minutes with "wgEventLoggingBaseUri has not been configured," tempted to disable this channel temporarily [21:05:25] RECOVERY - os151 Current Load on os151 is OK: LOAD OK - total load average: 1.60, 2.87, 3.09 [21:18:07] [miraheze/mw-config] The-Voidwalker pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/mw-config/compare/1b71dbbf03e1...d54ef32ee7c8 [21:18:08] [miraheze/mw-config] The-Voidwalker d54ef32 - disable two overactive logging channels [21:19:04] miraheze/mw-config - The-Voidwalker the build passed.
[21:22:10] Have done so, will be back in a bit to check on some other things [21:23:05] !log [@test151] starting deploy of {'config': True} to test151 [21:23:06] !log [@test151] finished deploy of {'config': True} to test151 - SUCCESS in 0s [21:23:15] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:23:23] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:32:04] !log [@mwtask181] starting deploy of {'config': True} to all [21:32:13] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:32:14] !log [@mwtask181] finished deploy of {'config': True} to all - SUCCESS in 10s [21:32:23] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:36:53] !log [@mwtask171] starting deploy of {'config': True} to all [21:37:02] !log [@mwtask171] finished deploy of {'config': True} to all - SUCCESS in 8s [21:37:02] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:37:11] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [21:48:02] PROBLEM - cloud17 Puppet on cloud17 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 3 minutes ago with 1 failures. Failed resources (up to 3 shown): Service[ulogd2] [22:04:57] [miraheze/mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/mw-config/compare/d54ef32ee7c8...eef7b02868a1 [22:05:00] [miraheze/mw-config] Universal-Omega eef7b02 - Use for Varnish also [22:05:49] miraheze/mw-config - Universal-Omega the build passed.
[22:07:26] !log [@mwtask171] starting deploy of {'config': True} to all [22:07:30] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [22:07:35] !log [@mwtask171] finished deploy of {'config': True} to all - SUCCESS in 8s [22:07:38] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [22:09:32] PROBLEM - ping6 on cp51 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 170.03 ms [22:12:17] [miraheze/mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/mw-config/compare/eef7b02868a1...4f4ad29e7c54 [22:12:18] [miraheze/mw-config] Universal-Omega 4f4ad29 - Fix [22:13:09] miraheze/mw-config - Universal-Omega the build passed. [22:13:33] RECOVERY - ping6 on cp51 is OK: PING OK - Packet loss = 0%, RTA = 168.35 ms [22:14:02] RECOVERY - cloud17 Puppet on cloud17 is OK: OK: Puppet is currently enabled, last run 19 seconds ago with 0 failures [22:22:14] [miraheze/mw-config] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/mw-config/compare/4f4ad29e7c54...cb4f77175f05 [22:22:16] [miraheze/mw-config] Universal-Omega cb4f771 - - for now [22:23:10] miraheze/mw-config - Universal-Omega the build passed. [22:23:47] !log [@test151] starting deploy of {'config': True} to test151 [22:23:48] !log [@test151] finished deploy of {'config': True} to test151 - SUCCESS in 0s [22:23:51] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [22:23:56] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [22:28:37] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query wiki.mahdiruiz.line.pm.
IN CNAME: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL [22:36:05] !log [@mwtask171] starting deploy of {'config': True} to all [22:36:10] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [22:36:13] !log [@mwtask171] finished deploy of {'config': True} to all - SUCCESS in 8s [22:36:17] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [22:40:24] PROBLEM - cp36 Varnish Backends on cp36 is CRITICAL: 1 backends are down. mw181 [22:42:24] RECOVERY - cp36 Varnish Backends on cp36 is OK: All 19 backends are healthy [22:56:27] PROBLEM - wiki.mahdiruiz.line.pm - LetsEncrypt on sslhost is CRITICAL: Temporary failure in name resolution HTTP CRITICAL - Unable to open TCP socket [22:58:23] !log Destroyed a user on Phorge [22:58:27] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log [22:59:12] [miraheze/puppet] Universal-Omega pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/puppet/compare/b661e5bde18d...000e2a4a9c10 [22:59:13] [miraheze/puppet] Universal-Omega 000e2a4 - Remove Agent [23:00:35] miraheze/puppet - Universal-Omega the build has errored. [23:26:17] RECOVERY - wiki.mahdiruiz.line.pm - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.mahdiruiz.line.pm' will expire on Fri 14 Jun 2024 04:28:50 PM GMT +0000.