[00:02:24] PROBLEM - cp37 Disk Space on cp37 is WARNING: DISK WARNING - free space: / 9103MiB (10% inode=98%);
[00:04:24] RECOVERY - cp37 Disk Space on cp37 is OK: DISK OK - free space: / 21632MiB (24% inode=98%);
[00:05:35] [Grafana] !sre FIRING: The mediawiki job queue has more than 500 unclaimed jobs https://grafana.miraheze.org/d/GtxbP1Xnk?orgId=1
[00:15:35] [Grafana] !sre RESOLVED: High Job Queue Backlog https://grafana.miraheze.org/d/GtxbP1Xnk?orgId=1
[00:44:43] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query line.pm. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL
[01:13:44] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.mahdiruiz.line.pm All nameservers failed to answer the query.
[01:42:43] RECOVERY - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is OK: SSL OK - wiki.mahdiruiz.line.pm reverse DNS resolves to cp37.wikitide.net - CNAME OK
[03:00:13] RECOVERY - db181 Backups SQL on db181 is OK: FILE_AGE OK: /var/log/sql-backup.log is 12 seconds old and 0 bytes
[03:01:01] RECOVERY - db161 Backups SQL on db161 is OK: FILE_AGE OK: /var/log/sql-backup.log is 60 seconds old and 0 bytes
[03:14:42] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query wiki.mahdiruiz.line.pm. IN CNAME: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL
[03:41:26] PROBLEM - wiki.andreijiroh.uk.eu.org - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.andreijiroh.uk.eu.org All nameservers failed to answer the query.
[03:43:43] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.mahdiruiz.line.pm All nameservers failed to answer the query.
[03:58:36] !log [macfan@mwtask181] sudo -u www-data php /srv/mediawiki/1.41/maintenance/run.php /srv/mediawiki/1.41/maintenance/importImages.php --wiki=beidipediawiki /home/macfan/images --search-recursively (START)
[03:58:44] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[04:12:47] RECOVERY - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is OK: SSL OK - wiki.mahdiruiz.line.pm reverse DNS resolves to cp37.wikitide.net - CNAME OK
[04:39:24] RECOVERY - wiki.andreijiroh.uk.eu.org - reverse DNS on sslhost is OK: SSL OK - wiki.andreijiroh.uk.eu.org reverse DNS resolves to cp36.wikitide.net - CNAME OK
[05:32:02] PROBLEM - db181 PowerDNS Recursor on db181 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[05:32:02] PROBLEM - db181 Current Load on db181 is CRITICAL: LOAD CRITICAL - total load average: 111.68, 49.97, 19.78
[05:32:47] PROBLEM - db181 Puppet on db181 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[05:33:22] PROBLEM - db181 SSH on db181 is CRITICAL: CRITICAL - Socket timeout after 10 seconds
[05:34:11] RECOVERY - db181 PowerDNS Recursor on db181 is OK: DNS OK: 0.830 seconds response time. miraheze.org returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[05:39:26] PROBLEM - db181 PowerDNS Recursor on db181 is CRITICAL: CRITICAL - Plugin timed out while executing system call
[05:41:02] PROBLEM - db181 conntrack_table_size on db181 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[05:42:38] PROBLEM - db181 Backups SQL mhglobal on db181 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[05:42:38] PROBLEM - db181 Backups SQL on db181 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[05:45:46] PROBLEM - db181 ferm_active on db181 is CRITICAL: CHECK_NRPE STATE CRITICAL: Socket timeout after 60 seconds.
[05:49:36] RECOVERY - db181 PowerDNS Recursor on db181 is OK: DNS OK: 0.205 seconds response time. miraheze.org returns 2602:294:0:b13::110,2602:294:0:b23::112,38.46.223.205,38.46.223.206
[05:49:54] RECOVERY - db181 conntrack_table_size on db181 is OK: OK: nf_conntrack is 0 % full
[05:49:56] RECOVERY - db181 SSH on db181 is OK: SSH OK - OpenSSH_9.2p1 Debian-2+deb12u2 (protocol 2.0)
[05:50:11] PROBLEM - cp36 Varnish Backends on cp36 is CRITICAL: 1 backends are down. mw182
[05:50:40] RECOVERY - db181 ferm_active on db181 is OK: OK ferm input default policy is set
[05:50:41] RECOVERY - db181 Backups SQL mhglobal on db181 is OK: FILE_AGE OK: /var/log/sql-mhglobal-backup-weekly.log is 175832 seconds old and 208 bytes
[05:50:41] RECOVERY - db181 Backups SQL on db181 is OK: FILE_AGE OK: /var/log/sql-backup.log is 968 seconds old and 47730 bytes
[05:51:08] RECOVERY - db181 Puppet on db181 is OK: OK: Puppet is currently enabled, last run 26 seconds ago with 0 failures
[05:52:11] RECOVERY - cp36 Varnish Backends on cp36 is OK: All 19 backends are healthy
[06:24:26] PROBLEM - db181 Current Load on db181 is WARNING: LOAD WARNING - total load average: 0.07, 0.35, 11.37
[06:26:26] RECOVERY - db181 Current Load on db181 is OK: LOAD OK - total load average: 0.35, 0.36, 10.04
[06:45:51] PROBLEM - cloud17 Puppet on cloud17 is CRITICAL: CRITICAL: Puppet has 1 failures. Last run 2 minutes ago with 1 failures. Failed resources (up to 3 shown): Service[ulogd2]
[06:48:24] !log [macfan@mwtask181] sudo -u www-data php /srv/mediawiki/1.41/maintenance/run.php /srv/mediawiki/1.41/maintenance/importImages.php --wiki=beidipediawiki /home/macfan/images --search-recursively (END - exit=0)
[06:48:32] Logged the message at https://meta.miraheze.org/wiki/Tech:Server_admin_log
[07:13:50] RECOVERY - cloud17 Puppet on cloud17 is OK: OK: Puppet is currently enabled, last run 23 seconds ago with 0 failures
[12:44:37] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query line.pm. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL
[13:01:42] PROBLEM - ping6 on cp41 is CRITICAL: PING CRITICAL - Packet loss = 28%, RTA = 194.92 ms
[13:05:44] RECOVERY - ping6 on cp41 is OK: PING OK - Packet loss = 0%, RTA = 164.83 ms
[13:13:39] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.mahdiruiz.line.pm All nameservers failed to answer the query.
[13:42:40] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query line.pm. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL
[13:43:47] PROBLEM - wiki.walkscape.app - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.walkscape.app' expires in 15 day(s) (Thu 21 Mar 2024 01:13:18 PM GMT +0000).
[13:43:59] [miraheze/ssl] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/4bacf9fbc211...299014742613
[13:44:02] [miraheze/ssl] MirahezeSSLBot 2990147 - Bot: Update SSL cert for wiki.walkscape.app
[14:06:26] PROBLEM - ping6 on cp41 is CRITICAL: PING CRITICAL - Packet loss = 16%, RTA = 177.89 ms
[14:08:26] RECOVERY - ping6 on cp41 is OK: PING OK - Packet loss = 0%, RTA = 157.36 ms
[14:31:55] PROBLEM - ping6 on cp41 is CRITICAL: PING CRITICAL - Packet loss = 28%, RTA = 160.15 ms
[14:33:56] RECOVERY - ping6 on cp41 is OK: PING OK - Packet loss = 0%, RTA = 168.90 ms
[14:43:21] RECOVERY - wiki.walkscape.app - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.walkscape.app' will expire on Mon 03 Jun 2024 12:43:53 PM GMT +0000.
[15:12:30] RECOVERY - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is OK: SSL OK - wiki.mahdiruiz.line.pm reverse DNS resolves to cp36.wikitide.net - CNAME OK
[16:30:56] @Site Reliability Engineers is Infra still exploring the viability of Graylog, as [[Tech:Graylog]] says? I feel Miraheze already has plenty of experience with Graylog
[16:30:56] https://meta.miraheze.org/wiki/Tech:Graylog
[18:00:08] PROBLEM - wiki.seamly.net - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.seamly.net All nameservers failed to answer the query.
[18:04:34] PROBLEM - wiki.seamly.io - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.seamly.io All nameservers failed to answer the query.
[18:06:55] PROBLEM - mhdh.pj568.eu.org - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'mhdh.pj568.eu.org' expires in 15 day(s) (Thu 21 Mar 2024 05:54:02 PM GMT +0000).
[18:07:09] [miraheze/ssl] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/299014742613...1f750367e342
[18:07:13] [miraheze/ssl] MirahezeSSLBot 1f75036 - Bot: Update SSL cert for mhdh.pj568.eu.org
[18:13:23] PROBLEM - wiki.moores.tech - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.moores.tech' expires in 15 day(s) (Thu 21 Mar 2024 06:05:31 PM GMT +0000).
[18:13:34] [miraheze/ssl] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/1f750367e342...05746079a911
[18:13:36] [miraheze/ssl] MirahezeSSLBot 0574607 - Bot: Update SSL cert for wiki.moores.tech
[18:22:06] PROBLEM - wiki.zamnhacking.net - LetsEncrypt on sslhost is WARNING: WARNING - Certificate 'wiki.zamnhacking.net' expires in 15 day(s) (Thu 21 Mar 2024 06:04:37 PM GMT +0000).
[18:22:18] [miraheze/ssl] MirahezeSSLBot pushed 1 commit to master [+0/-0/±1] https://github.com/miraheze/ssl/compare/05746079a911...2252f57ee3a8
[18:22:21] [miraheze/ssl] MirahezeSSLBot 2252f57 - Bot: Update SSL cert for wiki.zamnhacking.net
[18:30:09] PROBLEM - wiki.seamly.net - reverse DNS on sslhost is WARNING: SSL WARNING - rDNS OK but records conflict. {'NS': ['ns3.dreamhost.com.', 'ns1.dreamhost.com.', 'ns2.dreamhost.com.'], 'CNAME': 'wiki.seamly.io.'}
[18:34:08] RECOVERY - wiki.seamly.io - reverse DNS on sslhost is OK: SSL OK - wiki.seamly.io reverse DNS resolves to cp37.wikitide.net - CNAME OK
[18:35:50] RECOVERY - mhdh.pj568.eu.org - LetsEncrypt on sslhost is OK: OK - Certificate 'mhdh.pj568.eu.org' will expire on Mon 03 Jun 2024 05:07:04 PM GMT +0000.
[18:44:35] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query line.pm. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL
[18:51:11] RECOVERY - wiki.zamnhacking.net - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.zamnhacking.net' will expire on Mon 03 Jun 2024 05:22:12 PM GMT +0000.
[19:12:08] RECOVERY - wiki.moores.tech - LetsEncrypt on sslhost is OK: OK - Certificate 'wiki.moores.tech' will expire on Mon 03 Jun 2024 05:13:28 PM GMT +0000.
[19:13:36] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.mahdiruiz.line.pm All nameservers failed to answer the query.
[19:42:37] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query line.pm. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL
[20:42:31] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is CRITICAL: rDNS CRITICAL - wiki.mahdiruiz.line.pm All nameservers failed to answer the query.
[21:12:31] PROBLEM - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is WARNING: NoNameservers: All nameservers failed to answer the query line.pm. IN NS: Server 2606:4700:4700::1111 UDP port 53 answered The DNS operation timed out.; Server 2606:4700:4700::1111 UDP port 53 answered SERVFAIL
[21:42:33] RECOVERY - wiki.mahdiruiz.line.pm - reverse DNS on sslhost is OK: SSL OK - wiki.mahdiruiz.line.pm reverse DNS resolves to cp36.wikitide.net - CNAME OK