[00:42:01] [puppet] MacFan4000 pushed 1 commit to master [+0/-0/±1] https://git.io/JEEid
[00:42:02] [puppet] MacFan4000 e5f82d0 - keep the puppet reports directory tidy - only keep reports for the past week this should help save a fair amount of disk space on bots1/tools1
[00:47:18] Notice: /Stage[main]/Profile::Base/Tidy[/var/lib/puppet/reports]: Tidying 57977 files
[00:47:26] yeah, that's a lot
[00:47:59] and that's just for bots1
[00:54:26] for tools1 it's another 57000+ files
[01:09:23] PROBLEM - load on tools1 is WARNING: WARNING - load average: 5.19, 3.71, 2.19
[01:10:23] RECOVERY - load on tools1 is OK: OK - load average: 3.08, 3.40, 2.18
[01:12:52] on tools1 we've regained ~10g of space
[01:19:09] for bots1 ~4g of space is regained
[07:19:34] PROBLEM - swap on bots1 is UNKNOWN: Remote Icinga instance '100.node4.net.fosshost.org' is not connected to '112.node1.net.fosshost.org'
[07:19:40] PROBLEM - discordirc-buff service on bots1 is UNKNOWN: Remote Icinga instance '100.node4.net.fosshost.org' is not connected to '112.node1.net.fosshost.org'
[07:19:42] PROBLEM - discordirc-mh service on bots1 is UNKNOWN: Remote Icinga instance '100.node4.net.fosshost.org' is not connected to '112.node1.net.fosshost.org'
[07:19:46] PROBLEM - Discord-relay on bots1 is UNKNOWN: Remote Icinga instance '100.node4.net.fosshost.org' is not connected to '112.node1.net.fosshost.org'
[07:19:52] PROBLEM - discordirc-fhl service on bots1 is UNKNOWN: Remote Icinga instance '100.node4.net.fosshost.org' is not connected to '112.node1.net.fosshost.org'
[07:19:58] PROBLEM - ping4 #page on bots1 is UNKNOWN: Remote Icinga instance '100.node4.net.fosshost.org' is not connected to '112.node1.net.fosshost.org'
[07:20:00] PROBLEM - discordirc-fhf service on bots1 is UNKNOWN: Remote Icinga instance '100.node4.net.fosshost.org' is not connected to '112.node1.net.fosshost.org'
[07:20:02] PROBLEM - Host bots1 is DOWN: PING CRITICAL - Packet loss = 100%
[07:43:26] Reception123: we lost the bots
[07:45:37] RhinosF1: hmm, how?
[07:46:02] Reception123: hypervisors in LAX have too high packetloss
[07:46:09] Icinga was screaming before I joined
[07:46:11] oh
[07:46:26] well that's not fun
[07:48:09] Reception123: discord people can check https://status.mirahezebots.org/dashboard/issues/61289873386fb307bab05ad6 or follow us on Twitter
[07:48:23] https://status.mirahezebots.org/issues/61289873386fb307bab05ad6
[07:49:19] RhinosF1: I guess I should write up an announcement then
[07:49:31] Yep
[07:50:03] $log shift web traffic towards tools
[07:50:10] Saved item "shift web traffic towards tools"
[07:51:12] RhinosF1: how's "Due to a technical issue with the MirahezeBots servers the IRC-Discord relay service is temporarily unavailable. For more details check out: https://status.mirahezebots.org/ or follow @MirahezeB on twitter"
[07:51:40] Reception123: ye that's good
[07:51:46] ok, posting then
[08:16:56] RECOVERY - Host bots1 is UP: PING OK - Packet loss = 0%, RTA = 135.67 ms
[08:16:58] from /usr/lib/nagios/plugins/check_puppet_run:154:in `
'
[08:17:00] RECOVERY - ping6 #page on bots1 is OK: PING OK - Packet loss = 0%, RTA = 0.12 ms
[08:17:34] RECOVERY - swap on bots1 is OK: SWAP OK - 93% free (3714 MB out of 3996 MB)
[08:17:44] Reception123: 3 hours of emergency maintenance
[08:17:54] Hmm
[08:20:17] Reception123: it says 60 minutes downtime so hopefully no interruptions but we can only watch
[08:20:23] We're halfway into a 6 hour window
[08:20:44] Ack
[08:20:58] RECOVERY - Puppet on bots1 is OK: OK: Puppet is currently enabled, last run 17 seconds ago with 0 failures
[08:23:29] :)
[08:23:34] We're back all green
[08:26:23] Yay
[08:40:02] PROBLEM - discordirc-fhf service on bots1 is UNKNOWN: Remote Icinga instance '100.node4.net.fosshost.org' is not connected to '112.node1.net.fosshost.org'
[08:40:06] PROBLEM - Host bots1 is DOWN: PING CRITICAL - Packet loss = 100%
[08:56:02] RECOVERY - Host bots1 is UP: PING OK - Packet loss = 0%, RTA = 135.65 ms
[08:56:02] PROBLEM - ssh #page on bots1 is UNKNOWN: Remote Icinga instance '100.node4.net.fosshost.org' is not connected to '112.node1.net.fosshost.org'
[08:56:02] RECOVERY - streambot service on bots1 is OK: OK: Status of the systemd unit streambot
[08:56:02] from /usr/lib/nagios/plugins/check_puppet_run:154:in `
'
[08:56:03] RECOVERY - ping6 #page on bots1 is OK: PING OK - Packet loss = 0%, RTA = 0.15 ms
[08:56:06] RECOVERY - swap on bots1 is OK: SWAP OK - 93% free (3714 MB out of 3996 MB)
[08:56:11] RECOVERY - discordirc-buff service on bots1 is OK: OK: Status of the systemd unit discordircbuff
[08:56:15] RECOVERY - ping4 #page on bots1 is OK: PING OK - Packet loss = 0%, RTA = 0.09 ms
[08:56:17] RECOVERY - ufw service on bots1 is OK: OK: Status of the systemd unit ufw
[08:56:19] RECOVERY - sopel prod service on bots1 is OK: OK: Status of the systemd unit mirahezebotprodlibera
[08:56:21] RECOVERY - discordirc-fhl service on bots1 is OK: OK: Status of the systemd unit discordircfhlibera
[08:56:23] RECOVERY - Flask-site on bots1 is OK: HTTP OK: HTTP/1.1 200 OK - 5424 bytes in 0.065 second response time
[08:56:31] RECOVERY - discordirc-fhf service on bots1 is OK: OK: Status of the systemd unit discordircfhfreenode
[08:56:39] RECOVERY - Apache on bots1 is OK: PROCS OK: 11 processes with command name 'apache2'
[09:00:59] RECOVERY - Puppet on bots1 is OK: OK: Puppet is currently enabled, last run 16 seconds ago with 0 failures
[21:23:10] [puppet] RhinosF1 pushed 1 commit to RhinosF1-patch-1 [+0/-0/±1] https://git.io/JE2Z6
[21:23:12] [puppet] RhinosF1 596643e - base: clear logs older than a month We don't need them
[21:23:13] [puppet] RhinosF1 created branch RhinosF1-patch-1 - https://git.io/JJGvA
[21:23:15] [puppet] RhinosF1 opened pull request #206: base: clear logs older than a month - https://git.io/JE2ZP
[21:23:41] MacFan4000: I sent that if you want to deploy ^
[21:23:57] I also would like a check in icinga that runs the uptime command
[21:24:03] I don't care if it's always OK
[21:24:15] but I just want it there so we can see the output of it
[21:25:57] Ok, I’ll pull that but pretty soon I’ll be away until Sunday
[21:26:24] MacFan4000: ok
[21:26:44] After the weekend T274,275&276 would probably be cool too
[21:33:36] [puppet] MacFan4000 closed pull request #206: base: clear logs older than a month - https://git.io/JE2ZP
[21:33:37] [puppet] MacFan4000 pushed 1 commit to master [+0/-0/±1] https://git.io/JE2nD
[21:33:39] [puppet] RhinosF1 3602ed3 - base: clear logs older than a month (#206) We don't need them
[21:38:30] [puppet] MacFan4000 deleted branch RhinosF1-patch-1
[21:38:32] [puppet] MacFan4000 deleted branch RhinosF1-patch-1 - https://git.io/JJGvA
[21:38:36] [puppet] MacFan4000 pushed 1 commit to revert-206-RhinosF1-patch-1 [+0/-0/±1] https://git.io/JE2cn
[21:38:37] [puppet] MacFan4000 97dff22 - Revert "base: clear logs older than a month (#206)" This reverts commit 3602ed3ab1bbf68d68db2443a8c396711603a410.
[21:38:39] [puppet] MacFan4000 created branch revert-206-RhinosF1-patch-1 - https://git.io/JJGvA
[21:38:41] [puppet] MacFan4000 opened pull request #207: Revert "base: clear logs older than a month" - https://git.io/JE2cC
[21:38:48] [puppet] MacFan4000 closed pull request #207: Revert "base: clear logs older than a month" - https://git.io/JE2cC
[21:38:50] [puppet] MacFan4000 pushed 1 commit to master [+0/-0/±1] https://git.io/JE2cW
[21:38:51] [puppet] MacFan4000 393ec96 - Revert "base: clear logs older than a month (#206)" (#207) This reverts commit 3602ed3ab1bbf68d68db2443a8c396711603a410.
[21:38:56] [puppet] MacFan4000 deleted branch revert-206-RhinosF1-patch-1 - https://git.io/JJGvA
[21:38:58] [puppet] MacFan4000 deleted branch revert-206-RhinosF1-patch-1