[00:57:54] alr, just wanted to make sure I didn't remove your shell access too early in case you still needed it lol
[17:25:47] Not urgent, but was curious, https://grafana.wikitide.net/d/GtxbP1Xnk/mediawiki appears to be broken? For me it shows no data in any of the graphs
[17:26:17] Nevermind it fixed itself
[17:26:53] you're welcome :3
[17:27:09] Lol
[18:05:05] ^
[18:05:19] Yeah I saw
[21:19:02] Hmm, seems like dropping dbs is likely the cause of bad gateways
[21:19:13] Any ideas for how I could prevent that next time?
[21:19:20] we've still got at the very least 3k wikis to be dropped
[21:19:55] I don't really remember it causing issues in the past though
[21:21:03] db load isn't even high
[21:21:05] why are we down
[21:21:51] it definitely has to be db related as this happened yesterday at the exact same time I dropped dbs as well
[21:22:44] Could it be batch size?
[21:23:05] looks like we're partially back, meta at least is slowly loading again for me
[21:23:30] We're still down for me
[21:23:51] Yesterday's batch was quite small and it still happened
[21:24:04] and like I said I don't really remember there ever being issues in the past (i.e. pre-September 2024 when the last deletion happened)
[21:24:05] nothing in the logs on mw161 or db192
[21:24:16] maybe we should figure out a script that makes it wait 15-20 seconds between each drop?
[21:24:51] nvm we're still down
[21:25:33] db load is quite high
[21:25:38] is it?
[21:25:39] which db
[21:25:45] db192 wouldn't be affected as the dropped wikis aren't on there
[21:25:57] yeah but meta was unreachable
[21:25:59] which is on db192
[21:26:08] drops are on db161, 171, 151, 181
[21:26:46] then it doesn't make any sense that wikis on db192 would be down
[21:27:06] mhglobal?
[21:27:16] db192 appears to be fine
[21:27:21] the issue is that there are no errors anywhere
[21:27:25] neither on appservers nor on dbs
[21:27:36] Weird
[21:27:46] doesnt affect communities
[21:28:26] it's altogether weird that wikis are down due to dropping though as like I said it hasn't happened before afaik
[21:28:46] It's definitely weird
[21:29:01] it's not like I pasted 5k drop entries
[21:30:09] @cosmicalpha any idea?
[21:30:46] some servers are still going down for some reason
[21:31:11] Could it be a ManageWiki cache problem rather than db?
[21:32:33] my wiki is on db171 and im experiencing issues
[21:32:41] all wikis are
[21:32:59] Is the website down?
[21:33:00] (experiencing issues)
[21:33:03] Yes
[21:33:06] Ah alright
[21:33:11] Tought it was my internet
[21:33:39] :peeble:
[21:34:39] db192 has zero active queries in the processlist
[21:35:14] This seems to be related to cp?
[21:35:29] nginx requests have dropped so might be
[21:35:40] https://cdn.discordapp.com/attachments/1006789349498699827/1460024387221590178/image.png?ex=6965692b&is=696417ab&hm=487e46cd23c2a29559486a1269c4937ade324ef4c3d9788f0d19ef268c1c2148&
[21:36:05] meanwhile varnish requests look normal
[21:37:11] Icinga reports everything is fine again
[21:37:22] I forced pools back up then to auto again
[21:39:56] alr
[21:40:10] so ig it might not be related to the wiki deletions then
[21:40:27] It could be
[21:40:33] how?
[21:41:52] Those caused mw servers to fail to connect to db for a moment which caused varnish to mark servers as down, but since all servers were marked it wasn't responding to the ping reliably for healthcheck. Thats just an assumption though It also could be totally unrelated.
[21:44:56] oh
[21:45:26] then it would also make sense that some servers were healthy for a short amount of time and then critical again
[21:46:43] Yesterday I thought it could be a coincidence but now for the second time I'm quite sure it isn't
[21:47:12] Though I don't know what I should do next time to prevent it, other than figuring out a sort of "sleep" command between drops
[21:49:31] Would DO SLEEP(time in seconds); work?
[21:49:48] Iirc we use MariaDB
[21:50:16] Oh I didn't know you could
[21:50:40] In that case yeah that'd be perfect
[21:51:04] yes, that should work, I used it once
[21:51:29] Does it set it for the whole session though or do I have to insert it in between each drop (not manually of course)?
[21:51:47] You would do query; sleep; query; I believe
[21:52:03] yep otherwise it only sleeps once
[21:52:22] Thought so, should be quite easy to insert in between each line
[21:53:02] Idk why we dont have a script, wouldnt think it would be hard
[21:53:53] Just have the script replace a variable after DROP with the DB name from a textfile, then have it do a sleep
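
The script idea from the last few messages could look roughly like the sketch below. It is not an existing script from this discussion: the function name `build_drop_script`, the 15-second default (taken from the "15-20 seconds" suggestion above), the name check, and the use of `IF EXISTS` are all assumptions made for illustration. It only generates SQL text with a `DO SLEEP(...)` after every drop; it does not connect to any database.

```python
import re

# Assumption: database names consist only of letters, digits and underscores.
# Anything else is rejected so a stray line cannot end up inside the SQL.
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def build_drop_script(db_names, sleep_seconds=15):
    """Return SQL text that drops each database and sleeps after each drop.

    db_names can be any iterable of strings, e.g. an open text file with
    one database name per line. Blank lines are skipped.
    """
    sleep_seconds = int(sleep_seconds)
    statements = []
    for raw in db_names:
        name = raw.strip()
        if not name:
            continue
        if not _VALID_NAME.fullmatch(name):
            raise ValueError(f"refusing suspicious database name: {name!r}")
        # query; sleep; query; sleep; ... as described in the chat
        statements.append(f"DROP DATABASE IF EXISTS `{name}`;")
        statements.append(f"DO SLEEP({sleep_seconds});")
    return "\n".join(statements)
```

One possible way to use it: open a text file with one database name per line, pass the file object to `build_drop_script`, write the result to a `.sql` file, and review that file before feeding it to the MariaDB client. Keeping generation separate from execution means the list can be checked by a second person before anything is dropped.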