[13:19:32] Amir1: db1100 old s5 master, is still depooled from what I can see
[13:19:54] marostegui: I finished my last change on it a couple hours ago
[13:20:04] I think it's safe to repool now
[13:20:36] ah ok, you take care of it?
[13:20:48] yup
[13:20:56] thanks
[13:24:15] marostegui: I'm not believing my eyes. Can you double check something for me? can you do processlist on db1166 and check how long the proofread queries have been taking?
[13:24:38] sure
[13:24:59] They come from mwmaint1002 no?
[13:25:03] yup
[13:25:40] I guess they can be killed :)
[13:25:55] yeah but is that 700 hours there? right?
[13:25:58] yes
[13:26:03] and the explain isn't that bad really
[13:28:03] killed them all
[15:31:30] Hi DBAs! The Campaigns team would like to create the schema for the CampaignEvents extension this week (https://phabricator.wikimedia.org/T318595). We read the documentation and can (try and) do it ourselves, but I'd love it if at least one of you could be around at the time just to make sure we don't destroy everything, or if we need help. We were thinking of doing it on Wednesday at 15:00 UTC, but then I learned it would conflict with the tech dept meeting. What would a good time be for you?
[15:33:13] hi :) can I pick your brains on T322039?
[15:33:14] T322039: [toolsdb] clouddb1002 stopped replicating from clouddb1001 - https://phabricator.wikimedia.org/T322039
[15:42:43] dhinus: I'm on my phone but it looks like you have data inconsistencies
[15:43:12] Daimona: sure thing, drop me an invite
[15:43:20] dhinus: let me take a look
[15:43:32] marostegui: isn't it like midnight for you? go rest!
[15:43:49] haha
[15:47:17] Amir1: thank you, I invited you. Please let me know if this time does not work for you!
[15:49:26] dhinus: was I too late? the replication looks flowing to me
[15:49:29] Amir1: I added a table to Replicate_Wild_Ignore_Table
[15:49:36] and that seemed to have restarted things
[15:50:11] yeah I see
[15:50:29] come to -cloud-admin so we don't duplicate info
[15:50:51] but I'll add everything to the phab as well
[15:58:33] dhinus: you aware that you are creating more inconsistencies with that option, right?
[15:58:45] if it is ok to ignore that data, then that's fine of course
[16:00:09] marostegui: yes, but that table was apparently problematic, a similar error appeared two years ago, and a patch to exclude it from replication was created (but not merged) last July
[16:00:19] so I think it's an acceptable tradeoff for now
[16:00:58] dhinus: ok, yeah. there used to be tables on the old toolsdb servers that were not replicated (their users were informed). so this might be another case of those
[16:01:37] yes, though the other ones were primarily excluded because of the size, not inconsistencies. it would be good to know why this inconsistency appeared multiple times on that table...
[16:01:56] good point about informing the user
[16:02:51] dhinus: Basically we told them they their data was not going to be present on the replica for $reasons so if the master failed they would have no data on the new master
[16:03:36] dhinus: https://phabricator.wikimedia.org/T127164
[16:03:38] dhinus: they inconsistencies could have been there from before and simply showed up now cause the row was touched (or if you switched from SBR to RBR which I don't think?)
[16:04:23] yeah I also though maybe the table is touched every x months, and when it happens, it crashes the replication
[16:04:26] *thought
[16:04:41] not sure about SBR/RBR
[16:04:55] I think there is also: https://phabricator.wikimedia.org/T236101
[16:04:57] if you are using RBR it will break replication for sure
[16:05:50] jynus: thanks for the links
[16:06:18] to be clear, those happened long time ago and many things may have changed, etc.
[16:06:36] but I think for context they are useful