[04:59:04] I am going to start disconnecting codfw -> eqiad replication
[06:55:58] There is something I am not getting here: https://phabricator.wikimedia.org/P17278
[06:57:01] It looks like it is not replicating optimize (but it should I just double checked the doc): https://mariadb.com/kb/en/optimize-table/
[07:06:14] I am not sure if I should open a bug or it is something that I am missing, cause this looks very very weird
[07:42:49] * Emperor knows nothing, but: it says it's doing recreate + analyze instead of optimise, might that be making a difference?
[07:43:20] Nah, that's the usual thing when working with InnoDB tables
[07:43:22] So that's expected
[07:43:44] I really cannot believe this is a bug, it might be something I am not seeing
[07:44:38] Although after reporting this one https://jira.mariadb.org/browse/MDEV-13175 it wouldn't really surprise me
[08:07:38] I am out of ideas
[08:18:52] sorry, I don't have any more daft questions either
[08:18:59] haha no worries!
[08:19:12] I was basically expressing my frustration!
[08:19:24] I am still trying to think what I could be doing wrong, but all the config seems correct
[08:50:49] so you remember the last time you tested it and worked, was it on 10.1 or on 10.4?
[08:51:30] I don't remember :(
[08:51:51] But probably on 10.1 worked cause I did run it for commonswiki.image
[08:51:59] And that was done before we migrated the masters to 10.4
[08:52:07] my question was that if it was a recent 10.4 version, maybe we could check recent commits
[08:52:23] I have looked in their jira, and I don't find anything related at least on bugs
[08:52:36] I found yours :-D
[08:52:46] Yeah, I just opened it XD
[08:52:57] https://jira.mariadb.org/browse/MDEV-26618
[08:53:12] I still think it is something I am doing wrong or misconfigured
[08:53:55] maybe they decided to remove replication there
[08:54:01] but didn't document it
[08:54:11] (wouldn't be surprised)
[08:54:36] That'd be: grrrr
[08:54:47] the other thing I could think of
[08:54:59] I just tested on pc2011 and there it works
[08:55:01] is some kind of weird gtid issue
[08:55:34] is it written to local binlog?
[08:55:40] yeah
[08:56:15] But nothing relevant on the config diff between those two hosts https://phabricator.wikimedia.org/P17280
[08:56:25] which gtid mode is set on the host you run it?
[08:56:33] Same on both
[08:56:34] Slave pos
[08:56:46] try removing it is my only idea
[08:56:55] setting it to no?
[08:57:00] yeah
[08:57:03] let me see
[08:57:39] it is a guess, ok? Don't expect I have a huge insight
[08:57:48] I know ;)
[08:57:57] At this point it is worth trying everything
[08:58:29] my idea is "something something- out of band changes with domain 0 are not replicated while using slave_pos"
[09:01:08] no difference :(
[09:01:18] ok, hey, I tried!
[09:01:50] the weird thing is it works elsewhere but not there
[09:02:01] I have no idea what else it could be to be honest
[09:02:16] I have tried different tables too
[09:02:35] And upgrading from 10.4.19 to 10.4.21
[09:02:56] But it happens on 3 hosts from s5
[09:03:06] (The 3 I have tried)
[09:03:12] Actually 4, cause the eqiad master didn't work either
[09:03:16] so 100% failure rate
[09:04:23] let me try a different database in a different section
[09:04:32] I can actually try s1, which is 10.1
[09:04:43] how do you know it is not replicated? on the paste you show the alter?
[09:04:59] with this I mean- maybe it gets replicated by fails
[09:05:03] *but
[09:05:06] yeah, the alter appears on the binlog
[09:05:10] but the optimize doesn't
[09:05:15] and on the relay
[09:05:41] ah, I see, you did the alter manually
[09:05:45] yeah
[09:06:04] and if I run the optimize on the slaves manually, it also works (no failure)
[09:06:12] and nothing on logs either about any possible failure
[09:06:49] so let me try two things: 1) run the optimize on a different section but for the same table and 2) run the optimize on 10.1 (s1)
[09:08:21] workaround, can you do "ALTER TABLE logging FORCE;"? hopefully it would have the same effect?
[09:08:33] the alter table does get replicated
[09:09:07] so, on s2, the optimize doesn't appear on binlog
[09:09:09] let's try on s1 (10.1)
[09:09:25] let me know the host and the timestamp of when you do it
[09:09:32] for 10.1 host?
[09:09:44] in a location where it doesn't work
[09:10:16] ah, you can check db2104 and binlog db2104-bin.002653 (if you grep by alter table, you'll see it, if you grep by optimize you won't)
[09:10:35] and you did them like 8 minutes apart?
[09:10:52] no, less than that for db2104
[09:14:11] and on 10.1 it does show up on the binlog
[09:14:15] going to report that too
[09:14:43] ok, another wild theory
[09:15:09] do you see the effects of the optimize, or is it difficult to notice?
[09:15:32] e.g. maybe on 10.4 it changes the optimize to another sql statement
[09:15:32] hehe I also thought about that, and it is easy to prove it was not done as the reduction from 54GB to 14GB didn't happen
[09:15:44] and once done manually on the slaves it went to 14GB
[09:16:10] I see, so after all of that I see it as a bug, or some super-weird configuration
[09:16:36] I cannot believe this could be a bug, meaning...how can it be, it is too crazy
[09:16:55] but if it is a custom state/config, what?
[09:16:58] And why on pc hosts it doesn't happen?
[09:18:32] sadly I reached the end of my "wild theories :-)"
[09:20:45] haha
[09:23:33] you can still do the maintenance meanwhile, right?
[09:23:45] yeah, but I need to review what was done in eqiad..
[09:23:50] as we did quite a few optimizes there
[09:23:56] :-(
[09:24:04] yeah :(
[09:26:36] My worry is that they won't be able to reproduce it and close it :(
[09:53:39] jynus: https://phabricator.wikimedia.org/T290057#7358228 :(
[09:57:40] marostegui: :((((((((((((((((((((((((((((((((((((((((((((((((((((((((
[09:58:16] :_(
[09:59:49] do you need a cheerleader?
[10:00:09] hahahaha
[10:00:33] I am happy this didn't affect the commons.image optimize
[10:00:37] I am _so_ glad
[10:00:51] joy is unconfined
[10:00:57] ...in the lower bound
[10:01:00] XDDD
[12:53:20] jynus: mariadb has found the issue
[12:53:27] oh
[12:53:40] which is pretty lame I think
[12:53:40] on the ticket?
[12:53:55] yep
[12:53:57] her last comment
[12:54:06] https://jira.mariadb.org/browse/MDEV-26618?focusedCommentId=199592&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-199592
[12:54:23] this is soooooooo lame
[12:56:00] that's not on "notable changes"!
[12:56:53] yeah, it is definitely something very silly
[12:57:04] and I cannot find it yet on changelog
[12:57:33] that explains why it did work on parsercache
[12:57:40] parsercache isn't read-only
[12:57:48] and why it worked on 10.1
[12:58:59] can you ask for clarification on "we have fixed some issues related to read only replicas" a ticket or a commit?
[12:59:45] I really cannot see it on changelog on that version
[13:00:03] sure
[13:00:12] not in a bad way
[13:00:13] meeting now, but will do it once finished
[13:00:16] as in, I want to know more
[13:00:23] yeah, no rush
[13:00:25] Feel free to chime in too!
[13:03:37] it has to be https://jira.mariadb.org/browse/MDEV-21407
[13:09:28] So OPTIMIZE TABLE doesn't get replicated to read-only replicas? That's ... exciting
[13:33:19] it is sooooo stupid yeah
[13:33:37] And not writing that on the specific page of the doc is....
[13:35:44] I love how it has been assigned to monty now XDDD
[13:35:59] For Emperor https://en.wikipedia.org/wiki/Michael_Widenius
[13:37:32] This looks like a great mess if Sergei is also not understanding this change XD
[13:39:01] :)
[13:49:38] Just talked to Sergei, he's going to double check with Monty
[13:53:03] You know all the right people :)
[13:53:26] The mysql world is pretty small so both Jaime and myself pretty much everyone
[13:53:33] pretty much know
[14:08:47] so I am not too worried about behaviour changes- sometimes you don't have enough context
[14:09:12] but I 100% agree with Sergei, it doesn't seem properly documented or justified
[14:10:11] this seems to me like a major external change, and was not even mentioned on release notes or even detailed changelog
[14:10:40] you== I meant as "one"
[14:11:39] which now that I read was exactly what you commented, marostegui
[14:12:57] :-)
[14:13:08] I read the ticket in the inverse order :-)
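
Note on the binlog check discussed in the log: the "grep by alter table / grep by optimize" comparison can be sketched as a small script. This is only an illustration under stated assumptions: the binlog has already been decoded to text (for example with `mysqlbinlog`), each statement starts on its own line, and the helper name `count_statements` and the sample lines are hypothetical, not taken from the hosts mentioned above.

```python
import re
from typing import Dict, Iterable

# Hypothetical patterns for the two statement kinds compared in the log.
STATEMENT_PATTERNS = {
    "alter": re.compile(r"^\s*ALTER\s+TABLE\b", re.IGNORECASE),
    "optimize": re.compile(
        r"^\s*OPTIMIZE\s+(?:NO_WRITE_TO_BINLOG\s+|LOCAL\s+)?TABLE\b",
        re.IGNORECASE,
    ),
}


def count_statements(lines: Iterable[str], table: str) -> Dict[str, int]:
    """Count ALTER TABLE and OPTIMIZE TABLE lines that mention `table`.

    `lines` is assumed to be decoded binlog text, one statement start per line.
    """
    counts = {kind: 0 for kind in STATEMENT_PATTERNS}
    table_re = re.compile(r"\b" + re.escape(table) + r"\b", re.IGNORECASE)
    for line in lines:
        if not table_re.search(line):
            continue  # statement is about some other table
        for kind, pattern in STATEMENT_PATTERNS.items():
            if pattern.search(line):
                counts[kind] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample resembling the symptom: the ALTER is present, the OPTIMIZE is not.
    sample = [
        "# at 4711",
        "ALTER TABLE logging FORCE",
        "/*!*/;",
        "COMMIT",
    ]
    print(count_statements(sample, "logging"))
```

A non-zero "alter" count together with a zero "optimize" count for the same table would match what was seen on db2104; on a host where the statement is logged, both counts would be non-zero.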