[10:10:03] what's going on with db1225:3313. It has been down for days
[10:10:47] Amir1: T336326 maybe?
[10:10:48] T336326: db1225 crashed (CPU 1 machine check error detected) - https://phabricator.wikimedia.org/T336326
[10:11:08] sigh, I swear I searched in my inbox
[10:11:18] oh, no, that's been closed for ages, sorry
[10:11:27] yeah, that's from May
[10:12:09] My guess is that it's part of migration of s6 to 10.6 but it's multiinstance and the s3 instance has not been brought back online
[10:15:30] Amir1: most likely yes
[10:15:33] I probably forgot
[10:15:50] Although I am not sure
[10:16:10] Ah no
[10:16:18] That is the backup source
[10:16:22] That jaime was working on
[10:20:36] no rush or anything like that
[10:21:07] I'm starting a schema change on s3, it'll reach db1125 there in a couple of days
[11:53:36] I have cleaned up all the hosts in orchestrator to reflect the situation after the changes required on the backup sources for the 10.6 migration
[12:03:25] Amir1: this explains all changes produced: https://phabricator.wikimedia.org/T334650#9047479
[12:03:37] in theory zarcillo should reflect the ending state
[12:55:02] marostegui: sorry, how to flush the binlog?
[12:55:09] (to rotate to a new one)
[12:55:11] flush binary logs;
[12:55:17] ah thanks
[14:22:20] Amir1: I just noticed db1196 is sanitarium master, please change it back to RBR
[14:23:50] oh no :(
[14:23:52] okay
[14:24:23] sigh, how did I miss that
[14:24:36] we really need to have some criteria to easily pick a candidate master
[14:24:44] and then make it automatic
[14:25:16] make sure to also revert the dbctl and orchestrator tagging (if it was already done too)
[14:26:08] nah, I'm too lazy. haven't done it yet
[14:26:54] changed it back to RBR