[08:06:21] jynus: are you using db1133 for anything? (mariadb::shard: 'test-s1') I want to take it and move it for test-s4 to test orchestrator recoveries (https://phabricator.wikimedia.org/T322993)
[08:06:32] moritzm: I am finished with cumin1001 btw
[08:10:29] excellent, then we should be all set for the reboot later
[08:21:23] marostegui: not at the moment
[08:21:38] jynus: cool, it shouldn't take long (last famous words)
[10:06:59] moritzm: let me know when reboot is done as I want to upgrade a package there
[10:07:54] you can go ahead, the cookbook is still active, but that's only because it's waiting for the keyholder to be fully armed again (needs netops to do it for the homer passphrase)
[10:08:10] thanks, I will just do an apt install
[10:08:11] but Cumin etc. can be used already
[10:10:24] there is a long list of upgradable packages there- I will only upgrade mine
[11:22:59] https://phabricator.wikimedia.org/P40749 this makes me cry
[11:51:14] "just use GTID"- they said
[11:54:42] <_joe_> marostegui: LOL wth is that
[11:55:05] _joe_: your worst nightmare
[11:55:34] <_joe_> your worst nightmare, I'd say :P
[12:02:51] _joe_: I'd call it the definition of technical debt, except it is not create by but, but by someone else and cannot be removed 😭
[12:03:02] *created by you
[12:03:40] <_joe_> I mean, I get it is complicated if you're doing multi-source replication
[12:03:52] <_joe_> which is in itself evil
[12:04:03] <_joe_> but otherwise, "how hard can it be"
[12:55:12] At least FLUSH BINARY LOGS DELETE_DOMAIN_ID (which was implemented after we filed the bug years ago) seems to be working nicely
[12:55:31] oh, that's great!
[12:55:40] Yeah, so far it seems to be working
[12:55:45] But it is scary to run it
[12:56:19] please document it on wikitech somewhere, even if it is with a big //TODO / to be tested
[12:56:31] it would impact my work on automation of recoveries too
[12:56:38] yeah
[12:56:39] Template: WCPGW?
[12:56:52] I am still testing it on the test cluster to recover things with GTID via orchestrator
[12:57:01] Emperor: no idea what that means XD
[12:57:15] "what could possible go wrong"
[12:57:19] What Could Possibly Go Wrong
[12:57:19] ah
[12:57:20] *bly
[12:57:21] XD
[12:57:41] the fact that it exists is already a great step
[12:57:42] Then yes, not running that in production any time soon XD
[12:58:03] jynus: yeah, it was the workaround they implemented after the whole mess with gtid+multisource we found years ago
[12:58:52] s/mess/whole fundamental design model and architecture problems/ buy yes
[12:58:57] *but
[12:58:59] XD
[12:59:20] https://jira.mariadb.org/browse/MDEV-12012 wow, it was 2017
[12:59:30] I didn't remember it was that long ago
[12:59:38] what is your plan with orchestrator, btw (just curious) - gtid, binlog, pseudo-gtids?
[12:59:48] I would like to stick to gtid if possible
[12:59:53] If not, binlog
[13:00:35] whatever works, I would say at this point :-D
[13:00:39] haha yeah
[13:01:57] did you see our convo on -sre about hashing?
[13:02:21] I think that shouldn't affect dbas much, that I can think of
[13:02:41] but for us backups and media storage could have a lot of impact on performance
[13:02:51] (in a good way)
[13:46:24] heh, using FLUSH BINARY LOGS DELETE_DOMAIN_ID is the way to go if we want to let orchestrator recover things via GTID from what I am testing now
[13:46:28] I will send a summary later on the task
[13:46:40] but looks promising......we only have to go and clean it up _everywhere_ XD
[13:47:13] But otherwise orchestrator will pick the first domain_id from https://phabricator.wikimedia.org/P40749 which might or might not be the current master
[14:21:08] :-(
[15:02:59] marostegui: db1132 in s1 has both /srv/sqldata and /srv/sqldata.s1 (while it's not multiinstance), that's confusing my poor restart script. I guess the .s1 one is from a transfer/clone. shall I delete it?
[15:03:54] might have been a left over yes
[15:04:01] can be deleted yes
[15:04:09] awesome
[15:04:29] I depool before deletion, just in case
[15:05:20] I hope I can get to automating provisioning and healing next Q
[15:08:01] the directory is last written in March, looks good for removal
[15:10:36] https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=db1132&var-datasource=thanos&var-cluster=wmcs&from=1669215583633&to=1669216222447&viewPanel=28 xD
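
Note on the 13:46 messages above: the cleanup being discussed amounts to finding every GTID domain that is not the current master's and passing those to FLUSH BINARY LOGS DELETE_DOMAIN_ID. A rough sketch of that selection step follows, assuming the GTID state is available as a comma-separated string of MariaDB domain-server-sequence triples (as in gtid_binlog_state). The function names and the sample values are hypothetical and not taken from P40749; the sketch only builds the statement text and runs nothing against a server.

```python
def stale_domain_ids(gtid_state: str, current_domain_id: int) -> list[int]:
    """Return the sorted domain IDs in a MariaDB GTID list, excluding the current one.

    gtid_state is assumed to look like "domain-server-seq,domain-server-seq,...".
    """
    domains = set()
    for gtid in gtid_state.split(","):
        gtid = gtid.strip()
        if not gtid:
            continue  # tolerate empty entries and trailing commas
        domain, _server_id, _seq_no = gtid.split("-")
        domains.add(int(domain))
    domains.discard(current_domain_id)
    return sorted(domains)


def build_delete_statement(domain_ids: list[int]) -> str:
    """Build the FLUSH BINARY LOGS DELETE_DOMAIN_ID statement text for the given domains."""
    if not domain_ids:
        raise ValueError("no stale domain IDs to delete")
    id_list = ",".join(str(d) for d in domain_ids)
    return f"FLUSH BINARY LOGS DELETE_DOMAIN_ID=({id_list})"


if __name__ == "__main__":
    # Made-up example state: three domains, of which 171970637 is the current master's.
    state = "171970637-171970637-100,171978787-171978787-5,0-1-2"
    print(build_delete_statement(stale_domain_ids(state, 171970637)))
```

Whatever this prints should be reviewed by hand before it goes anywhere near a real host, in line with the "scary to run it" comment at 12:55.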