[07:21:33] greetings
[10:21:45] hello!
[10:22:12] o/
[10:22:16] taking it slowly today as I worked many extra hours over the last 2 days :)
[10:22:40] I'll try to set up the new tools-db host and get it in sync
[10:22:59] T409287
[10:23:00] T409287: [toolsdb] Destroy tools-db-4 and create new host - https://phabricator.wikimedia.org/T409287
[10:40:55] and of course there's a catch: I thought I could upgrade to trixie, but mariadb 10.6 is only available on bookworm :(
[10:41:26] so I have to start over and stick with bookworm for now
[10:44:00] but maybe there's some good news: it looks like recent versions of Cinder improved the snapshot handling
[10:44:18] I can now delete a snapshot even if there are volumes derived from that snapshot
[10:44:26] so that means I can simplify the procedure and remove the need for rsync
[10:46:25] neat
[12:30:27] tools-db-7 is up and running, and the Cinder improvement means creating a replica is now WAY faster, big win
[12:30:35] I updated the procedure on wikitech
[12:30:42] \o/
[12:30:55] this was the relevant cinder change if you're curious https://review.opendev.org/c/openstack/cinder/+/835384
[12:31:18] I had already noticed it a few weeks ago, but I have now verified it works as expected even for big volumes
[12:33:08] I'm still trying to understand what exactly RBD is doing under the hood. I think deleting the snapshot triggers a "flatten" operation, which I expected to take longer on big volumes, but it was very fast. It's possible it's async and still happening in the background.
[12:33:53] "rbd snap ls" is no longer showing the snapshot
[12:43:08] very cool re: faster replica creation
[14:17:14] dhinus: just double-checking, there's no reason for the clouddb hosts to go in official cloud racks, right? Looks to me like the existing hosts are just on normal private networks (e.g. https://netbox.wikimedia.org/dcim/devices/2923/interfaces/ )
[14:18:32] andrewbogott: I'm not sure I have a good answer. On one hand I would like them to be official cloud hosts in official cloud racks, but at the moment they're not completely
[14:19:29] Since they have to talk to prod databases it seems like it would be awkward either way.
[14:20:10] topranks, taavi, either of you want to make a pitch for rethinking the network setup for clouddbs? I'm about to ask for them to be racked the same as the old ones otherwise.
[14:20:39] worth noting that we're about to replace _all_ clouddbs with new hosts in the next few months
[14:20:51] so if we want to change something, it's a good time
[14:20:58] yeah. four of the new ones are already racked but we could move 'em while they're still empty.
[14:21:59] marostegui is going to fill the new ones soon T408692
[14:22:00] T408692: Set up replication on new hosts clouddb102[2-5] - https://phabricator.wikimedia.org/T408692
[14:22:56] given that the clouddb hosts are responsible for filtering out some private data with views that can't be filtered during the replication phase, to me it seems clearer to keep them outside the cloud realm. In this scenario the firewall hole only allows access to the clouddbs that serve the redacted view, instead of letting the private data itself through the firewall to be replicated to hosts inside our realm
[14:24:13] that sounds right to me
[14:24:23] yes I agree
[14:24:33] ok, so, status quo. Thanks taavi, good point.
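
For context on the Cinder/RBD behavior discussed above (10:44-12:43), a rough sketch of what the simplified replica flow could look like from the CLI; all volume, snapshot, and pool names here are hypothetical placeholders, not the actual ToolsDB procedure documented on wikitech:

    # Snapshot the source volume and clone a new volume from that snapshot
    openstack volume snapshot create --volume tools-db-data tools-db-data-snap
    openstack volume create --snapshot tools-db-data-snap tools-db-7-data

    # On the Ceph side, the clone initially keeps a "parent:" reference to the snapshot
    rbd info cinder-pool/volume-<clone-uuid> | grep parent

    # With the Cinder change linked at 12:30, the snapshot can be deleted even while
    # clones still depend on it; the clones are reportedly flattened (shared data
    # copied into them) as part of the deletion, possibly asynchronously
    openstack volume snapshot delete tools-db-data-snap

    # Afterwards the parent reference is gone and "rbd snap ls" no longer lists the snapshot
    rbd snap ls cinder-pool/volume-<source-uuid>
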
[14:24:39] another option might be to reconsider the naming and remove the "cloud" prefix, but that's only cosmetic
[14:24:55] and they're half-owned by cloud anyway
[14:25:23] i also don't see any problem with the current name that'd be worth the effort to change
[14:25:56] yeah, the only reason would be to avoid confusion as to why they are called cloud* when they're not connected to the cloud network
[14:26:11] but it doesn't feel strong enough
[14:26:36] since four of them are already named and racked it's a lot easier to leave things as is unless we have strong reasons
[14:26:54] yeah let's just leave everything as is
[14:27:00] thanks andrewbogott for asking anyway
[14:56:43] quick review (tofu no-op): https://gitlab.wikimedia.org/repos/cloud/toolforge/tofu-provisioning/-/merge_requests/101
[14:57:29] +1
[14:57:59] thanks. this makes tofu plan clean again. the follow-up MR #102 requires some tfstate mangling
[15:42:11] I posted the next steps for tools-db in https://phabricator.wikimedia.org/T409287#11354133
[15:43:13] I will continue this work on Monday; if anything happens over the weekend (hopefully not!) you can ping me / page me
[15:58:08] * dhinus calling it a week
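
As an illustration of the kind of tfstate mangling mentioned at 14:57, moving resources to new addresses in the state (so that tofu plan stays a no-op) usually looks something like this; the addresses below are made up for illustration and are not the ones touched by MR #102:

    # Move a resource to its new address in the state without destroying/recreating it
    tofu state mv 'module.old_name.some_resource.example' 'module.new_name.some_resource.example'

    # Verify the plan is clean (no-op) again afterwards
    tofu plan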