[07:55:40] Likely not making it to search retro tmrw, unless appt ends early. Will be around after
[08:49:04] ryankemper: ack
[10:29:50] errand+lunch
[14:11:55] o/
[16:10:13] inflatador: all work in codfw rack b7 done, you can un-ban those elastic hosts when you get a moment
[16:10:19] thanks as always for the help :)
[16:12:24] topranks np, thanks for the update!
[16:41:54] inflatador: figured out T358541, the cloudelastic cross-cluster seeds need to be updated with the new host list in cluster state (probably all 3 clusters)
[16:41:54] T358541: 400 - Bad Request on any Global Search - https://phabricator.wikimedia.org/T358541
[16:43:04] ebernhardson oops, will get started on a patch for that one
[16:53:37] does anyone know if it's possible to use the host's role in `if` statements in Puppet? I'm trying to make separate checks for public and internal hosts. Re https://gerrit.wikimedia.org/r/c/operations/puppet/+/1006564
[16:54:05] inflatador: iiuc typically that is done by setting a hiera value one way for internal, and one way for public
[16:54:24] so it's not by host, but by role or profile
[16:56:26] ebernhardson yeah, it just seems weird to put a hiera value that says "I'm a public host" in a hiera file for public hosts
[16:56:48] but I guess it would get messy if you had hosts with multiple roles
[17:00:24] Would it work to lookup the nodes from public.yaml, maybe like this? https://paste.opendev.org/show/823332/
[17:01:09] inflatador: yes, you should be able to look at a value that only exists for public hosts
[17:03:48] do you need the list of hosts, or to check the current host's catalog role?
[17:05:06] wmflib::role::hosts('foo::bar') for the former, $::_role for the latter
[17:06:32] if the former and your host is part of foo::bar you need to add it to the list to make the first puppet run work the same, something like:
[17:06:42] $cumin_masters = (wmflib::role::hosts('cluster::management') << $facts['networking']['fqdn']).sort.unique
[17:08:31] volans I think the latter would work. Does this look OK? https://paste.opendev.org/show/823333/
[17:11:06] that's out of context, do you have a patch? a value in hieradata/role/common/wdqs/public.yaml might be better
[17:11:31] like enable_public_monitor
[17:12:36] the if standalone like this seems a bit brittle, it might get missed if/when renaming the role
[17:14:16] volans yeah, that's what Erik was suggesting as well. I can make a hiera value, I was just hoping there was enough there already to avoid adding another var. NBD either way though
[17:18:11] just out of curiosity though, if I do a lookup for 'profile::query_service::nodes::public' from modules/profile/manifests/query_service/wikidata.pp, will Puppet actually find hieradata/role/common/wdqs/public.yaml?
[17:18:36] inflatador: it kinda goes the other way around
[17:19:34] inflatador: so puppet will read the hiera.yaml, prod version is modules/puppetmaster/files/hiera/production.yaml. That will expand into a bunch of paths to read. It will read all of them into a shared namespace, then lookup can query that namespace
[17:20:11] so lookup doesn't exactly find the files, the files are pre-determined by the hiera.yaml.
[17:27:46] Ah ok, that makes it a little clearer. But how do we get to `hieradata/role/common/wdqs/public.yaml` from the production.yaml? Is "wdqs" considered "site" and "public" considered "role"?
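(Annotation, not part of the channel log.) The lookup behaviour described at 17:19:34 can be sketched roughly in Python: the hierarchy is a fixed list of path templates, each is expanded with the node's variables, and a lookup walks the resulting files in order. Everything concrete here is an assumption made for illustration: the templates in `HIERARCHY` are not copied from production.yaml, the key `profile::query_service::enable_public_monitor` and the host name are hypothetical, and the idea that the role `wdqs::public` is interpolated as the path fragment `wdqs/public` is a guess at how the role file gets selected.

```python
from typing import Any

# Hypothetical hierarchy templates, most specific first (not the real production.yaml).
HIERARCHY = [
    "hosts/{hostname}.yaml",
    "role/{site}/{role}.yaml",
    "role/common/{role}.yaml",
    "common.yaml",
]

# Stand-in for the parsed contents of the hieradata files (hypothetical key and values).
DATA: dict[str, dict[str, Any]] = {
    "role/common/wdqs/public.yaml": {
        "profile::query_service::enable_public_monitor": True,
    },
    "common.yaml": {
        "profile::query_service::enable_public_monitor": False,
    },
}


def expand_paths(scope: dict[str, str]) -> list[str]:
    """Expand every hierarchy template using the node's variables."""
    return [template.format(**scope) for template in HIERARCHY]


def lookup(key: str, scope: dict[str, str], default: Any = None) -> Any:
    """Return the first value found for key, walking the expanded paths in order."""
    for path in expand_paths(scope):
        data = DATA.get(path, {})  # a path with no file contributes nothing
        if key in data:
            return data[key]
    return default


# Assumption: the role wdqs::public is interpolated as "wdqs/public".
public_host = {"hostname": "wdqs-example-1", "site": "eqiad", "role": "wdqs/public"}
internal_host = {"hostname": "wdqs-example-2", "site": "eqiad", "role": "wdqs/internal"}

KEY = "profile::query_service::enable_public_monitor"
print(expand_paths(public_host))
print(lookup(KEY, public_host), lookup(KEY, internal_host))
```

Under these assumptions the manifest never names a file: the role decides which paths exist for a host, and the same key resolves differently on public and internal hosts.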
[17:32:18] o/
[17:34:57] volans patch is up at https://gerrit.wikimedia.org/r/c/operations/puppet/+/1007653 if you wanna take a look
[17:39:03] that didn't work..adding new hiera vars
[18:22:57] OK, https://gerrit.wikimedia.org/r/c/operations/puppet/+/1007653 is ready for review if anyone has time to look
[18:23:49] lunch, back in time for pairing
[19:10:15] back
[19:32:15] ryankemper we're in pairing if you wanna join
[19:32:28] oh, brt
[20:17:31] quick errand, back in ~20
[20:31:02] does anyone think we still need "cloudvirt-wdqs" machines? They're up for refresh
[20:32:08] not sure
[20:34:50] I've never touched them...I vaguely recall they were going to be used for graph splitting(?). Should we ask WMCS/
[20:34:51] ?
[20:47:03] inflatador: T324147, T221631 (ref pointed at https://wikitech.wikimedia.org/w/index.php?title=Wikidata_Query_Service/ScalingStrategy&action=history)
[20:47:04] T324147: Investigate and document cloudvirt-wdqs servers - https://phabricator.wikimedia.org/T324147
[20:47:04] T221631: Dedicated servers on WMCS to test WDQS scalability strategy - https://phabricator.wikimedia.org/T221631
[20:48:53] I doubt it was ever used since we never started evaluating blazegraph alternatives
[20:51:03] and it's not related to the graph splitting work
[20:51:04] dcausse ah, thanks for the links. Sounds like we don't need new ones then
[20:52:30] I don't think so
[21:07:15] Those are the real servers backing some of the VMs in WMCS that we've used in the past for various testing. We haven't used them for a long time, so no objection to removing them.
[21:07:58] And these days, we might have some more leeway to test in the production network, depending on the kind of tests we want to do.
[21:08:19] But the Blazegraph replacement is unlikely to be for next FY
[21:15:59] ACK, sounds hood
[21:16:01] or good
[21:23:54] dr0ptp4kt can you link your wdqs benchmarking task to T358727 and T358533? Just trying to align with some dc-related tickets
[21:23:55] T358727: Reclaim recently-decommed CP host for WDQS (see T352253) - https://phabricator.wikimedia.org/T358727
[21:23:55] T358533: Hardware requests for Search Platform FY2024-2025 - https://phabricator.wikimedia.org/T358533
[21:31:44] inflatador: i always get confused by the parents and subtasks, but i set the cp one T358727 as a subtask of T336443 investigate performance differences. and then i set one of T358727's parents as T358533: Hardware requests for Search Platform FY2024-2025. i've been using the etherpad you created in T336443 as a place to write down observations. i have a local text file where i'm journaling stuff, and have stuff to copy up
[21:31:45] T358727: Reclaim recently-decommed CP host for WDQS (see T352253) - https://phabricator.wikimedia.org/T358727
[21:31:45] T336443: Investigate performance differences between wdqs2022 and older hosts - https://phabricator.wikimedia.org/T336443
[21:31:46] T358533: Hardware requests for Search Platform FY2024-2025 - https://phabricator.wikimedia.org/T358533
[21:32:19] i'd like to do a blog post on this (maybe phame, maybe techblog, dunno) eventually
[21:33:29] dr0ptp4kt thanks, I've been getting overwhelmed myself ;)
[21:36:38] dr0ptp4kt did you get pricing info for the NVME drives? We'd need at least 1.5T of capacity, RAID-0 should be fine if that's too much for a single NVME
[21:36:53] we can always ask DC Ops if not
[21:39:05] i didn't. the pricing tix for the new cp hosts and the old cp hosts may allow one to triangulate, although yeah, i think we'd need them to quote
[21:40:03] OK, could you ping them if you don't mind? Just hit me up or add to the etherpad when you get the quote
[21:41:45] on it. i think we gotta keep the quote on a ticket with an acl on it.
[21:46:02] Y, I think price is sensitive. If you just wanna DM it to me or something that's cool too