[07:13:20] dcausse: coincidental timing, was just about to post the latest stack trace for ya
[07:13:23] https://www.irccloud.com/pastebin/m9jWlqS7/
[07:13:39] lol :)
[07:13:41] thanks!!
[07:15:43] Logs in `./cookbooks_testing/logs/sre/wdqs/data-reload-extended.log` but they're `root:root` so you won't be able to see...unless maybe they get shipped to logstash
[07:17:00] np, will use the patch you already uploaded to make some fixes
[07:19:48] cool. once we've got it all working properly I will need to do a lot of commit message cleanup :D
[07:31:45] indeed! going to be a lot more fixes I'm afraid...
[08:41:32] dcausse: do you need help to access those logs?
[09:21:24] gehel: I have them, thanks!
[10:07:48] lunch
[13:08:59] o/
[13:50:54] dcausse can't make pairing
[14:00:53] inflatador: np
[14:08:46] had to drop off my son...back now
[14:34:33] cindy keeps failing...but it's not failing the same test :P
[14:59:28] will be a couple mins late for the office hours
[15:27:49] stepped out of search platform mtg...working on some helm chart stuff ATM
[15:53:57] ebernhardson: I might need your help to fix the build on https://gitlab.wikimedia.org/repos/search-platform/mjolnir/-/merge_requests/8
[15:55:20] gehel: looking
[15:55:45] ahh, it's the backports thing. I can fix that later
[16:05:40] workout, back in ~40
[16:50:06] back
[18:08:22] ebernhardson: current data-reload failure https://www.irccloud.com/pastebin/uYoGGgWh/
[18:08:50] ryankemper: suggests target_paths is a path when it was supposed to be a list of paths
[18:08:52] I need to look at transferer's implementation but I wonder if it's as simple as needing to pass `[tmpdir]` instead of `tmpdir`
[18:08:58] right
[18:09:38] ryankemper: it says the call site is line 456 in _transfer_dump, probably we need to pass an array as one of the params.
checking which
[18:10:43] So probably line 457 final arg needs to be changed to `[self.query_service_target_parent_dir]`
[18:11:35] probably, looks like that's operations/software/transferpy. checking it out now
[18:12:16] ryankemper: yes, the third and fourth arguments are lists
[18:13:11] Alright, kicking off another run with that change
[18:13:16] the lists have to be the same length, and it's assumed that the indexes match, so host 0 goes with target 0
[18:13:26] kinda odd to accept it that way, python has tuples :P
[18:13:50] but there is probably some reason, i haven't looked deeply
[18:14:45] ebernhardson: Oh actually in my last patchset I made it `[self.query_service_target_parent_dir]` but does it need to be `[self.query_service_target_parent_dir] * len(self.query_service_host.hosts)]`?
[18:15:03] er sorry, `[self.query_service_target_parent_dir * len(self.query_service_host.hosts)]`
[18:15:39] oh wait, i'm probably assuming wrong what * does in that case, it might just concat the same string multiple times
[18:15:48] ryankemper: looks like if you want the same path on each host, then yes. mostly due to: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/transferpy/+/refs/heads/master/transferpy/Transferer.py#379
[18:16:21] that's in the pre-check, they do the same zip later during execution
[18:16:22] final answer is `[self.query_service_target_parent_dir] * len(self.query_service_host.hosts)` based off repl results
[18:16:56] yea, looks reasonable
[18:21:30] ryankemper: oh, actually that's a bit odd
[18:22:00] ryankemper: on 457 it has `[str(self.query_service_host.hosts)]`, that seems to be creating a 1-element array of all the hosts stringified?
[18:22:19] i suspect it's more like `[str(host) for host in self.query_service_host.hosts]` perhaps?
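The list-replication fix discussed above hinges on where the `*` sits relative to the brackets: `[x] * n` repeats the list element, while `[x * n]` repeats the string inside a one-element list. A minimal sketch (host names and the parent dir are made up, not the real cookbook values):

```python
# Illustrative stand-ins for self.query_service_host.hosts and the
# query_service_target_parent_dir attribute; not the actual values.
hosts = ["wdqs1009", "wdqs1010", "wdqs1011"]
parent_dir = "/srv/query_service"

# transferpy expects parallel lists of the same length: target host i
# receives target path i, so the path list is replicated per host.
target_paths = [parent_dir] * len(hosts)
# -> ['/srv/query_service', '/srv/query_service', '/srv/query_service']

# The near-miss from the chat: * inside the brackets repeats the *string*,
# yielding one concatenated path in a 1-element list.
wrong = [parent_dir * len(hosts)]
# -> ['/srv/query_service/srv/query_service/srv/query_service']

assert len(target_paths) == len(hosts)
assert len(wrong) == 1
```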
Not clear exactly
[18:22:39] good catch, that sounds right to me
[18:26:54] I always have an awkward time looking through this, but query_service_host is a `RemoteHosts` instance, .hosts is a list of NodeSet instances. NodeSet comes from cumin, and __str__ is documented as "Get ranges-based pattern of node list."
[18:27:06] so, if this works it's because the NodeSet was instantiated in a particular way
[18:29:42] hmm, so the __str__ documentation you quoted implies it might give us something like `wdqs100[3-5]` (arbitrary example)
[18:30:08] yea, i suppose the main point was that a NodeSet is not strictly 1 host. But it might be in this context
[18:30:35] https://clustershell.readthedocs.io/en/latest/api/NodeSet.html#usage-example oh so if we iterate one level deeper we can get a guaranteed single host
[18:31:19] yea, that seems reasonable
[18:31:30] hmm, trying to figure out if there's a list comprehension way to do this or if i need to use a for loop
[18:32:06] you can stack them: `[str(node) for host in self.query_service_host.hosts for node in host.nodes]`
[18:32:44] of course you can do that!
[18:32:45] https://xkcd.com/353/
[18:37:50] no need for a list comprehension
[18:37:54] >>> a = NodeSet('wdqs100[3-5]')
[18:37:58] >>> list(a)
[18:37:58] ['wdqs1003', 'wdqs1004', 'wdqs1005']
[18:38:10] when you iterate a nodeset it yields strings
[18:38:25] nice
[18:38:40] and here I was about to ask if `[str(node) for node in [nodeset for nodeset in self.query_service_host.hosts]]` was actually what i wanted
[18:39:06] we have a list of nodeset instances, so why don't we still need 1 list comprehension?
[18:39:33] I didn't read the whole scrollback, what's the gist of what you're trying to do?
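The NodeSet behavior being discussed can be sketched without ClusterShell installed. `expand_nodeset` below is a hypothetical stand-in (it only handles a single simple `[a-b]` range, unlike the real NodeSet, which also handles zero-padding and multiple ranges), but it shows the two points from the chat: iterating a ranges-based pattern yields individual hostname strings, and a stacked comprehension flattens a list of nodesets:

```python
import re

# Hypothetical mimic of ClusterShell NodeSet iteration: expand a
# ranges-based pattern like 'wdqs100[3-5]' into individual host names.
def expand_nodeset(pattern):
    m = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if m is None:
        return [pattern]  # plain hostname, no range
    prefix, lo, hi, suffix = m.groups()
    return [f"{prefix}{i}{suffix}" for i in range(int(lo), int(hi) + 1)]

print(expand_nodeset("wdqs100[3-5]"))
# ['wdqs1003', 'wdqs1004', 'wdqs1005']

# Stacked list comprehension, as suggested in the chat; `nodesets` here
# stands in for the list behind self.query_service_host.hosts.
nodesets = [expand_nodeset("wdqs100[3-5]"), expand_nodeset("wdqs2001")]
flat = [str(node) for nodeset in nodesets for node in nodeset]
# ['wdqs1003', 'wdqs1004', 'wdqs1005', 'wdqs2001']
```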
[18:39:36] oh wait, `[nodeset for nodeset in self.query_service_host.hosts]` makes no sense, that's just the list
[18:39:52] we have a RemoteHosts instance, and need to pass a list of hosts to transferpy
[18:39:54] volans: for a RemoteHosts instance we want a flat list of strings, one str per node
[18:40:11] list(my_remote_hosts.hosts)
[18:40:36] much better :)
[18:54:51] ebernhardson: are the CirrusDumps driven by Airflow?
[18:55:19] gehel: yes and no :P There are two different dumps, the hdfs dumps are airflow driven. The other is systemd timers on the snapshot host
[18:55:53] but the implementation is python, right? Or is there still a dependency on mediawiki?
[18:56:01] about the snapshot hosts
[18:56:41] the airflow one is all python, it detects everything out of the elasticsearch api's
[18:56:57] gehel: the one on snapshot hosts is php and bash loops
[18:57:19] and it's actually like 7 timers or something, 1 per mediawiki database group
[18:57:54] ack
[18:58:25] the search abandonment metric we have, is it for a single view of a SERP that doesn't lead anywhere? Or for a search session?
[19:02:06] gehel: session based
[19:02:44] essentially the number of sessions that registered > 0 SERPs and == 0 clicks
[19:04:01] thanks!
[20:02:16] sigh...for some reason running the same container as CI locally gets different results :S
[20:50:58] ryankemper: picking up my son, will be ~10-15m late to pairing
[20:51:51] inflatador: no worries, i'm setting up utilities for the upcoming lease so won't free up until :20 or :25
[21:04:22] hmm, so i think i reproduced the problem. But i'm not sure what i did :P But now the mjolnir jar has no .class files in it, and my error matches CI
[21:11:08] * ebernhardson wonders if something is not working right with the scala bits :(
[21:15:03] back
[21:26:28] inflatador: 3m
[21:26:34] ACK
[21:39:57] ebernhardson: are you doing any cloudelastic reindexing?
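The session-based abandonment metric described above ("sessions that registered > 0 SERPs and == 0 clicks") can be sketched roughly as below. The session/event shape is invented for illustration; the real pipeline's data model is not shown in the chat:

```python
# Hypothetical per-session event counts; not the real event schema.
sessions = {
    "s1": {"serps": 3, "clicks": 1},  # searched and clicked: engaged
    "s2": {"serps": 2, "clicks": 0},  # searched, never clicked: abandoned
    "s3": {"serps": 0, "clicks": 0},  # never reached a SERP: excluded
}

# Only sessions that registered at least one SERP count toward the metric.
searched = [s for s in sessions.values() if s["serps"] > 0]
abandoned = [s for s in searched if s["clicks"] == 0]
rate = len(abandoned) / len(searched)
print(rate)  # 0.5
```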
we need to do some reboots
[21:41:02] s/need to do/are doing/g ;P
[21:41:31] ryankemper: nope, should all be done
[21:41:53] well, not just should, it's certainly done :)
[21:53:40] we're red again on cloudelastic chi, but it should clear up once cloudelastic1005 finishes rebooting
[21:56:58] yep, back to yellow