[14:59:49] just got back from the eye doctor and my eyes are still dilated. going to take a break to look at the wall or something. might not be back in time for games
[15:03:47] \o
[16:06:18] azul
[16:10:38] azul looks interesting! but it probably also requires more than a 3 minute video to understand.
[16:10:50] lol, yea probably :)
[16:12:05] BTW, thanks for setting up 7 wonders, zpapierski! It has great reviews, so I'm sure it's a great game, but I'm definitely a "read all the rules first" kind of guy.
[16:15:15] huh, was looking for deprecation warnings from the elastic cluster since i turned all the logging back up. Instead found: java.nio.file.FileSystemException: /srv/elasticsearch/production-search-psi-eqiad/nodes/0/_state/global-377.st.tmp: Read-only file system
[16:15:41] only for ~30s, then it fixed itself...wonder what that was, though
[16:16:33] ryankemper: elastic1039 is on its way to being wedged, logging in throws a bunch of Input/output errors. Going to ban
[16:19:44] hmm, 1039 is already in the ban list for :9200 and :9600....
[16:22:19] actually, :9600 banned elastic1039-...-codfw, fixing
[16:27:43] ryankemper: not clear if https://phabricator.wikimedia.org/T286497 should be reopened, leave it up to you. dmesg looks like the disk has failed again; since this looks like a replacement disk, maybe it's the controller? dunno
[17:26:10] ebernhardson: taking a look. FWIW, when j.clark stuck the replacement disk in he said he didn't see a light he'd expected, so there might still be issues; probability is pretty high that it really has failed
[17:26:27] oh, and thanks for catching that 9600 error...must have been a bad copy-replace
[17:27:52] tbh that machine is already just about EOL. Best course of action might be to delete :)
[17:29:25] ebernhardson: I think so too, it was going to be decommed very soon anyway
[17:30:56] ryankemper: yea, someone mentioned it on the ticket already. Since it's failing again, that seems appropriate
[17:37:26] ebernhardson: re: random seed, yeah I saw user_random, neat! my current use case would be sampling, though.
[17:38:34] tgr: sampling how? Mostly i guess i'm wondering because the random sort takes two parameters, one for the document and one for the user, which seeds the per-document generator. I'm a bit worried the _seq_no that we use by default won't match various use cases
[17:39:25] Basically, with _seq_no as part of the random seed the result set is constantly changing, but slowly enough that it shouldn't be too noticeable. Unless you're doing something like an article of the day that should stay constant :)
[17:39:40] we should be able to seed it with something more constant like the page_id, but there are perf implications we have to look into
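For reference, a minimal sketch of the seeded random sort being described: Elasticsearch's random_score function takes a request-level seed plus a per-document field (the _seq_no default drifts as documents are re-indexed; page_id would stay constant). The host, index, and query below are illustrative placeholders, not CirrusSearch's actual request.

```python
import requests

# Sketch only: a function_score query with a seeded random sort, assuming
# a plain HTTP client and a hypothetical enwiki_content index.
query = {
    "size": 100,
    "query": {
        "function_score": {
            "query": {"match": {"text": "example search"}},
            # seed: the per-request/user value; field: the per-document
            # input that feeds the per-document generator.
            "random_score": {"seed": 12345, "field": "_seq_no"},
        }
    },
}
resp = requests.post("http://localhost:9200/enwiki_content/_search",
                     json=query, timeout=30)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_id"], hit["_score"])
```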
[17:40:18] I want to find out what percentage of a search result set has some property, but it's larger than 10K so I can't iterate through it.
[17:40:45] So I want to iterate through a random subset instead.
[17:40:50] tgr: hmm, you can issue the same query with elasticsearch scroll to cloudelastic in wmf cloud if it's only about getting some one-time stats
[17:40:56] This is a maintenance script, so user_random wouldn't work.
[17:41:12] if it's for a user feature though, i would worry that regularly serving 10k result sets is very expensive
[17:41:22] hmm, ok
[17:41:32] I want to do it regularly and statsd the results.
[17:41:43] tgr: could elasticsearch aggregations give you the overall results?
[17:42:00] they aren't easily exposed, but for a maint script it wouldn't be too hard to wire up
[17:42:26] I'm not sure what that is, but the data I'm checking is not in Elasticsearch, so probably no.
[17:42:36] ahh, indeed, then no
[17:43:22] The context is that for Add Links we have a search index for the tasks and a database table, and there are consistency issues, so I want a "what percentage of search index entries have database entries" check.
[17:44:15] Random seeds did come up a couple times in the past as something that would be nice to have, though, so I think there would be multiple uses.
[17:47:28] tgr: hmm, well, that fits how we would typically use SearchEngine::setFeatureData, I suppose; it's for passing engine-specific flags
[17:47:43] for my current use case, _seq_no should be fine if I understand the docs correctly. Iterating through a large result set would be slightly erratic, but as long as it only affects a tiny fraction of the result, it shouldn't matter.
[17:48:01] for this particular use case that should be easy enough, some feature data that provides the seed. Making that exposable via the api or some such, I'm less sure how
[17:49:02] i suppose the current random sort could check for a seed and apply it
[17:50:08] I wouldn't need the API, but in general, I imagine the way to go would be to provide a features parameter which takes JSON-ified values
[18:07:11] anyway, thanks for the advice! I'll see how far I get.
[18:11:45] gl! I was poking in cirrus and it's not exactly clear how we would get the data from SearchEngine to the actual place we build sorts...but maybe david will think of something :)
[18:12:57] the easy hacky way is to drop it into the SearchContext object; it's basically a dumping ground of random context information, but that's also why we try not to expand it :)
[19:09:25] hm
[19:09:27] https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/92ad47f7c93a4a960d5ac8a9bb6441440ca7712b/includes/Search/SearchRequestBuilder.php#L132-L136
[19:09:36] is that because of performance concerns?
[19:12:50] or just because it wouldn't behave as expected, which does not apply when specifying a seed
[19:12:56] tgr: in terms of its relation to random search, not really. Mostly because an offset there is nonsensical
[19:13:16] it implies the user was expecting something other than what can be offered
[19:13:46] there is some perf cost, but it's the same for anything using offsets
[20:02:18] tgr: i was just thinking, for your use case there shouldn't be any pagination. You can request 10k in a single request or paginate up to 10k, might as well ask for all of it at once
[20:03:24] wouldn't it be likely to run into a memory or DB query size limit somewhere?
[20:04:17] tgr: in elastic the limit is always 10k regardless; the cost is approximately the same to get 10k in one request, or to have one request skip the first 9900 and return the last 100
[20:04:31] e.g. ISearchResultSet::extractTitles adds all results to a LinkBatch
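A hedged sketch of the "ask for all of it at once" approach for the consistency check described above: one request for the full 10k window (Elasticsearch's default index.max_result_window cap), then count how many hits have a database row. has_db_entry() is a hypothetical placeholder for the real lookup, and the host/index names are again illustrative.

```python
import requests

def has_db_entry(page_id: int) -> bool:
    # Placeholder: the real script would query the Add Links tracking table.
    return True

# One search for the whole 10k window; per the discussion above, elastic's
# cost is roughly the same as paginating up to 10k with offsets.
query = {
    "size": 10000,
    "_source": False,  # only the document ids are needed for the check
    "query": {"match": {"text": "example search"}},
}
resp = requests.post("http://localhost:9200/enwiki_content/_search",
                     json=query, timeout=120)
resp.raise_for_status()
hits = resp.json()["hits"]["hits"]
matched = sum(1 for h in hits if has_db_entry(int(h["_id"])))
print(f"{matched}/{len(hits)} sampled index entries have database entries")
```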
[20:05:23] tgr: this should run the same: https://en.wikipedia.org/wiki/Special:Search?search=a&fulltext=1&limit=10000
[20:05:42] (it takes a while, that's about the most expensive a query can get without going into regex and such :P)
[20:06:09] i'm not entirely sure about linkbatch exactly
[20:07:57] hm, using a batch size of 10000 seems to create no problems in production
[20:08:37] it does seem that most of the cost there is on the mediawiki side; elastic runtime is a fraction of the runtime for the large response set, but not sure if that's db stuff or just all the rendering
[20:25:57] :q
[20:27:57] https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CirrusSearch/+/712994 is the patch I came up with
[20:28:17] there is probably not much point to it if no pagination is preferred, though
[20:30:41] tgr: I guess you can paginate if it causes problems elsewhere, but elastic shouldn't care either way. Doesn't seem to be any harm in landing it
[21:47:38] ebernhardson: still around? Quick question.. I *thought* `resubmit` would get jenkins to try the gate-and-submit job again, but it isn't working. What have I forgotten?
[21:48:03] Trey314159: hmm, usually i remove all the +2's and then re-+2 it. I think there is something with resubmit but i dunno how it works
[21:48:19] I'll try that.. thanks!
[21:52:36] "recheck" will re-run the pre-submit tests, but I think that the remove/add +2 dance is the only way to retry a merge.
[23:09:28] I think "check php" will run some of the gate-and-submit checks, although not necessarily all.
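For completeness, a sketch of the scroll approach suggested earlier for one-time stats, which walks a result set past the 10k window without using offsets. The host and index are illustrative; a cloudelastic endpoint would differ.

```python
import requests

HOST = "http://localhost:9200"

# Open a scroll context and fetch the first batch.
resp = requests.post(f"{HOST}/enwiki_content/_search?scroll=1m",
                     json={"size": 1000,
                           "query": {"match": {"text": "example search"}}},
                     timeout=60)
resp.raise_for_status()
page = resp.json()

total = 0
while page["hits"]["hits"]:
    total += len(page["hits"]["hits"])
    # Each scroll call returns the next batch and refreshes the context.
    resp = requests.post(f"{HOST}/_search/scroll",
                         json={"scroll": "1m",
                               "scroll_id": page["_scroll_id"]},
                         timeout=60)
    resp.raise_for_status()
    page = resp.json()

# Release the scroll context instead of waiting for it to expire.
requests.delete(f"{HOST}/_search/scroll",
                json={"scroll_id": page["_scroll_id"]}, timeout=60)
print(f"walked {total} results")
```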