[09:43:19] Lunch
[10:05:33] lunch
[12:23:09] Hi, I have a wikidata question.
[12:23:09] Why are there so many duplicate triples for references and values? https://wikitech.wikimedia.org/wiki/User:AKhatun/Wikidata_Basic_Analysis#Duplicates
[12:23:09] I have listed some top duplicates for references (this page is in progress). The same occurs for values. I have not checked duplicates for other things.
[12:23:09] Also, how are the same references used in multiple places linked together? For example, if I add a reference from a news article to multiple objects in wikidata, would those be different or the same reference ids?
[12:23:09] dcausse, joal
[12:24:14] tanny411: this is because of the dump process
[12:24:35] in an actual triple store they would be deduplicated
[12:24:57] the hash in e.g. ref:9a681f9dd95c90224547c404e11295f4f7dcf54e
[12:25:32] is kept in a cache during the dump
[12:26:19] so when the reference needs to be written again, it checks this cache to see whether it was already produced, to avoid writing it again
[12:26:41] that cache has a max capacity and thus duplicates occur
[12:27:37] the hash is not stored anywhere in wikibase, it's generated from its values
[12:28:02] this helps reuse the same ref for multiple entities that point to the same reference or values
[12:28:35] ideally we should dedup those when importing the rdf into hdfs
[12:29:26] https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#Reference_representation
[12:32:15] so when counting the size it would need in a real triple store, you have to dropDuplicates those
[12:33:09] I wish we could do this at import time but I worry that running a dropDuplicates on a 12bil-row dataset might be prohibitive
[12:35:39] Interesting, that helps. So I have been counting a lot of triples so far: total triples, number of triples per subject, etc. I feel like those are not representative of the real triple store since I counted in all the duplicates. Where else should I be expecting duplicates, if anywhere other than ref and value?
[12:36:45] wondering how to do the dropDuplicates for everything
[12:38:04] tanny411: hopefully they're not too far off, given that the total count is not that far from what we see in blazegraph (which drops duplicates)
[12:38:23] but if you focus on refs and values you should definitely drop duplicates
[12:39:07] you might see duplicates in sitelinks but I think they are pretty rare
[12:39:17] Noted
[12:40:37] refs and values are easily identifiable, they have a special context
[12:41:01] Would you prefer I remove the duplicate list from the page, since the actual triple store doesn't really have them? Yes, you had mentioned the context thing before, been using that info :)
[12:42:42] yes we should remove the duplicate list, I don't think it brings anything useful, it's just an unfortunate limitation of how we do things :(
[12:43:12] Yep, okay.
[12:46:27] you can also remove v:* and ref:* from the top subjects
[12:47:08] obtaining the most used references or values should be done through their use as an object, not as a subject
[12:48:37] Yes, removing them. As a subject I was intending to find the references with the most info. But that has to be fixed. Thanks!
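A minimal sketch of the ref/value dedup described above, in PySpark. The table name, column names, and the exact context values that mark reference and value nodes are assumptions, not the real import schema:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("wikidata-rdf-dedup").getOrCreate()

# Placeholder table: assumed to hold one row per (context, subject, predicate, object).
triples = spark.table("discovery.wikidata_rdf")

# Reference and value nodes are identifiable by their special context.
is_ref_or_value = F.col("context").isin("reference", "value")

# Only refs/values need deduplicating; everything else is kept as-is.
deduped = (
    triples.filter(is_ref_or_value)
           .dropDuplicates(["subject", "predicate", "object"])
           .unionByName(triples.filter(~is_ref_or_value))
)

# Counts over `deduped` should land much closer to what an actual triple store
# such as Blazegraph reports.
print(deduped.count())
```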
[12:49:26] yes you'll have to dedup for that I think
[12:50:04] Yes
[13:52:26] errand
[14:52:02] I'm tempted to ban elastic2054 (load is around 60), search thread pool rejections do not seem to be willing to go down
[15:13:46] \o
[15:14:44] o/
[15:15:07] hmm, comp_suggest traffic disappeared in the graphs when switching to codfw
[15:15:15] banned elastic2054 and rejections now seem to recover
[15:15:26] ebernhardson: you need to switch DC
[15:15:40] dcausse: oh, of course!
[15:15:45] i should make better graphs some day...
[15:16:16] yes there are way too many things in there :)
[15:16:30] not sure what went wrong with elastic2054
[15:17:04] i also gotta figure out how to do resilient things in spark ... overnight it downloaded 300G of 10kB thumbnails to hdfs, then crashed :P
[15:17:33] looking
[15:18:21] load average is up on 2054, but not really cpu usage. It's all stuck in IO wait
[15:18:46] as a total guess, a bad set of shards leading to too much load? Banning it seems the easiest course regardless
[15:19:01] work will redistribute
[15:19:11] yes banning did solve the problem
[15:20:57] * ebernhardson gives spark more memory, about the only knob to guess with, and starts it again ... fun!
[15:20:59] :)
[15:21:02] :)
[15:22:02] ebernhardson: I was about to look into T258738 but wanted to make sure you hadn't started anything and didn't want to pick this task up yourself?
[15:22:03] T258738: Build query-clicks dataset from SearchSatisfaction logging - https://phabricator.wikimedia.org/T258738
[15:23:02] dcausse: hmm, nothing concrete in terms of progress there, just the breakdowns in the ticket about what we think we're trying to do
[15:25:31] there are only two things I'm not sure about: the is_bot flag, which is going to be somewhat inflexible (what threshold to take?)
[15:26:14] dcausse: hmm, for the bot we can make a multi-level flag but i think what legal was worried about was us keeping per-ip address statistics
[15:26:35] and the ui entry point, which I'm not sure we can really determine with precision beyond API vs web
[15:26:48] dcausse: so, breaking IPs into 5 classes of volume for bot detection is ok, but saying this user came from an ip with an average of 7.624113 queries per day is problematic
[15:27:06] ok I see, makes sense
[15:30:45] * ebernhardson realizes what the spark docs are missing is "these are errors that spark has and what to change in configuration when they happen"
[15:31:10] almost end-to-end the spark docs assume everything "just works"
[15:42:22] search seems stable, I'm going to unban elastic2054 to see if it was due to a bad combination of busy shards
[15:43:30] sounds good
[16:13:52] * ebernhardson goes back to wondering how elastic2054 got overloaded with more_like traffic not even going to the same datacenter...
[16:14:16] huh?
[16:14:29] eqiad seems still pretty active
[16:14:51] search thread pool is filling up again
[16:15:58] eqiad should have all of more_like
[16:16:09] which is currently way under the typical cache hit rate
[16:16:48] i guess it's not amazing though, currently doing ~380qps of more_like in eqiad
[16:17:02] sure, sorry, misread what you said, thought you said that elastic2054 did receive more_like traffic
[16:17:25] ahh, no, i mean that a significant part of our traffic wasn't even going to codfw, yet we still overloaded
[16:17:44] now it's 2045 being overloaded :/
[16:17:55] hmm
[16:20:04] i/o is pretty heavy, reading ~700MB/s. That's not usually the kind of thing it catches up from
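The node bans discussed here boil down to Elasticsearch shard-allocation filtering. A rough sketch of what an `_ip` ban looks like at the API level, assuming direct HTTP access to the cluster; the endpoint and IP are placeholders, not production values:

```python
import requests

ES = "https://search.example.org:9243"  # placeholder endpoint

def ban_by_ip(ips):
    """Exclude nodes by IP so their shards relocate and they stop taking search load."""
    body = {"transient": {"cluster.routing.allocation.exclude._ip": ",".join(ips)}}
    resp = requests.put(f"{ES}/_cluster/settings", json=body, timeout=30)
    resp.raise_for_status()
    return resp.json()

# Ban one struggling node; an empty list writes an empty exclude list, i.e. lifts the ban.
ban_by_ip(["10.64.48.54"])  # placeholder IP
```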
[16:22:00] i think we just have to keep banning nodes for now until it decides a desitribution of shards that works
[16:22:12] wow, thats amazing spelling :P
[16:23:59] new word accepted :)
[16:29:40] hmm, any reason 2043 is banned?
[16:30:26] * dcausse looks at his bash history
[16:30:39] * ebernhardson is a little sad elasticsearch's adaptive work distribution doesn't work all that amazingly. Optimistically it would stop sending search requests to a node that is struggling and needs to be banned
[16:30:55] or maybe we never set it up right, but i thought it was all magic. Should probably check
[16:32:12] dcausse: the 2043 ban doesn't look new, it didn't pick up load with the rest of the cluster
[16:32:22] I used the _ip ban
[16:32:40] I should have checked existing bans tho
[16:38:12] elastic2043 still has relocating shards, weird...
[16:40:11] https://sal.toolforge.org/log/UTqtQnkBa_6PSCT9fTbK
[16:40:25] ebernhardson / dcausse: `elastic2043` was out of commission for a while, but looks like it got fixed as of june 21: https://phabricator.wikimedia.org/T281327#7166840
[16:40:27] "`elastic2043` is ssh unreachable. Power cycling it to bring it briefly back online"
[16:40:29] yes
[16:40:40] we could just unban I guess
[16:41:03] power cycle then unban when it's back up and healthy sounds good to me
[16:41:10] a little worrisome it was already ssh unresponsive though
[16:41:27] is it all updated? Usually we re-image a node when re-adding it to the cluster
[16:41:27] unbanning will also desitribute :P
[16:41:32] lol
[16:42:32] ebernhardson: yeah, wouldn't hurt to re-image, the failure was non-disk-related, but to your point we generally reimage everything we add back in for peace of mind
[16:43:31] Let's keep `elastic2054` and `elastic2043` both banned; I'll re-image 2043 and add it back when that's done
[16:43:47] ryankemper: i suppose the only particularly bad thing i can think of is wrong plugin versions, in general should be fine. But easiest if everything starts from the same place, keep the variables smaller :)
[16:44:07] sounds good
[16:44:33] I'll handle the banning
[16:45:15] err, it should be 2043 and 2045 as banned, 2054 was banned earlier but should be unbanned now
[16:46:13] search thread pool dropping, looks like 35 data nodes is not a good number
[16:48:14] this is worrying :S A few years ago we could flip all the traffic, including more_like, without issues. I guess we have only barely expanded the cluster in that time; maybe we need more aggressive overprovisioning to handle these one-time events gracefully. I guess the new wd+commons cluster does just that
[16:48:41] Yes, we're pulling forward some budget (implicitly) for the new cluster so we should be overprovisioned once those get in service
[16:50:04] Okay, I briefly had `2054` and `2043` banned on 2/3 of the clusters, but now it's `2043` and `2045` on all 3 clusters
[16:50:45] sounds good. Should be safe to unban 2045 once its load drops sufficiently. Probably 30-ish minutes
[17:29:27] Need to hop into an interview for the next hour but will check the load after (or if we're really hurting for capacity feel free to do it now; we want to leave 2043 banned and nothing else, assuming no other nodes are struggling with a heavy shard)
[17:58:11] dinner
[19:17:29] while eating lunch i was thinking, since we are seeing massive IO, we should check what's bitten us several times before: aggressive readahead and wasted pages in the file cache. 2054 is currently struggling with 250MB/s IO; running my script to turn off readahead dropped IO to 5 MB/s
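On the "I should have checked existing bans" point above: the current exclusions can be read back from the cluster before adding new ones. A small sketch, again assuming direct HTTP access and a placeholder endpoint:

```python
import requests

ES = "https://search.example.org:9243"  # placeholder endpoint

def current_bans():
    """Return any allocation-exclude filters already set, transient or persistent."""
    resp = requests.get(f"{ES}/_cluster/settings",
                        params={"flat_settings": "true"}, timeout=30)
    resp.raise_for_status()
    settings = resp.json()
    return {
        key: value
        for scope in ("transient", "persistent")
        for key, value in settings.get(scope, {}).items()
        if key.startswith("cluster.routing.allocation.exclude")
    }

print(current_bans())
```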
[19:17:46] so, maybe instead of futzing with readahead so much, we just turn the thing off
[19:19:20] ryankemper: ^^
[19:19:54] ebernhardson: I like the approach
[19:19:57] This is basically T169498, and then T264053. Still not fully resolved :S
[19:19:58] T264053: Unsustainable increases in Elasticsearch cluster disk IO - https://phabricator.wikimedia.org/T264053
[19:19:58] T169498: Investigate load spikes on the elasticsearch cluster in eqiad - https://phabricator.wikimedia.org/T169498
[19:28:55] ebernhardson: how do we properly disable readahead? if we just set `read_ahead_kb` to 0 here: https://gerrit.wikimedia.org/r/c/operations/puppet/+/632319/3/modules/profile/manifests/elasticsearch/cirrus.pp does that do it?
[19:29:15] ryankemper: i have no clue :)
[19:29:23] :P
[19:29:26] ryankemper: i suspect our existing readaheads aren't all being applied properly
[19:30:30] i never remember which applies, but sd{a,b} are reporting 256 readahead, and md2, which is the soft raid, reports 4096 (looking up what units those are...)
[19:31:19] should be 512-byte sectors, so the readahead is either 2MB or 128KB, but i thought we set it to 32 or some such?
[19:31:32] oh, on elastic2054. I didn't check others
[19:34:06] according to my local puppet we tried to set it to 16kb; without looking too deep, the $storage_device is just a value we provide in hiera and it sometimes changes when we buy a new batch of servers
[19:45:37] i suppose for clarity, the way my program shuts off the readahead isn't really proper. It uses some special sauce to issue syscalls to the kernel from inside the elasticsearch process. For various java-ish reasons elasticsearch isn't able to do the same syscalls
[21:36:35] ^ gotcha, thanks for the context
[21:37:16] finished up today's last interview, gonna grab a late lunch and then work on the readahead stuff. first up is to figure out if the current readahead stuff we're doing is actually working
[21:37:31] sounds good, thanks!
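A quick way to do the "is the current readahead actually applied" check: sysfs reports readahead in KiB, while `blockdev --getra` reports 512-byte sectors, which is where the 256-vs-128KB and 4096-vs-2MB conversions above come from. This is only a sketch of the check, not the proper fix:

```python
from pathlib import Path

# Print the readahead in effect for each block device (e.g. sda, sdb, md2).
# sysfs reports KiB; multiply by 2 to get 512-byte sectors as blockdev shows them.
for ra_file in sorted(Path("/sys/block").glob("*/queue/read_ahead_kb")):
    device = ra_file.parent.parent.name
    kib = int(ra_file.read_text().strip())
    print(f"{device}: read_ahead_kb={kib} (= {kib * 2} sectors)")

# Writing 0 to these files (or `blockdev --setra 0 /dev/...`) is the usual knob for
# turning readahead off, but whether that alone is sufficient here, versus the
# in-process syscall approach mentioned above, is exactly the open question.
```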