[09:00:18] dcausse: I'll be 2' late
[09:00:22] np!
[10:25:57] lunch
[12:32:10] CirrusSearchJVMGCYoungPoolInsufficient is complaining on elastic1083@psi but it does not look to be struggling...
[13:04:14] o/
[13:07:36] gehel I cancelled our Weds mtg as I'm meeting with ServiceOps, if you want to grab me anytime this wk LMK
[13:07:49] inflatador: sounds good!
[13:08:09] how is it going with service ops?
[13:09:04] Still getting acquainted, but it's been fun so far
[14:45:03] hmm, 1083's available memory does look fairly minimal, young pool went from 1+gb variance down to ~200MB over the last couple hours, young gc timing is also up. but it doesn't seem to have triggered an old GC recently
[14:51:01] yes... I wonder if it's because it's idle? pulled out some gc logs but did not see anything weird
[14:53:07] but we only collect a couple of hours of logs, so I could not see the difference from when the young gc was more active
[14:54:12] hmm, so maybe alerting on the young pool isn't going to work out
[14:59:10] also, those 20k jvm gc log files are amusingly short :)
[14:59:15] yes...
[14:59:17] logs are here https://people.wikimedia.org/~dcausse/production-search-psi-eqiad_jvm_gc.2317464.log
[14:59:30] and I analyzed them with https://gceasy.io/
[14:59:39] but nothing very conclusive
[14:59:58] for the log size, I uploaded a patch to increase it; I think that was a mistake made when moving to java11
[15:02:21] we might pull some of the GC settings from the 7.10 jvm.options file, i'm not sure we updated them. It looks like they are using ConcMarkSweepGC for jvms 8-13 with a couple flags (-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly), in 14+ they use g1
[15:03:21] particularly the occupancy fraction might be expected by the circuit breaker
[15:06:36] I'm all for switching back to upstream defaults
[15:09:51] seems reasonable, i'll align the two files and we'll have to do some cluster restarts
[15:16:33] hmm, supposedly the default value for occupancy fraction is 92%, so pushing down to 75% would certainly ask for more space
[15:21:53] although UseCMSInitiatingOccupancyOnly seems to mean actually start GC when you get to 75%, whereas without it, it uses heuristics to estimate that occupancy would be 92% at the end of the gc cycle (maybe, docs aren't 100% clear)
[15:22:16] err, 75% and 92% should both be the same, they are whatever the occupancy fraction is
[15:26:13] I get that 75% would trigger the gc earlier?
[15:26:40] yes, i think that will try and limit the old pool to 75% of total heap
[15:26:52] instead of right now, where it's allowed to grow up to 92%
[15:27:49] hm ok so perhaps not giving enough time for bursts in mem usage to be absorbed, and the circuit breaker might kick in too early?
[15:28:38] yea that seems like a plausible reading of these settings.
[15:28:49] seems like a good thing to apply if that's what the circuit breakers are expecting
[15:32:19] totally unrelated, i was pondering dropping the poolcounter adjustments for cross-dc traffic. We wouldn't be able to change them if we switch all read traffic to search.discovery.wmnet and stop setting the specific clusters in mw config
[15:32:54] and since we expect cross-dc traffic to also have local traffic, and they have separate poolcounters, seems safer?
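(For reference on the GC settings discussed at 15:02: a rough sketch of the relevant section of the upstream elasticsearch 7.10 jvm.options, reconstructed from memory, so the packaged file may differ slightly; the N-M: prefixes scope each flag to a JVM version range.)

    # GC configuration (upstream 7.10 defaults, approximate)
    8-13:-XX:+UseConcMarkSweepGC
    8-13:-XX:CMSInitiatingOccupancyFraction=75
    8-13:-XX:+UseCMSInitiatingOccupancyOnly

    # JDK 14 and later switch to G1
    14-:-XX:+UseG1GC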
maybe we end up expanding the completion suggester pool a bit
[15:35:26] err, we would still have specific clusters defined for writes in mw-config, but read traffic would go through a new defined cluster (i called it dnsdisc) that uses the etcd/geodns based routing like other services
[15:35:39] so we'd have a single set of settings for 2 different modes? traffic spread and only local mw -> elastic, traffic on one elastic cluster with either local or cross-dc mw -> elastic
[15:36:21] dcausse: yea, doesn't seem ideal but i'm not sure how to reconcile having two independent pool counters
[15:36:30] yes
[15:36:43] for writes I think it has to still be manual in mw-config no?
[15:36:47] yea it does
[15:38:29] how is that going to work with envoy?
[15:38:45] we define three more envoy ports that point at the discovery dns
[15:39:03] (and sigh while tracking ports through 4 separate files to make sure it aligns)
[15:39:16] :)
[15:39:53] the other services such as restbase already use envoy + discovery dns, so it should be well tested by now
[15:47:39] hmm, also translate :S
[15:48:39] sigh, yes true
[15:48:48] this one is a bit different IIRC
[15:48:58] it has a "mirroring" concept
[15:49:11] not sure read & writes can be separated
[15:51:33] hmm, poking at ElasticSearchTTMServer and not seeing the mirroring concept, what is it?
[15:52:18] i'm actually not seeing how it writes to both clusters, so clearly missing something :P
[15:53:00] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/mediawiki-config/+/refs/heads/master/wmf-config/CommonSettings.php#2949
[15:53:12] it's the "mirrors" option
[15:53:28] I think it might be in the jobs, not ElasticSearchTTMServer
[15:54:32] ahh, yes it merges the local services with whatever comes from mirrors during write :S
[15:55:20] could redefine it such that it doesn't merge, maybe need a different name than mirrors though, write-targets or some such
[15:55:27] yes
[15:56:49] hm... but I think the translate code needs to be adapted
[15:56:59] yea, it does
[15:57:05] having multiple ingestion pipelines is meh :P
[15:57:26] :)
[15:57:28] could we unify translate into the streaming updater somehow? Would that be silly, have it simply produce kafka messages?
[15:57:39] "simply"
[15:57:53] i suppose it's totally tangential ...
[15:58:51] I don't know... need to refresh my memory about it does writes
[15:59:21] s/about/about how/
[15:59:47] there are definitely ttm services that are "read-only"
[16:01:01] ah there are "ReadableTTMServer" and "TTMServer"
[16:05:20] I think we need a task for translate; it's not going to be trivial to separate read & writes, I feel that we'll need a kind of hybrid ElasticTTMServer
[16:09:28] yea, i agree. this part of it is a bit more complicated :S
[17:28:50] dinner
[18:29:43] Few mins late to pairing
[21:18:33] an alternate thing we could consider: elastic added jvm.options.d, a directory that will read overrides, in https://github.com/elastic/elasticsearch/pull/51882. Instead of templating jvm options at all we could have a puppet class where each invocation puts a new file in place with specific options that we want configured
[21:19:06] that would allow the main jvm.options to come from the elasticsearch package instead
[21:32:37] not a bad idea, although since we have multiple instances, might have to think on it a bit. I'm guessing most config is identical?
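(To illustrate the jvm.options.d idea from 21:18: a hypothetical per-instance drop-in file as puppet might lay it down. The path, file name, and heap values are made up for illustration and are not what production actually uses.)

    # /etc/elasticsearch/production-search-psi-eqiad/jvm.options.d/10-heap.options
    # per-cluster heap override; the stock jvm.options from the package stays untouched
    -Xms10g
    -Xmx10g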
[21:32:46] jvm config that is
[21:34:49] hmm, yea having multiple instances does make it slightly more tedious, and we actually need different values (for example, heap size) in the different clusters.
[21:35:04] for the most part things are all the same though
[21:35:36] except i guess cloudelastic-chi which got a few special values in the past trying to make it work with the oversized data
[21:36:36] but as a general case, i feel like the foo.d route tends to work better in the puppet model than templates
[21:37:07] keeps the important bits in one place (the definitions) instead of spread out
[21:37:34] yeah, I like that approach overall, makes it easier to find where I broke things ;)
[21:38:03] you can also do stuff like assemble a single conf file from fragments, but that might be too complicated for this
[22:03:59] meh, phpunit's logicalOr doesn't work like i would want ... logicalOr(isType('int'), countOf(3)) to say it should either be an int or an array with 3 elements throws because 3 isn't Countable
[22:04:07] err, because an int isn't Countable
[23:25:49] * ebernhardson wrote 4 different ways of verifying the index isn't live, not super happy with any of them :P Will ponder overnight and try again tomorrow
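(One possible way around the logicalOr/countOf issue from 22:03 is a single callback constraint, so nothing ever tries to count a non-Countable value. This is just a sketch assuming a PHPUnit 9-ish API inside a TestCase; it is not what was actually settled on.)

    // "either an int, or an array with exactly 3 elements"
    $this->assertThat(
        $value,
        $this->callback(
            static fn ( $v ) => is_int( $v ) || ( is_array( $v ) && count( $v ) === 3 )
        )
    );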