[00:04:37] <Krinkle>	 TimStarling: AaronSchulz: I see the puppet change for site->region landed, I'm trying to recall what the issue may've been. The best I could come up with is that unlike /dc/mw, /dc/mw-wan broadcasts from each, and so where /eqiad/mw goes to eqiad, /eqiad/mw-wan might try to read from both or something. It's not obvious to me at glance what `/*/mw-wan` will do now that its default policy expands to two DCs instead of two aliases for the 
[00:04:38] <Krinkle>	 same thing for things that I guess don't need to broadcast.
[00:04:56] <Krinkle>	 noting that we use `/*` in wmf-config for that route, not `$dc`
[00:06:23] <Krinkle>	 reviewing WANCache.php, I see broadcastRoute is only used for delete() and for set/setMulti. The puppet code had comments about add/cas/touch.
[00:07:48] <Krinkle>	 so I guess it's not really used. Maybe we could/should simplify it so that we don't code in puppet what commands are used, but just do the same sync/async generally, less business logic there, and no unused logic that we never tested/use until we do.
[00:10:24] <TimStarling>	 always nice to simplify things
[00:46:44] <Krinkle>	 TimStarling: a few Qs for later / before I log off; what does WR in WRStats stand for? Do you foresee using this for pingLimiter as well? (if so, we're one step closer to considering LocalClusterCache as internal / not needing to be exposed, given we'd only use it indirectly if at all, e.g. either WAN or WRStats) What is your thinking regarding default behaviour? Afaik ping limiter is basically unlimited on stock installs without memc 
[00:46:44] <Krinkle>	 (given cache_none), I see you currently have CACHE_ANYTHING set, which I assume means CACHE_DB by default; I suppose it's worth thinking about exaclty how slow that would be for stock installs, local dev and CI if left as-is. (ref T186673, T162077, T295660)
[00:46:44] <stashbot>	 T295660: Browser cache never used for logged in users when $wgMainCacheType is CACHE_NONE - https://phabricator.wikimedia.org/T295660
[00:46:45] <stashbot>	 T162077: User::pingLimiter always fails to limit when $wgMainCacheType is CACHE_NONE - https://phabricator.wikimedia.org/T162077
[00:46:45] <stashbot>	 T186673: Deprecate and remove CACHE_ANYTHING; cache on by default - https://phabricator.wikimedia.org/T186673
[00:49:19] <TimStarling>	 1. It's just a name, can be whatever. Etymology is: we have statsd/metrics but they are write-only, these are read/write statistics. But let's invert the order of the acronym to indicate write optimisation and also make it less likely to conflict with other things.
[00:50:36] <TimStarling>	 2. Yes that's why I linked to the ThrottleStore bug which is about a backend for pingLimiter. But I am splitting the patch because there are a few details to do with rate limiting that are not worked out
[00:54:11] <TimStarling>	 3. I think the existing behaviour is $wgMainCache as defined by the installer, there's no $wgObjectCache key for that, which is inconvenient. CACHE_DB is probably not too bad assuming it implements getMulti() efficiently.
[00:54:50] <TimStarling>	 I guess there may be a problem with purge performance
[00:55:10] <TimStarling>	 s/MainCache/MainCacheType
[00:58:02] <Krinkle>	 ack, I'm less worried about the batch reads, I expect those to be not in the critical path / to optimise that once there's a need for it. and purges are post-send for sqlbag. It's more whether write_background setMulti for sqlbag would make sense. It's certainly possible. We do it in KeyValueDepStore, but it's done by the caller by deferring the setMulti call itself. I suppose one optoin could be for WRStats to take an optional 
[00:58:02] <Krinkle>	 `asyncHandler` option that we could use to defer even the setMulti call itself which be a fairly small optim for mcrouter, but still something presumably, and more notable for SqlBag. Just an idea to consider, haven't thought it through. 
[00:59:20] <TimStarling>	 yeah, AbuseFilter with its hundreds of metrics to be written after save is probably too slow for SQL
[01:01:05] <Krinkle>	 shaving off ~7ms for 100 incrs with mcrouter be a not-so-micro win for prod as well I suppose (extrapolating down from the 1000 in 71ms you reported)
[01:02:31] <Krinkle>	 ah wait, those benchmarks were without WRITE_BACKGROUND support for incr(), right?
[01:03:03] <Krinkle>	 I imagine a 1000 localhost messages would go faster than that.
[01:03:24] <TimStarling>	 the benchmarks were with incr(), the WRITE_BACKROUND thing is for incrWithInit()
[01:03:56] <TimStarling>	 1000 localhost messages do indeed take 60-70ms in production
[01:04:09] <Krinkle>	 right incr() already uses AsyncClient
[01:04:25] <Krinkle>	 okay, wlel, taht seems worth deferring then I suppose.
[01:05:11] <Krinkle>	 Or we can document a recommendation for callers to be in a deferred update, that might be more efficient 
[01:05:12] <TimStarling>	 not sure where the bottleneck was
[01:05:41] <TimStarling>	 note that my WRITE_BACKGROUND patch uses the async option in libmemcached, whereas my benchmark did not do that
[01:06:23] <TimStarling>	 so maybe that will speed things up
[01:07:06] <TimStarling>	 but it may just be MW CPU usage, if so I can optimise that
[01:07:48] <Krinkle>	 O
[01:08:13] <Krinkle>	 I'm not following the "async option in libmemc" - I see incr() uses acquireAsyncClient, and your patch makes doIncrWithINitAsync use that as well.
[01:08:56] <TimStarling>	 incr uses acquireSyncClient
[01:09:11] <TimStarling>	 maybe you missed the relevant one-letter difference
[01:09:21] <Krinkle>	 indeed
[01:10:03] <Krinkle>	 right, it returns the result which is probably the new value; I thought that was a contract we only offered/needed for incrWithInit
[01:10:54] <Krinkle>	 okay, so it's not the async client; that's what I meant with the benefit of WRITE_BACKGROUND not being captured by the benchmark
[01:11:27] <Krinkle>	 so I guess switching from the synchronous incr() to the async incrWithInit would likely speed it up, even if it's two commands instead of one.
[01:22:52] <TimStarling>	 regarding your `asyncHandler` idea, in AbuseFilter, stats update is already done as a deferred update
[01:23:35] <TimStarling>	 so with the current code I am assuming the cost will be ~170ms post-send
[01:23:56] <TimStarling>	 bearing in mind that it is already doing this with redis and we don't know what the cost of that is
[01:24:55] <TimStarling>	 I didn't think WRStats needed awareness of deferral since the caller is already calling it in a deferred context, and the store can do its own deferral too
[01:28:08] <Krinkle>	 ack
[04:31:25] <ori>	 what's the WR in WRStats?
[06:05:32] <TimStarling>	 ori: scroll back to the line I started with 1.
[13:50:24] <mszabo>	 Hi, I made a small optimization that should reduce the upfront cost of requests that abort early (e.g. some forms of redirects) - https://gerrit.wikimedia.org/r/c/mediawiki/core/+/808230