[18:28:05] Krinkle: is there a good reason we need to get the pygments version on every pageview?
[18:28:47] is it because it's needed for cache versioning of the CSS RL module?
[19:05:31] ori: Yes - it's not every page view in the sense of the /wiki/ page view request, but in the sense of the (first) load.php startup module cache miss on that server after a php-fpm/apcu restart; which happens every time you stage a commit on mwdebug and browse around with your own page views there (CDN miss).
[19:05:59] and yeah, it's keyed into the "estimate module content" hash that we compute.
[19:06:35] whether we ask for the CSS and hash that, or ask for the version, either way we end up needing the version since the version is part of the CSS cache key as well, but I believe we only hash the version right now.
[20:16:17] it's unfortunate that getting the Pygments version has become so expensive with the move to shellbox
[20:17:50] so right now the extension uses the bundled version by default, unless you provide an explicit path to an alternate version to use, which is what the WMF does in prod
[20:18:02] should we also require the version to be given explicitly?
[20:18:59] unless adding the memcache fallthrough for APC makes this a nonissue
[20:21:35] ori: 1h in apcu, and deduped/shared via memc; should be good yeah
[20:21:42] looking at the effective latency via https://grafana.wikimedia.org/d/lqE4lcGWz/wanobjectcache-key-group?orgId=1&var-kClass=pygmentize_css
[20:21:58] css != version, but boldly assuming similar cheap latency
[20:22:18] wtf, this is amazing
[20:22:28] I didn't know we had per-keygroup metrics/dashboards
[20:22:33] 195ms for a recompute seems a bit absurd
[20:22:40] yw :)
[20:22:54] instrumentation is Aaron's work, dash is mine
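A minimal sketch of the "1h in apcu, and deduped/shared via memc" layering discussed above, not the actual SyntaxHighlight patch: the `fetchPygmentsVersionFromShellbox()` helper, key names, and TTLs are illustrative assumptions, only the MediaWiki cache accessors are real.

```php
<?php
use MediaWiki\MediaWikiServices;

/**
 * Hypothetical helper for the expensive part: a Shellbox round trip that
 * asks the external Pygments install for its version (~200ms per the
 * dashboard above).
 */
function fetchPygmentsVersionFromShellbox(): string {
	// ... Shellbox call elided ...
	return '2.10.0'; // placeholder
}

/**
 * Two-tier lookup: per-server APCu for an hour, with the WAN (memcached)
 * cache underneath, so a single Shellbox call after a php-fpm/APCu restart
 * is shared by the cluster instead of repeated on every server.
 */
function getCachedPygmentsVersion(): string {
	$services = MediaWikiServices::getInstance();
	$srvCache = $services->getLocalServerObjectCache(); // APCu-backed in prod
	$wanCache = $services->getMainWANObjectCache();     // memcached-backed

	return $srvCache->getWithSetCallback(
		$srvCache->makeGlobalKey( 'pygmentize-version' ),
		BagOStuff::TTL_HOUR,
		static function () use ( $wanCache ) {
			return $wanCache->getWithSetCallback(
				$wanCache->makeGlobalKey( 'pygmentize-version' ),
				WANObjectCache::TTL_HOUR,
				static function () {
					return fetchPygmentsVersionFromShellbox();
				}
			);
		}
	);
}
```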
[20:28:04] On a completely unrelated note, I'm switching my blog to wordpress and ended up going for https://github.com/scrivo/highlight.php for syntax highlighting (via https://github.com/westonruter/syntax-highlighting-code-block ).
[20:28:10] back to*
[20:30:09] ObjectCache doesn't add jitter to TTLs by default, right?
[20:31:01] perhaps it should
[20:32:40] we currently sprinkle mt_rand( TTL_DAY, TTL_DAY + TTL_HOUR ) or 2*TTL_HOUR in a few places like that.
[20:32:52] we also have adaptiveTTL(mtime) but that doesn't apply here
[20:33:03] with ~200ms recompute time this could easily lead to stampedes
[20:34:33] agreed, insofar as while it's not frequently called (load.php overall is ~100/sec, of which only a subset is the startup module or pygments.css), there might indeed be some overlap this way, esp. given php-fpm now restarts regularly after every scap sync
[20:34:41] yeah
[20:34:59] but we currently don't use memc, so it wouldn't lead to more than the stampede we have today.
[20:35:16] which is https://grafana.wikimedia.org/d/vcOTDuSnk/syntaxhighlight?orgId=1&refresh=30s
[20:35:19] right
[20:35:27] I didn't -1
[20:35:30] not a defect in your patch
[20:35:38] but might as well add it since you're touching the caching, no?
[20:35:57] * Krinkle realizes there's a chance of +2 nearby
[20:36:00] jitter-by-default sounds like a good policy for object cache, there must be many places
[20:36:13] where stampedes can occur
[20:36:56] I'll file a task
[20:37:46] WANCache does this by default, insofar as it randomly decreases the TTL at get* time with pre-emptive regeneration
[20:38:01] but we don't have something like it for raw BagOStuff use, such as APCu
[20:38:50] we could indeed do something like... if the TTL is more than a minimum threshold, decrease it by a small random amount for a small random fraction of sets.
[20:39:02] back in a bit, picking up Noam from camp
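A rough sketch of the jitter idea floated in the last few messages, as plain PHP rather than any eventual MediaWiki implementation; the function name, thresholds, and the 5%/10% figures are illustrative assumptions.

```php
<?php

/**
 * Jitter-by-default for raw BagOStuff-style set() calls: if the TTL is above
 * some minimum threshold, shave off a small random amount on a small random
 * fraction of sets, so values populated together (e.g. right after a scap /
 * php-fpm restart) don't all expire in the same second and stampede the
 * backend.
 */
function jitterTtl( int $ttl, int $minTtl = 300 ): int {
	if ( $ttl <= $minTtl ) {
		// Too short to matter, or 0 ("store indefinitely")
		return $ttl;
	}
	if ( mt_rand( 1, 10 ) > 1 ) {
		// Leave ~90% of sets untouched
		return $ttl;
	}
	// Subtract up to 5%, so callers never get *more* time than they asked
	// for, unlike the mt_rand( TTL_DAY, TTL_DAY + TTL_HOUR ) pattern above.
	return $ttl - mt_rand( 0, (int)( $ttl * 0.05 ) );
}
```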