[12:22:21] this week we added key-class scoped stats to APCu to troubleshoot some observed performance degradation during extended non-deployment periods (i.e. over the weekend), and there is some interesting behavior from ResourceLoader::filter
[12:22:52] it grows over time to rival the memory consumption of ER keys https://usercontent.irccloud-cdn.com/file/rYfDC9DW/Screenshot%202023-09-02%20at%2014.21.15.png
[12:23:25] we'll likely test next week how it behaves if we lower the current one-day TTL to a smaller value to avoid that
[12:23:54] this growth most likely happens because ResourceLoader::filter is also used to hash per-page RL client JS output (RLPAGEMODULES etc.), which causes very high cardinality
[12:25:01] it could also be the cause of the slow but steady fragmentation climb visible on https://grafana.wikimedia.org/d/yK1IBFaZk/php7-apcu-usage-wip?orgId=1, though I can't say that with certainty as there are no key-class scoped metrics there (yet!)
[20:33:01] mszabo: the per-page blob is meant to have nocache=true
[20:33:04] or rather cache=false
[20:33:26] since it also includes user tokens that literally can't be reproduced
[20:34:41] mszabo: are these stats exclusively from web servers that run immutable php-fpm instances? e.g. k8s, or php-fpm restarts on any kind of deployment?
[20:35:38] we use global keys that are shared across languages, skins, and wikis, so in theory that lowers the cap a lot already; using content hashes, stuff that doesn't vary (e.g. simple file modules) shares the same key.
[20:36:47] some modules will vary by skinStyles, by language code (message blobs), or by wiki (e.g. embedded config data).
[20:37:02] although for message blobs I believe we stitch those in separately, as already pre-minified JSON.
[20:37:58] mszabo: you might be able to pinpoint the module that is causing the churn through the metric used at https://grafana.wikimedia.org/d/000000067/resourceloader-module-builds?orgId=1
[20:38:23] that module will have its version hash change a lot, so it'll be requested more often and thus built more often.
[20:38:44] https://grafana.wikimedia.org/d/000000430/resourceloader-modules-overview?orgId=1&viewPanel=34
[20:38:50] (takes a while to load)
[20:40:18] https://usercontent.irccloud-cdn.com/file/KVLf8aOi/rl-modules-overview-build-rate.png
[20:40:38] https://usercontent.irccloud-cdn.com/file/DFCSXHCd/rl-module-build-rate-ext_wikimediaBadgets.png
[20:40:49] looks like WikimediaBadges regressed at WMF, for example.
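
To make the caching pattern under discussion concrete, here is a minimal PHP sketch, not the actual MediaWiki ResourceLoader::filter implementation, of memoizing minified output in APCu keyed by a content hash: identical content shares one key regardless of wiki, skin, or language, while per-page/per-user output is effectively unique per request and should bypass the cache (cache=false) to avoid the high-cardinality growth described above. The 'rl-filter' key prefix, the one-day default TTL, and the $minify callable are hypothetical stand-ins.

```php
<?php
// Minimal sketch (not the actual ResourceLoader::filter code): memoize
// minified output in APCu, keyed by a hash of the input so that identical
// content shares one cache entry across wikis, skins and languages.
// The 'rl-filter' prefix, one-day TTL default, and $minify are hypothetical.
function filterCached( string $filter, string $content, callable $minify, int $ttl = 86400 ): string {
	// Content-hash key: stable content maps to one slot, capping cardinality.
	$key = 'rl-filter:' . $filter . ':' . hash( 'sha256', $content );

	$cached = apcu_fetch( $key, $hit );
	if ( $hit ) {
		return $cached;
	}

	$minified = $minify( $content );

	// Per-page / per-user blobs (e.g. output embedding user tokens) are
	// effectively unique per request, so caching them inserts a new key
	// every time -- the high-cardinality growth discussed above. Such
	// callers should skip this store (cache=false) or use a short TTL.
	apcu_store( $key, $minified, $ttl );

	return $minified;
}
```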
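
The "key-class scoped stats" mentioned at the start can be approximated with a small script like the sketch below, assuming the class is taken as the key prefix before the first colon; that grouping rule is an assumption for illustration, not the actual tooling used here.

```php
<?php
// Sketch of key-class scoped APCu stats: group live cache entries by key
// prefix (the part before the first colon -- an assumed convention) and
// sum their memory per class, then print the largest classes.
$info = apcu_cache_info(); // full entry listing; can be slow on large caches
$byClass = [];

foreach ( $info['cache_list'] as $entry ) {
	$class = strstr( $entry['info'], ':', true ) ?: $entry['info'];
	$byClass[$class] = ( $byClass[$class] ?? 0 ) + $entry['mem_size'];
}

arsort( $byClass );
foreach ( array_slice( $byClass, 0, 10, true ) as $class => $bytes ) {
	printf( "%-40s %8.1f MiB\n", $class, $bytes / 1048576 );
}
```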