[00:51:35] TimStarling: Well, the gate now passes on 8.1, so I could just throw the switch for CI for all skins and extensions, but it'll be a little bumpy…
[00:58:25] Fancy
[00:59:27] I wonder how many non-gate, but WMF-deployed and/or tarball-bundled things will break...
[00:59:36] A few, probably.
[01:00:15] It'd be nice for times like this (oh wait, legoktm has a script, right?) to kick off tests to see
[01:00:39] I have a script too. But Kunal's goes across loads of repos, not just manual ones.
[01:02:55] would be nice to do a bit of wider checking first...
[01:05:33] I've checked a few repos, it seems OK.
[01:05:43] Also I'm running/developing on 8.1 locally.
[01:32:41] Yeah. I’ve had my dev wiki on 8 and then 8.1 for a while
[02:12:31] James_F: ok, I see you merged that, so how about I announce that change on wikitech-l along with some comments about production migration?
[02:15:21] TimStarling: Sounds good!
[07:55:52] Krinkle: I think I saw a ticket about some perf/resource usage issues with flamegraphs, we have observed some similar problems and I have a plan to investigate using https://github.com/jonhoo/inferno instead of the Perl scripts here
[08:31:46] mszabo: your patch to add core classmap... it only removes code. there's no new code. that's... unbelievable.
[08:32:10] Krinkle: yeah, turns out we had all the requisite bits in place :)
[08:32:22] it makes sense now that I think about it, given we have a ton of classes that can fall under multiple directories so it has to know before it can decide what to leave out.
[08:32:30] The impact does not seem dramatic but from what Amir posted it seems measurable, at least
[08:32:42] Yes, and even in the happiest path case, we need to call file_exists() once
[08:32:56] And since the stat cache for file functions in PHP is request-scoped, that's a syscall
[08:34:14] mszabo: wait you mean after your change, core classes still trigger stat in AL::find?
[08:34:26] oh no, I meant before the change
[08:34:50] right yeah, it has to since it's ambiguous. although I was actually talking about the generator, not the loader
[08:35:00] Right
[08:35:02] it surprised me at first that to generate more detailed entries you're only removing code from the generator
[08:35:29] but then I realized, given we register the same prefix multiple times, we have to know first what we really have in each file before we can shake out the things PSR-4 could handle at runtime.
[08:35:59] it's really obvious in retrospect that by removing that information for ambiguity reasons, that therefore inherently means runtime has to rediscover the same ambiguity with notable I/O overhead
[08:36:23] after all, if it wasn't ambiguous, the generator could just skip anything that matches a prefix when it builds the list in the first place, not filter it out, and then runtime can just blindly require.
[08:36:28] anyway, nice work :)
[08:38:36] mszabo: regarding extensions adopting psr4 en masse due to autoloader cost, afaik that was me talking about ExtensionRegistry startup cost in reading large arrays from apcu being slow due to recursively copying the array into the request from shmem.
[08:38:55] https://phabricator.wikimedia.org/T187154 etc
[08:39:21] Cool yea, then it was APCu I was thinking of indeed
[08:40:32] If we get around to generalising some of our "big-scale opt-in" build scripts like GitInfo, dblists, wmf-config, L10ncache, we might at some point be able to justify, without noticeable complexity, also doing this for extension.json
[08:40:50] e.g. if it's not a separate maint script for everything that we have to manage, but just some kind of extendable mechanism
[08:41:02] then we could e.g. add to that mechanism a .php array with the extension.json dataset already pre-processed
[08:41:12] and thus not use apcu
[08:41:42] we could then also e.g. add a classmap to that.
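The generator/loader trade-off discussed above can be illustrated with a rough sketch. It is written in Python rather than PHP purely for illustration, and every name in it (`psr4_find`, `generate_classmap`, `classmap_find`) is hypothetical, not MediaWiki's actual autoloader API. It also assumes class names map directly onto file paths, whereas a real generator would parse each file for its declarations. The point it tries to show: when one prefix is registered for several directories, a PSR-4 lookup has to probe the filesystem at runtime, while a full classmap built ahead of time needs no probe at all.

```python
import os


def psr4_find(class_name, prefixes, exists=os.path.isfile):
    """Resolve a class PSR-4 style, probing each directory registered for its prefix.

    Returns (path or None, number of filesystem probes performed).
    """
    probes = 0
    for prefix, dirs in prefixes.items():
        if not class_name.startswith(prefix):
            continue
        relative = class_name[len(prefix):].replace("\\", "/") + ".php"
        for directory in dirs:
            candidate = os.path.join(directory, relative)
            probes += 1  # each probe stands in for a file_exists() call
            if exists(candidate):
                return candidate, probes
    return None, probes


def generate_classmap(prefixes):
    """Build-time step: walk every registered directory once and record class -> file.

    Simplification (assumption): the class name is derived from the file path.
    """
    classmap = {}
    for prefix, dirs in prefixes.items():
        for directory in dirs:
            for root, _subdirs, files in os.walk(directory):
                for name in files:
                    if not name.endswith(".php"):
                        continue
                    path = os.path.join(root, name)
                    rel = os.path.relpath(path, directory)[: -len(".php")]
                    cls = prefix + rel.replace(os.sep, "\\")
                    # First directory registered for a prefix wins on conflict.
                    classmap.setdefault(cls, path)
    return classmap


def classmap_find(class_name, classmap):
    """Runtime lookup against the pre-built map: a dictionary hit, no filesystem probe."""
    return classmap.get(class_name)
```

With a prefix registered for two directories and the class living in the second one, `psr4_find` performs two probes per lookup, whereas `classmap_find` performs none once `generate_classmap` has run at build time.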
[08:41:59] it would be a single entry point that scap/docker-build invoke that handles it all in various threads or smth
[08:42:52] ... or we can pay for php-apcu to be improved and allow reading of shared memory directly in a Safe (TM) way.
[08:43:15] https://github.com/krakjoe/apcu/issues/175
[08:45:09] Yeah, TysonAndre has been active on the performance front of the PECL ecosystem lately
[08:46:02] I think for our purposes of extension autoload classmap generation, we won't be needing to use APCu anyways so hopefully that bit should be okay
[08:53:38] yeah, even with a better apcu that wouldn't fit well with runtime generation. It's not impossible but would be awkward. There's a lot of things where apcu is more convenient as a cache than opcache files, but this doesn't seem like one of them.
[08:54:12] Like... would we run a deferred update to generate the apcu key if it's missing :D? Not impossible I guess. but concurrency, and how long it takes to scan the whole code base etc.
[08:54:57] as a one off, we could maybe do it manually on a server once to see, if we dump a full classmap on 1 appserver and leave it pooled for an hour like that, what it might do to load/latencies.
[08:55:06] beyond core that is, e.g. to measure impact of extensions
[08:55:14] I guess one could write a specialized PHP extension that implements a PSR-4 autoloader function that caches class <-> file mappings across request boundaries :P
[08:55:28] Or there's preload, but that was declined in MW
[08:55:33] right
[08:55:44] or a stat cache that lasts longer
[08:56:27] we need HTTP stale-while-revalidate for syscalls
[08:56:33] :)
[08:56:42] yea, something that exposes say `file_exists_persistent()` (great name) could be nice too
[08:57:09] right given immutable deployment, but tricky for local dev.
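The "stale-while-revalidate for syscalls" idea could look roughly like the sketch below. It is Python for illustration only: the `StatCache` class, its method names, and the default windows are all assumptions, not an existing PHP or opcache API. It keeps a per-process cache of existence checks, answers from cache while an entry is fresh, answers from cache but queues a re-check while the entry is stale, and only probes synchronously once the entry is older than the stale limit.

```python
import os
import time


class StatCache:
    """Per-process file-existence cache with fresh and stale windows (hypothetical)."""

    def __init__(self, fresh=1.0, stale=2.0, clock=time.monotonic, probe=os.path.exists):
        self.fresh = fresh      # seconds an entry is served without question
        self.stale = stale      # age limit for serving while queuing a re-check
        self.clock = clock
        self.probe = probe      # the real filesystem check (a syscall)
        self.entries = {}       # path -> (result, checked_at)
        self.pending = set()    # paths to re-check outside the hot path

    def file_exists(self, path):
        now = self.clock()
        entry = self.entries.get(path)
        if entry is not None:
            result, checked_at = entry
            age = now - checked_at
            if age < self.fresh:
                return result            # fresh: no syscall
            if age < self.stale:
                self.pending.add(path)   # stale: answer now, revalidate later
                return result
        # Missing or too old: probe synchronously.
        result = self.probe(path)
        self.entries[path] = (result, now)
        self.pending.discard(path)
        return result

    def revalidate(self):
        """Re-probe queued entries; intended to run off the request's hot path."""
        now = self.clock()
        for path in list(self.pending):
            self.entries[path] = (self.probe(path), now)
        self.pending.clear()
```

On an immutable deployment the windows could be made very long. For local development they would need to stay short, which is the tension raised in the last message above.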
[08:57:28] I've been looking too much at PHP extensions these days as I was busy recompiling the ones that we use with ASAN to try and get a hold of that JIT segfault bug
[08:57:45] opcache has a stat debounce of like 2s or something by default for its revalidation mode that seems good enough for both prod and dev
[08:58:00] yeah and in a container world an immutable variant can be used
[08:58:04] if it did the same for stat cache and e.g. on top of that does revalidations off the main thread, that'd be awesome.
[08:58:52] e.g. opcache_file_exists, naturally in sync with the result of opcache's own mapping and e.g. fresh for 1s and stale for 2s.
[08:59:28] although I don't know if wishing for opcache to do more concurrent writes to its shared memory is a good thing to be wishing for.
[09:00:26] yeah, some of these things could conceivably be simpler/less risky as a per-process (or ZTS thread) implementation, there'd be more memory overhead of course but I think the benefit of not having to deal with an SHM on non-ZTS setups is well worth it
[09:04:10] and since this'd be replacing (parts of) PHP's own request-scoped stat cache, there might not even be a large overhead compared to that since that's not shared across the process/thread pool either.
[09:04:53] yeah, if anything it should reduce memory
[09:16:29] mszabo: Amir1 https://grafana-rw.wikimedia.org/d/000000066/resourceloader?orgId=1&from=now-7d&to=now&viewPanel=45
[09:16:50] I've added an annotation for the classmap improvement {operations}{mediawiki}{performance} which will show up on various dashboards to correlate
[09:17:36] looks like load.php p75 daily bottom went down by several milliseconds to a new low
[09:18:30] nice. resourceloader/load.php was likely to benefit from this kind of change
[09:18:35] *the most
[09:19:07] and even more so during local development on e.g. docker, Win/Mac, mount
[09:19:23] where this was a blocker for load.php debug=1->2 switch
[09:19:40] as it's faster to load 500 files from apache statically in parallel than to make 20 load.php requests that each take >200ms
[09:19:57] as it's faster to load 500 files from apache statically *serially* than to make 20 load.php requests in parallel that take >200ms
[09:20:17] that includes browser req start and JS compile/exec
[09:20:23] in between reqs
[09:20:38] T85805 etc
[09:20:38] T85805: Introduce ResourceLoader debug mode v2 - https://phabricator.wikimedia.org/T85805
[09:20:45] will re-measure later to see where we are on that
[09:20:56] might've pushed it over the line for me, that'd be cool
[09:22:46] yeah, here's hoping :)
[09:30:49] also 2ms off of pageview backend timing, e.g. lower and p50 here: https://grafana-rw.wikimedia.org/d/QLtC93rMz/backend-pageview-timing?orgId=1&from=now-7d&to=now&forceLogin&viewPanel=61
[09:31:02] 36->34ms, 117->115 roughly