[02:58:00] bd808: afaik it isn't expensive by default (stock/core). Looks like CommonsMetadata isn't backed by any secondary table like page props or a link table, but rather requests a ParserOutput for each page. We generally expect one of those per request, not N. Indeed, unless they all happen to be ParserCache hits, that's going to be slow, and even if they are, it'd be serial fetches.
[03:00:18] One of these days you'd expect it to feed into WikibaseMediaInfo instead of runtime scraping.
[03:01:44] Something like {{Infobox/fromWikidata}}, but for licenses and other stuff, instead of duplicating. But that requires a long tail of read systems and write systems to adopt it.
[03:02:46] If that isn't feasible in the medium term, it might be worth looking at storing some of that in page props or a separate table as part of edits / refreshLinks. It'd be pretty big of course, but seems doable / appropriate.
[03:03:08] Or add it to the core file metadata blob…
[03:03:55] (That's currently generated on upload only, not on description page revisions.)
[11:28:21] IIRC there are two layers of caching: the parser cache (which is split between description-page-as-viewed-on-the-wiki and description-page-as-viewed-from-another-wiki; CommonsMetadata uses the latter), and the WAN caching in FormatMetadata::fetchExtendedMetadata() (30 days).
[11:29:15] Of course, if you methodically go through all images via some API generator, your requests mostly won't come from cache.
[12:02:44] Right, yeah, that's a measure to limit load on the infrastructure, not end-user latency, as per the "design for cache miss" principle at https://wikitech.wikimedia.org/wiki/MediaWiki_Engineering/Guides/Backend_performance_practices
[12:25:21] Proposed an i18n patch to document that it's slow, prompted by bd808's surprise: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1135945
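
To make the contrast at [03:02:46] concrete: anything stored in page_props can be batch-fetched in one request (and one indexed table lookup server-side) via prop=pageprops, instead of one ParserOutput per file. A minimal sketch against the Commons API; the example titles are arbitrary, and the commented-out ppprop name is purely hypothetical (no such license prop exists today):

```python
import requests

API = "https://commons.wikimedia.org/w/api.php"
titles = [
    "File:Example.jpg",
    "File:Example.png",
    "File:Example.ogg",
]

# One request resolves page_props for the whole batch; no ParserOutput is
# generated per page on the server side.
resp = requests.get(API, params={
    "action": "query",
    "format": "json",
    "formatversion": 2,
    "titles": "|".join(titles),
    "prop": "pageprops",
    # "ppprop": "license",  # hypothetical prop name that CommonsMetadata could write
}, timeout=30).json()

for page in resp["query"]["pages"]:
    print(page["title"], page.get("pageprops", {}))
```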
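
On the two cache layers at [11:28:21] and the "design for cache miss" point at [12:02:44]: a toy Python stand-in for the get-with-set-callback pattern (not MediaWiki's actual WANObjectCache code), just to show that a long TTL bounds aggregate infrastructure load while a miss still pays the full compute cost inline on the user-facing request:

```python
import time

# Toy stand-in for the caching pattern around FormatMetadata::fetchExtendedMetadata()
# (the real TTL is 30 days). The "design for cache miss" point: the compute callback's
# cost is what a user-facing request must be able to afford, because the cache only
# limits how often that cost is paid in aggregate, not how long a miss takes.
_store = {}

def get_with_set_callback(key, ttl_seconds, compute):
    now = time.time()
    hit = _store.get(key)
    if hit and hit[1] > now:
        return hit[0]                      # cache hit: cheap
    value = compute()                      # cache miss: full cost, paid inline
    _store[key] = (value, now + ttl_seconds)
    return value

def fetch_extended_metadata(title):
    return get_with_set_callback(
        key=f"extmetadata:{title}",
        ttl_seconds=30 * 24 * 3600,
        compute=lambda: expensive_parse(title),  # the ParserOutput-backed scrape
    )

def expensive_parse(title):
    time.sleep(0.5)                        # placeholder for the real parse/scrape cost
    return {"LicenseShortName": "CC BY-SA 4.0"}
```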
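
And the [11:29:15] scenario, methodically going through images via an API generator, where most requests miss both caches. A rough sketch using the standard generator=allimages / iiprop=extmetadata parameters; the batch size, number of batches, and User-Agent string are illustrative:

```python
import requests

API = "https://commons.wikimedia.org/w/api.php"
session = requests.Session()
session.headers["User-Agent"] = "extmetadata-crawl-sketch/0.1 (illustrative only)"

params = {
    "action": "query",
    "format": "json",
    "formatversion": 2,
    "generator": "allimages",   # walk File: pages in batches
    "gailimit": 10,
    "prop": "imageinfo",
    "iiprop": "extmetadata",    # provided by the CommonsMetadata extension
}

for _ in range(3):  # a few batches for the sketch; a real crawl would loop until exhausted
    data = session.get(API, params=params, timeout=30).json()
    for page in data.get("query", {}).get("pages", []):
        meta = page.get("imageinfo", [{}])[0].get("extmetadata", {})
        print(page["title"], meta.get("LicenseShortName", {}).get("value"))
    if "continue" not in data:
        break
    params.update(data["continue"])  # carries gaicontinue into the next request
```

Each fresh page in such a crawl is a cache miss on both layers, so every batch pays the serial ParserOutput cost described at [02:58:00].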