[07:06:03] ebernhardson: yes, the all field is annoying in this context, but I suspect most of the use cases will rely on a keyword; perhaps we can just document this as a known issue for the first iteration. If it's truly annoying we can ship a dedicated query builder that doesn't hit the all fields...
[07:25:14] o/
[07:28:31] o/
[08:00:27] errand
[10:35:50] lunch
[12:48:34] \o
[12:49:47] o/
[12:51:15] hm... less than 2GB free on cindy... 22G in /var/lib/docker, will do some pruning
[12:51:56] at least it complains now instead of failing mysteriously... but yeah, it just generates more and more random cruft to be cleaned up
[12:52:23] indeed
[12:56:03] sigh... "docker.service: Deactivated successfully." because "docker.service: Start request repeated too quickly."
[12:56:15] lol, very helpful
[12:56:16] just ran "docker system prune -a -f"...
[12:58:08] and now it just works...
[12:58:29] ok, started to mess with cindy again, sorry in advance if it's annoying
[12:59:39] random curiosity: in BuildDocument we take a page or a revision. If it's a page we fetch the revision but use WikiPage::isRedirect; if it's a revision we new up a page but use $rev->getContent()->isRedirect(). I wonder why it differs
[13:00:02] i would guess since we allow rev based it should always be from the content
[13:01:03] yes, always wondered how these could differ. I think the is_redirect flag is there to allow fast access and is hopefully accurate enough
[13:02:15] i think it only "should" differ if we are rendering an old revision... hmm, will probably just normalize so it always pulls from the content
[13:02:25] indeed
[13:11:26] hmm, i guess in part this is probably optimizing the SKIP_PARSE use case, used when link counting (which we don't do in cirrus anymore, but external sites do). probably not a big deal
[13:12:33] the saneitizer might check this, but IIRC it should use getContent() already?
[13:14:20] saneitizer does both :) If fastRedirectCheck is set it uses Page, otherwise Content. But we always set that to false in the api methods
[13:19:30] * ebernhardson also realizes buildRegexWithGroovy is somehow still in cirrus... will drop that one
[16:24:15] haven't verified, but this is what claude spit out from a one-shot prompt: https://people.wikimedia.org/~ebernhardson/explain-intitle-match.html
[16:24:20] at first glance, seems plausible
[16:24:51] probably will run into issues with analysis chains though, it wouldn't know how titles get normalized
[16:35:53] yes, sad the API does not allow returning all highlights, not only the ones it thinks are the "best"
[16:37:22] it probably could via a debug flag, although i'm not sure there are any other use cases. I've pondered before a debug api that fronts the analyze api, but again not sure it's all that useful
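
Note on the analyze api mentioned above: a minimal sketch of asking an Elasticsearch-style `_analyze` endpoint how a title gets normalized. It is not the debug API pondered in the log, only an illustration of the underlying call. The base URL, the index name `mywiki_content`, the field name `title`, and all helper names are assumptions made for the example.

```python
import json
from urllib import request


def build_analyze_request(base_url, index, text, field="title"):
    """Build (but do not send) a POST to the index-level _analyze endpoint.

    The index and field names are hypothetical; passing "field" asks the
    cluster to use whatever analyzer is mapped to that field.
    """
    body = json.dumps({"field": field, "text": text}).encode("utf-8")
    return request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def tokens_from_response(response):
    """Return (token, start_offset, end_offset) tuples from a parsed response."""
    return [
        (t["token"], t["start_offset"], t["end_offset"])
        for t in response.get("tokens", [])
    ]


def analyze(base_url, index, text, field="title"):
    """Send the request and return the normalized tokens (needs a live cluster)."""
    req = build_analyze_request(base_url, index, text, field)
    with request.urlopen(req) as resp:
        return tokens_from_response(json.load(resp))
```

Something like `analyze("http://localhost:9200", "mywiki_content", "Some Title")` would then list the tokens the title analysis chain produces, which is the piece a generated explanation of an intitle match would not know on its own.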