[07:41:49] weird... cindy seems to run pig-latin language converters... if you search for endymion there'll be a variant with endymionway we search for as well...
[08:10:27] gehel: I'll start with the weekly report. Does essential work need special treatment?
[08:44:52] pfischer: sorry for the delay. Thanks for working on the report! If I understand correctly, we've currently stopped reporting anything. I'd still like to publish updates on wiki, in particular if we have anything meaningful.
[08:45:32] For this week, we should probably have a note about TheDJ working on MySQL search and moving a lot of tickets around on our boards.
[08:46:11] Maybe something about the discussion we had around relaxing the AND on Search and that we want to start experimenting with a new profile.
[10:35:57] lunch
[10:40:12] gehel: https://wikitech.wikimedia.org/wiki/Search_Platform/Weekly_Updates/2025-09-12 - I tried to separate TheDJ-touched tickets from what was worked on this week. I can’t remember that discussion though. Do you still know the tickets we discussed?
[11:42:36] is there a way with CirrusSearch web API / Special page parameters to affect `minimum_should_match` at query runtime? I got the impression from a quick look at https://codesearch.wmcloud.org/search/?q=minimum_should_match&files=%5C.%28js%7Cphp%29%24&excludeFiles=&repos= and the puppet config but figured I better just ask before going down a rabbit hole!
[11:42:36] asking in the context of use of OR versus proximity search versus other approaches for yielding a larger result set for multi-term searches
[12:04:05] (I recall discussion about more forward positioning of did you mean [as to part of the code search results], although specifically was thinking about if it was adjustable by an internet client, too)
[12:12:33] dr0ptp4kt: of course you can, the question is all about how to deploy it, not really how to pass it to the search engine
[12:12:59] dr0ptp4kt: users today expect that if they type three words, they match three words. editors would have to just notice that their 5 word query now matches only 3 of 5
[12:13:36] which is much more about UI and communication than engineering
[12:18:54] oh definitely ebernhardson , it was more about if there's a supported query parameter today. if not, would that be something I could put a feature request in for (and possibly implement) ? I was thinking it looks not too bad, but I also know the availability of another parameter can sometimes have knock-on effects for what users / programs may do with it
[12:22:38] oh, I also just saw the backscroll a few messages upward where MrG was noting a topic about relaxing AND somewhat...though here am more thinking about the programmer side of things using the api for prototyping
[12:24:24] dr0ptp4kt: yea its profile parameter
[12:24:41] dr0ptp4kt: something like this: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CirrusSearch/+/1187509
[12:27:29] pfischer: the discussion about relaxing AND was during our retrospective, and we decided to experiment with an additional search profile so as not to disturb our users. I'll add something to https://wikitech.wikimedia.org/wiki/Search_Platform/Weekly_Updates/2025-09-12. Thanks for the update!
[12:36:55] oh ha! ebernhardson I didn't realize there was an active patch. I'll be interested to try out that srqiprofile!
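Editorial sketch for the `minimum_should_match` question at 11:42:36: a minimal Python illustration of how a strict "all terms" match query differs from a relaxed one at the OpenSearch query-body level. The helper name `build_match_query` and the field name `text` are hypothetical, and this is not how CirrusSearch assembles its queries; it only shows the `operator` and `minimum_should_match` parameters of a plain `match` query.

```python
import json


def build_match_query(field, text, minimum_should_match=None):
    """Return a match query body (hypothetical helper, illustration only).

    With no minimum_should_match every term is required (operator "and").
    Otherwise terms are optional and at least that many must match.
    """
    params = {"query": text}
    if minimum_should_match is None:
        params["operator"] = "and"
    else:
        params["operator"] = "or"
        params["minimum_should_match"] = minimum_should_match
    return {"query": {"match": {field: params}}}


# "text" is an assumed field name, not taken from the CirrusSearch mapping.
strict = build_match_query("text", "one two three four five")
relaxed = build_match_query("text", "one two three four five", minimum_should_match=3)
print(json.dumps(relaxed, indent=2))
```

The relaxed body corresponds to the "5 word query now matches only 3 of 5" case raised at 12:12:59.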
[12:38:55] ebernhardson: o/ do you know what the prefix like [0-31] in cindy's output like "[0-31] [10:42:44] [=] [0009/0009] [INCIRRUS] V:N" stands for?
[12:39:51] dcausse: hmm, off the top of my head i think its a thread designation from the system
[12:39:55] dcausse: but not 100%
[12:40:01] ok
[12:40:38] dr0ptp4kt: actually that will end up as an srqdprofile, qi is query-independent so had to add a qd for query-dependent. qi affects rescore, qd affects main query builder
[12:42:55] oops looks like we're on the same tmux session :)
[12:43:08] dcausse: i disconnected, was just double checking what the 0-31 was
[12:43:23] and yea, pretty sure thats a thread-id
[12:43:27] np!
[12:43:30] thanks
[13:13:22] sigh... a bit puzzled by cindy with a query that returns different results when run within the test suite and after the test suite...
[13:16:23] o/
[13:57:59] dcausse: i also have the date range filters mostly prototyped, planning to look over options for supporting $wgLocaltimezone today. There is also a question of sub-day precision. We can of course support it, but i suppose i was trying to make things simpler by skipping hours. If you have any thoughts they would be useful: https://phabricator.wikimedia.org/T403593
[13:58:49] sure, looking
[13:59:24] the user timezone we can possibly support this with a dedicated syntax to enable it?
[13:59:38] I was thinking of relying on user-pref rather than site default
[13:59:54] but was not clear about the level of effort required
[13:59:59] dcausse: i have mixed feelings on timezones...i kinda didn't want queries to give different results based on where in the world they were issued from
[14:00:12] yes...
[14:00:20] so i was thinking if the wiki has a $wgLocaltimezone, we could localize all times to the wiki
[14:00:32] otherwise, UTC
[14:01:04] i dont know for sure, but a_smart_kitten said that history page times are always in $wgLocaltimezone
[14:01:10] seems sane to keep aligned with them
[14:02:36] hmm, actually looks like history page varies on preferences?
[14:03:10] yes I think so
[14:03:51] I think I'd go with UTC by default and (possibly later) add a dedicated syntax to switch to user-pref (which should default to $wgLocaltimezone if not explicitly changed)
[14:04:16] automated clients may always prefer UTC I think
[14:04:57] on wiki users might be happy to have a small syntax to express times in their preferred timezones
[14:05:11] i'm not sure how hard it would be, there is a timezone parameter to the range query, but i was hoping to not have to think it through :P
[14:05:31] yea i could see users wanting something that matches, their concept of "friday" might not match UTC
[14:05:32] ebernhardson: I think it's completely to leave this for later
[14:05:36] *fine
[14:06:05] assuming that the syntax allows for introducing this change later
[14:06:25] re time-precision I'm not sure
[14:06:37] haven't looked at php date-parsers
[14:06:45] hmm, the syntax i'm proposing is `(<|<=|>|>=)?(YYYY(-MM(-DD)?)?|now(-\d+[ymd])?)`. We could certainly finagle something in there, or suffix the keyword with `localtz` or something silly
[14:07:57] time precision i suppose i was also trying to avoid edge cases, with now-1d or now-1y, it's all in units of days. now-1h is a different unit. Can certainly handle, just adds some extra complexity
[14:08:28] by units, i guess i mean the result, the result of now-1y is a date, the result of now-1d is a date, but now-1h is a datetime
[14:09:02] oh haven't thought about date vs datetime
[14:09:22] I imagined that internally you would treat datetime anyways
[14:10:18] I have a feeling that users might find it slightly odd to not have hours but just a guess
[14:10:55] i do use the DateTime class for parsing, i'll have to play with the syntax, but its how `now-1d` is passed to opensearch as `2025-09-12||-1d`, but now for hours needs the date and the time. not a huge complication, but just the start
[14:12:47] oh silly me, should have looked at opensearch query capabilities, I simply assumed that you would have to run range queries using timestamps as int
[14:13:40] yea opensearch is actually providing "most" of the functionality. So like >2025 vs >2025-09-12, thats all handled by opensearch. It correctly differentiates between >2025-01-01 and >2025
[14:14:07] although its passed slightly different from this syntax
[14:14:24] indeed it's nice but that assumes we have proper understanding of dates
[14:14:49] and cannot simply resort to $date->toTimestamp() or the like
[14:14:56] indeed :)
[14:15:21] so far i was trying to only validate, but not manipulate, the date provided, instead matching the syntax to a date format opensearch accepts
[14:17:10] i'll ponder the hours bit...maybe it doesn't have to be too complicated as long as we restrict date math to the 'now' keyword
[14:17:40] i suppose i was worried about follow-on complications, but there might not be too many
[14:17:51] sure
[14:18:47] dates are always a minefield
[14:21:29] it's not all that complicated...but i guess i figure it would have way less edge cases with no hours, and only UTC. but maybe i'm just trying to be too strict, it also needs to do what users need.
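Editorial sketch of the syntax proposed at 14:06:45, in Python rather than the PHP the prototype presumably uses. It only validates the expression and rewrites `now-1d` into the `2025-09-12||-1d` form mentioned at 14:10:55. The names `DATE_EXPR`, `RANGE_KEYS` and `parse_date_expr` are hypothetical, the mapping of `<`/`<=`/`>`/`>=` to the range keys `lt`/`lte`/`gt`/`gte` is an assumption, and the meaning of a bare value with no operator is left open because the log does not say.

```python
import re
from datetime import date

# Operators are listed longest-first so "<=" is not read as "<".
DATE_EXPR = re.compile(
    r"(?P<op><=|>=|<|>)?"
    r"(?:(?P<absolute>\d{4}(?:-\d{2}(?:-\d{2})?)?)"
    r"|now(?P<offset>-\d+[ymd])?)"
)

# Assumed mapping from the keyword operators to range query keys.
RANGE_KEYS = {"<": "lt", "<=": "lte", ">": "gt", ">=": "gte"}


def parse_date_expr(expr, today):
    """Validate expr and return (range_key_or_None, value_for_the_query).

    Absolute dates are passed through untouched; "now" is replaced by
    today's date, with any offset appended as date math after "||".
    Month and day digits are not range-checked in this sketch.
    """
    m = DATE_EXPR.fullmatch(expr)
    if m is None:
        raise ValueError(f"unsupported date expression: {expr!r}")
    range_key = RANGE_KEYS.get(m.group("op"))
    if m.group("absolute") is not None:
        return range_key, m.group("absolute")
    value = today.isoformat()
    if m.group("offset"):
        value += "||" + m.group("offset")
    return range_key, value


today = date(2025, 9, 12)
print(parse_date_expr("now-1d", today))
print(parse_date_expr(">=2025-09", today))
```

Because only `y`, `m` and `d` offsets are accepted, `now-1h` is rejected, which matches the "no hours" variant discussed above.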
[14:23:43] the timezone thing we can probably ponder that for later, the hour thing Mickael seemed to want this and personally I think that might be useful but I won't die if it's not there :)
[14:41:30] my test is failing because we're logged in: apparently the completion settings I override in the config are erased by user-prefs...
[14:42:56] might have to change my strategy for testing this
[14:51:03] means that there are some settings that might be hard to a/b test if they're part of user-prefs (default completion profile)
[15:08:35] hmm, i hadn't considered that :S
[15:15:41] no big deal it was just for testing things with cindy but probably something to consider if we plan to add more user-prefs
[15:16:41] mainly reminding myself of discussions with Trey regarding how we might configure second-try-searches and whether we want user-prefs for that or not
[15:49:12] enjoy the weekend!
[15:49:21] sigh... so easy to make mistakes with how php serializes arrays as json... array_unique & array_merge of simple arrays might end up being serialized as {"0": "val", ...} instead of [ "val", ... ]
[15:52:18] yea, thats always been some annoying bits. IIRC the solution is usually to array_values() if it must be a list
[15:54:19] yes
[16:00:23] workout, back in ~40
[17:03:11] back
[17:07:36] started to hand pick examples for dym eval on frwiki, turns out to be a bit trickier than expected...
[17:07:51] will continue next week
[17:07:58] heading out, have a nice week-end
[17:57:02] doh...yesterday i tested that `gte:2025 lte:2025` returns results from 2025...but it only kinda does. it returns results from 2025-01-01 :S Guess i do have to do a bit more date mangling locally
[18:10:14] lunch, back in ~40
[18:14:21] nifty, actually they just require using a rounding syntax, so `gte:2025||/y lte:2025||/y` does the trick
[18:14:52] 122
[18:34:04] * ebernhardson briefly considers `return $this->a() ?? $this->b() ?? $this->c()` to chain null returning functions...but considers probably not :P
[21:03:59] ryankemper do you have anything for pairing? I'm still doing the K8s thing
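Editorial sketch for the rounding trick noted at 18:14:21. In OpenSearch date math, a `/y` suffix rounds to the year; on `gte` it rounds down to the start of the unit and on `lte` it rounds up to the end, so the same anchor on both bounds covers the whole period. The helper name `whole_period_range` and the field name `timestamp` are hypothetical, and extending the idea to `/M` and `/d` by value length is an assumption; the log only confirms the `/y` case.

```python
# Rounding unit chosen from the shape of the value: YYYY, YYYY-MM, YYYY-MM-DD.
ROUNDING_UNIT = {4: "y", 7: "M", 10: "d"}


def whole_period_range(field, value):
    """Build a range clause covering the whole year, month or day in value."""
    unit = ROUNDING_UNIT.get(len(value))
    if unit is None:
        raise ValueError(f"expected YYYY, YYYY-MM or YYYY-MM-DD, got {value!r}")
    # Same anchor on both sides: gte rounds down, lte rounds up.
    anchor = f"{value}||/{unit}"
    return {"range": {field: {"gte": anchor, "lte": anchor}}}


# "timestamp" is an assumed field name for illustration.
clause = whole_period_range("timestamp", "2025")
print(clause)
```

Without the rounding suffix, `gte` and `lte` on a bare `2025` both resolve to the same instant, which is the "results from 2025-01-01" behaviour described at 17:57:02.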