[08:16:58] <gehel>	 I've created T404822 after a semantic search meeting yesterday.
[08:16:59] <stashbot>	 T404822: Analysis: how many search queries are using natural language vs keywords - https://phabricator.wikimedia.org/T404822
[08:17:47] <gehel>	 ebernhardson: Is the description reasonable? 
[08:18:43] <gehel>	 My expectation is that this should be relatively easy to implement. And probably easier to do ourselves than to delegate. If this is more complex than I expect, let me know and I'll see if we can find support.
[08:24:07] <gehel>	 If we work on relaxing AND in queries, it would make sense to create a hypothesis and attach it to WE3.1.
[08:24:54] <gehel>	 Astuthi can help us navigate the administrative complexity, but just a hypothesis is relatively lightweight.
[08:25:08] <gehel>	 Any volunteers to own this hypothesis?
[10:19:19] <dcausse>	 lunch
[13:19:50] <inflatador>	 <o/
[13:28:06] <dcausse>	 o/
[13:39:17] <ebernhardson>	 \o
[14:13:05] <gehel>	 ebernhardson: Debra is joining our Wednesday meeting. If you can be there that's great! Otherwise, we'll keep you updated async
[14:13:33] <gehel>	 ebernhardson: Peter told me you might have a notebook with some data related to T404822 already...
[14:13:34] <stashbot>	 T404822: Analysis: how many search queries are using natural language vs keywords - https://phabricator.wikimedia.org/T404822
[14:14:21] <ebernhardson>	 gehel: first hour i'll be around, it's early so workers won't even be here yet :) But not sure on second hour
[14:14:33] <gehel>	 good enough!
[14:15:12] <ebernhardson>	 gehel: for natural language queries, i don't think i have any particularly relevant notebooks. I've pondered basic things looking for the who/what/why/where/when words, but never got around to it.
[14:15:50] <ebernhardson>	 reading Martin's summary...i would have to spend some time dissecting and understanding the definition :P 
[14:16:54] <ebernhardson>	 for things like "contain a categorical noun phrase immediately preceded by a preposition or relative clause;" i just dunno, i guess some sort of POS tagger? It's something i would have to spend time with
[14:48:01] * pfischer can't make it to the Wednesday meeting tonight
[14:56:01] <gehel>	 ebernhardson: my understanding of the discussion the other day is that we should find a simple heuristic, and we don't care too much about having a super precise categorization
[14:56:02] <mgerlach>	 ebernhardson: I don't think we need/should use that exact definition I shared in T404822 since some of the aspects might not be straightforward to implement. probably something much simpler derived from that description will do the job for our purpose. I haven't thought deeply about this yet but would be happy to connect and brainstorm more, if needed. 
[14:56:03] <stashbot>	 T404822: Analysis: how many search queries are using natural language vs keywords - https://phabricator.wikimedia.org/T404822
[15:01:48] <ebernhardson>	 mgerlach: that makes sense, from my side something like tokenizing / stemming (normalizing) and looking for particular words in historical queries is reasonably easy, but we haven't done anything more complex when it comes to actual language recognition
[15:54:04] <inflatador>	 workout, back in ~40
[16:32:02] <mgerlach>	 ebernhardson: yea, I think tokenizing/normalizing might be enough. from my perspective, the goal should be to come up with some heuristic that's relatively straightforward to implement and that we still believe is meaningful
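A rough sketch of the kind of heuristic discussed above (tokenize, normalize, look for who/what/why/where/when), not anything the team has actually implemented. The function names, the regex tokenizer, the first-token rule, and the trailing "?" check are all assumptions made for illustration; a real analysis would probably use the search stack's own language analyzers rather than a regex.

```python
import re

# Question words taken from the discussion; extending this list is a judgment call.
QUESTION_WORDS = {"who", "what", "why", "where", "when"}

# Hypothetical minimal tokenizer: lowercase runs of letters, digits, apostrophes.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(query):
    """Lowercase the query and split it into simple word tokens."""
    return TOKEN_RE.findall(query.lower())


def looks_natural_language(query):
    """Guess whether a query is phrased as natural language.

    Assumed rule: the first token is a question word, or the raw query
    ends with a question mark. Checking only the first token avoids
    matching keyword queries such as "doctor who".
    """
    tokens = tokenize(query)
    if not tokens:
        return False
    return tokens[0] in QUESTION_WORDS or query.rstrip().endswith("?")


def natural_language_share(queries):
    """Return the fraction of queries flagged by the heuristic."""
    queries = list(queries)
    if not queries:
        return 0.0
    flagged = sum(looks_natural_language(q) for q in queries)
    return flagged / len(queries)
```

Running natural_language_share over a sample of historical queries would give a first estimate for T404822. The rule is deliberately imprecise, in line with the stated goal of a simple heuristic over a precise categorization.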
[16:47:55] <dcausse>	 dinner
[16:54:47] <inflatador>	 back
[16:55:00] <inflatador>	 ryankemper we have a new Envoy update ticket if you wanna have a look when you get in T404867
[16:55:01] <stashbot>	 T404867: Upgrade Envoy to v1.29.12 on wcqs and wdqs hosts - https://phabricator.wikimedia.org/T404867
[17:02:22] * ebernhardson ponders tokenizing queries in spark via http, vs tokenizing queries in relforge by indexing them
[18:32:42] <gehel>	 Reminder: I'm out Friday and Monday (Oscar's birthday + public holiday)
[18:33:14] <gehel>	 pfischer: can you prepare the triage meeting without me and facilitate (as you've been doing anyway lately)?
[18:39:02] <inflatador>	 back