[07:36:02] Gonna be 30 mins late to weds meeting; there’s gonna be a maintenance power outage for the first 6 or so hours of my day so that should ensure I have plenty of battery power for the interview mike and I have after unmeeting
[08:46:16] dcausse: I would be 20mins late for our meeting
[08:46:46] ejoseph: no problem! please ping me when you're back
[09:32:33] i am available now
[09:33:13] ejoseph: ok
[10:49:25] lunch
[13:10:01] greetings
[13:12:49] o/
[13:13:05] ohi! Where is the code for the job that indexes Items quickly after creation for wikidata?
[13:13:05] addshore: not sure what you mean? we had in the past something that bypassed the job queue to index newly created items but this got removed
[13:16:27] Aaah!
[13:16:50] Is there a ticket for it getting removed? / why etc?
[13:17:13] (this has been on my plate to chase up for a while about why things got slow)
[13:17:18] looking but it was removed a long time, did you notice something recently?
[13:17:31] nah, noticed quite some time ago, i just haven't had the time to investigate
[13:17:49] it was removed because now we do a mysql lookup
[13:18:07] Got a link to that?
[13:18:25] looking into archives :)
[13:19:12] TLDR there i guess is that not all entities that need to appear quickly have any searchable data in SQL
[13:19:16] particularly lexemes
[13:19:42] indeed I remember that lexemes do not have that feature
[13:19:53] esp. Forms
[13:20:13] but searching for LXYZ should
[13:26:11] addshore: the code was removed in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CirrusSearch/+/548930 the code to do a mysql lookup was added in T206256
[13:26:12] T206256: Un-redirected items should be instant-indexed - https://phabricator.wikimedia.org/T206256
[13:27:26] we also changed a lot the ways page gets indexed throw the jobqueue that might have increased the latency
[13:27:43] s/throw/via
[13:28:26] hmmmm
[13:28:52] i'm struggling to see how https://github.com/wikimedia/mediawiki-extensions-WikibaseCirrusSearch/blob/master/WikibaseSearch.entitytypes.php#L24-L48 does the terms lookup in sql
[13:29:19] that just looks like lookup up ID then cirrus lookup?
[13:29:36] EntityIdSearchHelper is relying on cirrus?
[13:29:56] EntityIdSearchHelper uses SQL, but only looks up ids
[13:30:45] yes this was the workflow this instant indexing was made for IIRC
[13:30:56] using entity IDs
[13:31:40] Oh, ( and lydia) thought that the instant indexing was for the usecase of using labels as soon as possible
[13:31:50] if not then it's a matter of defining what's acceptable latency for search
[13:31:58] Lydia_WMDE: ^^
[13:32:48] bypassing the indexing pipeline for improving the latency is not a good idea imho
[13:41:58] Hi team - linking this message from the sdaw-search-experiments channel on Slack for those who don't check there often: https://wikimedia.slack.com/archives/C030Q2LL63T/p1648647291702339
[13:43:57] thanks!
[16:04:22] quick workout, back in ~30
[16:43:04] back
[17:29:15] lunch, back in ~30
[17:57:05] back
[19:17:46] huh, curious. For no particular reason i ran the snipped from wikitech:Search that prints long running tasks from the cluster. Most instances report some 50 day old searches, they all seem to refer to a cross-cluster search in one way or another
[19:17:58] we are about to switch versions of elastic, so i guess not worth investigating, but curious
[19:18:33] s/snipped/snippet/, specifically https://wikitech.wikimedia.org/wiki/Search#Tasks_Api
[19:20:22] maybe add a prometheus metric for oldest running task on the host and alert when something suspicious happens. I suspect if these still show up in active tasks they are taking spots in the thread pool, but uncertain since they are cross-cluster searches and may be a bit special
[20:42:07] lunch
[21:18:49] back
[21:41:39] ebernhardson interesting. If you wanna pop a phab ticket for that, could be fun to run down
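
Note on the 19:20:22 idea of tracking the oldest running task per host: a minimal sketch of how that number could be derived, assuming the input is the already-parsed JSON body of an Elasticsearch `GET _tasks` response (nodes keyed by id, each with a `tasks` map whose entries carry `running_time_in_nanos`). The function names and the one-day threshold are hypothetical and chosen only for illustration; this is not the snippet from the wikitech page, and exporting the value to Prometheus is left out.

```python
from typing import Any, Dict, List


def oldest_task_age_seconds(tasks_response: Dict[str, Any]) -> Dict[str, float]:
    """Map each node name to the age, in seconds, of its longest-running task.

    Nodes with no running tasks are reported with an age of 0.0.
    """
    oldest: Dict[str, float] = {}
    for node_id, node in tasks_response.get("nodes", {}).items():
        # Fall back to the node id if the response carries no node name.
        name = node.get("name", node_id)
        ages = [
            task.get("running_time_in_nanos", 0) / 1e9
            for task in node.get("tasks", {}).values()
        ]
        oldest[name] = max(ages, default=0.0)
    return oldest


def suspicious_nodes(
    oldest: Dict[str, float],
    threshold_seconds: float = 86400.0,  # hypothetical threshold: one day
) -> List[str]:
    """Return the sorted names of nodes whose oldest task exceeds the threshold."""
    return sorted(name for name, age in oldest.items() if age > threshold_seconds)
```

The per-node value is what a gauge such as "oldest running task age" would hold; an alert would then fire on whatever threshold turns out to be sensible for the cluster. Whether long-lived cross-cluster search tasks should be excluded is an open question from the discussion above, not something this sketch decides.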