[08:32:38] dcausse: fyi I'm going to merge the fixed alert for the jobqueue
[08:32:56] claime: ack, thanks!
[09:59:36] pfischer: can't remember if we discussed pros&cons of running a separate thread for doing the json parsing/data extraction at: https://gitlab.wikimedia.org/repos/search-platform/cirrus-streaming-updater/-/blob/main/cirrus-streaming-updater-producer/src/main/java/org/wikimedia/discovery/cirrus/updater/producer/graph/RevisionFetcher.java#L85
[10:00:49] was doing a small refactoring and stumbled on this, where it would help to run that step in the same thread as the response callback
[10:02:44] dcausse: just a moment, I’ll have a look
[10:05:15] oh wait, thenCompose is not running async, thenComposeAsync is, but we're not using this
[10:08:33] Yes, that’s right. What kind of refactoring are you working on?
[10:10:54] making small steps to make the RevisionFetcher a bit more generic so that it does not depend on the presence of the revision id
[10:11:42] pushing a small WIP patch in my personal space so that you can see
[10:14:21] https://gitlab.wikimedia.org/dcausse/cirrus-streaming-updater/-/commit/ea0f9f3b1d1679592fc559a1e08fa38ee2d517f0#9ff3487f0c062a9a3ff5cf7615b150ff8d07550f_82_69 is the place where I drop the thenCompose
[10:14:23] Thanks!
[10:15:33] idea is moving getRevisionData to somewhere closer to where the API call is constructed (reason why I rename UriBuilder to CirrusDocEndpoint)
[10:16:05] it's just me exploring, not sure if all this will work out in the end
[10:16:31] so if you have concerns with some of this let me know :)
[10:16:55] lunch
[10:28:28] Looks good to me, I like the idea of encapsulating related business logic and keeping it in one place.
[12:26:38] thanks!
[13:08:25] o/
[13:18:08] dcausse sorry for the short notice... Olja is off the rest of the week and wanted to talk Search Update Pipeline today, so I invited you to a mtg.
I think I have most of what I need based on our discussion last week, so if you can't make it, that's totally fine
[13:30:37] inflatador: I can make it but I think you should invite Peter
[13:31:42] pfischer ^^ I just added you to the invite as well. Should be a similar discussion to last week
[13:34:34] Sure
[13:36:49] Oh, that's indeed on short notice. Have to leave for 1h
[13:40:15] pfischer np, I think we have what we need based on last week's discussion
[14:39:34] dcausse: I would implement the noop_hints_set as part of my efforts to implement redirect handling, if that’s okay with you.
[14:41:50] \o
[14:42:44] pfischer: sure!
[14:42:46] o/
[15:03:33] pfischer: triage https://meet.google.com/eki-rafx-cxi?authuser=0
[15:58:49] sheesh, to evaluate RDFox against the wikidata graph they needed an instance with 2TB of memory
[16:00:22] :)
[16:00:35] wow
[16:04:22] this whole paper makes graph databases seem meh :P https://2023.eswc-conferences.org/wp-content/uploads/2023/05/paper_Lam_2023_Evaluation.pdf
[16:04:42] "To measure the export time, we set a timeout of 4 days ... RDFox is the only triplestore that succeeded in exporting Wikidata within the timeout"
[16:05:39] "GraphDB took 28 days and 8 hours to export Wikidata"
[16:07:10] I already said "wow", I don't know where to go from here. 𝓌ℴ𝓌, maybe?
[16:07:17] lol
[16:46:18] * ebernhardson wishes gitlab would link phabricator from the Bug: line
[17:12:06] I wish git had a nice way to force git mv after the fact if it did not detect the rename in the first place...
[17:12:16] dinner
[17:14:38] sadly git doesn't actually have a "mv" concept, a move is the same as a delete and an add.
[17:15:33] digging into that years ago is how I ended up using `git log -S` for investigation instead of `git log` in many cases
[17:17:57] lunch, back in ~1h
[18:15:31] back
[18:28:27] sigh... we're currently seeing 75k log messages per 10 minutes from cirrus. It looks to be the exact same query over and over again.
Since perhaps 6-28
[18:37:04] * ebernhardson decides to continue ignoring... doesn't seem to be causing any particular problems
[19:13:26] bleh
[19:58:57] * ebernhardson wonders a little why we don't recognize and complain to users directly instead of failing in elasticsearch for constructs like AND, OR, &&
[19:59:48] I guess it's a problem of context... they are valid in some cases and detecting which is annoying
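[editor's note] The 10:05 exchange about `thenCompose` vs `thenComposeAsync` can be illustrated with a minimal standalone sketch (this is not the RevisionFetcher code itself; the futures and lambdas below are purely illustrative). The point from the chat: a plain `thenCompose` stage runs on the thread that completes the upstream future (e.g. the HTTP client's response-callback thread), or on the calling thread if the future is already complete, while `thenComposeAsync` always dispatches to an executor (the common pool by default):

```java
import java.util.concurrent.CompletableFuture;

public class ComposeThreading {
    public static void main(String[] args) {
        String caller = Thread.currentThread().getName();

        // Upstream future is already complete, so thenCompose runs its
        // function synchronously on the thread that attaches it (here,
        // the caller). For a future completed later, it would run on the
        // completing thread instead, e.g. a response-callback thread.
        String syncThread = CompletableFuture.completedFuture("json body")
                .thenCompose(body -> CompletableFuture.completedFuture(
                        Thread.currentThread().getName()))
                .join();
        System.out.println(syncThread.equals(caller)); // true: same thread

        // thenComposeAsync always hands the function off to an executor
        // (ForkJoinPool.commonPool() unless one is supplied), never the
        // caller or completer thread.
        String asyncThread = CompletableFuture.completedFuture("json body")
                .thenComposeAsync(body -> CompletableFuture.completedFuture(
                        Thread.currentThread().getName()))
                .join();
        System.out.println(asyncThread.equals(caller)); // false: pool thread
    }
}
```

This is why dropping the `thenCompose` stage (as in dcausse's WIP patch) keeps the JSON extraction on the same thread as the response callback: only the `*Async` variants introduce a thread hop.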
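[editor's note] The 17:12-17:15 thread about `git mv` can also be sketched concretely (a throwaway demo in a temp repo; file names and contents are invented for illustration). As ebernhardson says, git stores a rename as a delete plus an add and only infers "rename" at diff time, which is why `git log -S` (the pickaxe, which searches for commits that change the number of occurrences of a string) survives renames that plain path-limited `git log` does not:

```shell
# Demo: git records a rename as delete+add; rename detection is a
# diff-time heuristic, not something stored in the commit.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q repo
cd repo
git config user.email demo@example.com
git config user.name demo

echo 'needle in haystack' > old.txt
git add old.txt && git commit -qm 'add old.txt'

git mv old.txt new.txt
echo 'more text' >> new.txt   # editing in the same commit can defeat detection
git commit -qm 'rename and edit'

# Path-limited log stops at the rename commit:
git log --oneline -- new.txt
# --follow re-runs rename detection per file and may recover the history:
git log --oneline --follow -- new.txt
# -S finds the commit that introduced the string, regardless of renames:
git log --oneline -S 'needle' --all
```

`--follow` is the closest thing to "forcing" the rename to be seen after the fact, but it is still heuristic; the pickaxe works even when the content changed too much for detection to fire.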