[08:47:28] addshore: Cool diagram!
[08:47:42] Would you say it's roughly accurate?
[08:47:50] I think the updater side is a bit confusing.
[08:48:03] yeah, I think I might colour code and/or number some of the lines
[08:48:08] Or, have multiple diagrams
[08:48:15] We've been bad at naming the components for the new Streaming Updater, which adds to the confusion
[08:48:44] I'm not sure what "repository" means. A Wikibase instance?
[08:48:51] yup
[08:49:02] a Wikibase Repository
[08:49:27] https://wmde.github.io/wikidata-wikibase-architecture/Glossary.html#wikibase-repository
[08:49:34] I should probably use the full name if it is missing!
[08:49:59] The streaming updater has 2 components: the Flink side that creates an RDF stream, and the updater that consumes it on each Blazegraph instance.
[08:50:56] Whereas the old updater gets the list of updates from recent changes or Kafka and gets the actual data from Wikibase. That part isn't super clear on the diagram and is probably worth exposing, since it's a major difference
[08:51:17] Awesome, let me add a comment and I'll take another pass at it
[08:51:52] 2 different diagrams for the streaming vs old updaters might make sense. Or at least a better way to see the differences (the color coding might be enough)
[08:52:52] Once I get this one in, I'll be heading straight to the "deployment view" and try to show off Wikidata.org, wikibase-docker and also wbstack style deployment
[08:53:14] and then also some "runtime views" etc
[08:56:24] addshore: looking forward to it all!
[08:56:56] ejoseph: are you around? I forgot to schedule our 1-on-1 for this week. Ping me when available
[08:57:14] Good morning
[08:58:36] ejoseph: meet.google.com/suc-fqgd-eip
[08:58:56] I will be available in 30 minutes
[08:59:11] ok, I'll be there!
[10:24:13] ejoseph: for reference: https://www.mediawiki.org/wiki/Code_Health_Group/projects/DevEd/Workshops
[13:13:41] gehel: so the Flink side actually also talks to the Wikibase repository to get RDF, right?
[13:14:50] addshore: correct
[13:15:45] and the Blazegraph side only consumes the RDF stream, but never talks directly to Wikibase. The updater running next to Blazegraph becomes a completely dumb piece of code that just blindly applies the RDF stream.
[13:16:31] dcausse and zpapierski might have better ideas as to the naming. I think we've talked about both the Flink part and the updater next to Blazegraph as "the updater", which is confusing.
[13:19:30] ejoseph: have you seen the email about the cultural orientation?
[13:20:28] Yes, I did
[13:20:30] let us know if it conflicts with any of the team meetings
[13:20:44] (by rejecting the team meetings, so we'll know we don't need to wait for you)
[13:22:00] Oh ok
[13:35:45] I might keep the high level diagram as something like this, using lower level diagrams to dive into the details https://i.imgur.com/2TkiUCZ.png
[13:40:00] addshore: seems reasonable
[14:01:36] ejoseph: how is your train refactoring going? do you have a link to a newer version?
[14:03:58] addshore: both are called streaming-updater-X, X being producer (Flink) or consumer (the on-instance process). P.S. I'm not here.
[14:04:07] xD
[15:36:23] in theory, could the Flink thing consume and emit a stream to and from something other than Kafka? I'm guessing yes based on https://flink.apache.org/
[15:37:09] addshore: the people who would know are all out today, but probably
[15:38:01] cool, I'll come back another day with that question ;)
[15:46:51] gehel: my Java knowledge is limiting, I'm pushing to GitHub
[15:47:40] addshore: yes, but probably not out of the box. Ask again tomorrow, you'll have a more detailed answer
[16:12:23] ejoseph: I've added you to the #q2-fy2122-onboarding channel on Slack. It is meant as a private and friendly place for a few of the people who joined this quarter. At this point it is only shared between Search Platform and WMCS.
[16:12:37] Feel free to ask any questions you might have in there!
[16:16:48] addshore: in theory, yes, and even without much work - we kept the pipeline logic and source/sink configuration separate. In practice, we didn't really test anything else. If you want I can give you a tour around the relevant code, when I'm here (because I'm still not here).
[16:18:56] gehel: I did not get any notification on Slack
[16:19:23] Oh, I see it now
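
Note on the 16:16:48 answer: the point about keeping pipeline logic separate from source/sink configuration can be illustrated with a small sketch. This is not the Streaming Updater code (which runs on Flink); it is a plain-Python illustration, and every name in it (`ChangeEvent`, `RdfPatch`, `produce`, `run_pipeline`, `consume`, `fetch_rdf`) is hypothetical. It assumes only what the log states: the producer side is the one that fetches RDF from the Wikibase repository, and the consumer side applies the resulting stream without ever contacting Wikibase.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, Tuple

Triples = Tuple[Tuple[str, str, str], ...]


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change notification: which entity changed, at which revision."""
    entity_id: str
    revision: int


@dataclass(frozen=True)
class RdfPatch:
    """Hypothetical element of the RDF stream handed to the consumer."""
    entity_id: str
    revision: int
    triples: Triples


def produce(events: Iterable[ChangeEvent],
            fetch_rdf: Callable[[str, int], Triples]) -> Iterator[RdfPatch]:
    """Producer-side logic: the only place that asks the repository for RDF."""
    for event in events:
        yield RdfPatch(event.entity_id, event.revision,
                       fetch_rdf(event.entity_id, event.revision))


def run_pipeline(source: Iterable[ChangeEvent],
                 fetch_rdf: Callable[[str, int], Triples],
                 sink: Callable[[RdfPatch], None]) -> int:
    """Wire any source to any sink; the logic in produce() does not change."""
    count = 0
    for patch in produce(source, fetch_rdf):
        sink(patch)
        count += 1
    return count


def consume(patches: Iterable[RdfPatch], store: Dict[str, Triples]) -> None:
    """Consumer-side logic: apply each patch as-is, never calling the repository."""
    for patch in patches:
        store[patch.entity_id] = patch.triples
```

Here `source` can be any iterable and `sink` any callable, so a Kafka-backed pair could be swapped for in-memory lists in a test, which is the kind of substitution the question at 15:36:23 was asking about. As the answer says, nothing other than Kafka was actually tested in the real pipeline.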