[09:50:02] pfischer: do we already have a metric somewhere about T341227 ?
[09:50:02] T341227: Make local_sites_with_dupe filter configurable and count duplicates - https://phabricator.wikimedia.org/T341227
[09:51:18] The config must be updated for my changes to become active. I forgot to open that CR.
[09:51:36] Once active, CirrusSearch should produce that metric.
[09:52:09] Looks like we'll publish to graphite under CirrusSearch.results.file_duplicates
[09:53:33] but no data published yet: https://graphite.wikimedia.org/?width=1613&height=807&target=MediaWiki.CirrusSearch.results.file_duplicates.count&from=-24days
[09:53:41] I'll push this back to in progress
[09:55:53] Weekly update published on https://wikitech.wikimedia.org/wiki/Search_Platform/Weekly_Updates/2023-08-25
[11:19:01] lunch
[12:33:32] gehel: Will you record the Learning Circle meeting later on? I'm definitely interested but I'm afraid I won't be able to make it.
[12:43:53] Yep, we'll record
[13:40:00] ebernhardson: learning circle in https://meet.google.com/nqr-ohpq-usa
[15:09:17] gehel: doh, forgot :(
[15:59:34] workout, back in ~40
[16:08:16] curious... /wmf/data/discovery/cirrusindex is world readable, but /wmf/data/discovery/cirrus/index_without_content is not
[16:49:47] back
[17:34:52] hmm, while copying over what we did to make rdf world-readable in hadoop to cirrus, i'm left wondering if we have to be careful in any way. This dump contains the text content of all private wikis, which isn't exactly public, but hadoop access is already limited
[17:36:15] and the data is already readable from elasticsearch directly if you are logged into any hadoop node
[17:44:18] ryankemper FYI we're getting errors after the latest change to the data-transfer cookbook: https://phabricator.wikimedia.org/P51433 . Looks like the function we're using might not be getting the time info "for free", so we probably need to look at that sometime
[18:21:32] lunch, back in ~40
[18:52:28] * ebernhardson realizes while thinking through these helmfile bits that it's not wholly clear how multi-dc handover works with external inputs (article/draft topic, recommendation create)
[18:53:11] while the mediawiki events move between datacenters, i think those all get produced from things that only exist in eqiad
[18:54:08] also not having 1 environment (dc) = 1 cluster (aka, cloudelastic) makes this more annoying :P
[18:54:41] back
[18:57:39] inflatador: you may've already seen this, fork of terraform: https://opentf.org/announcement
[19:03:50] ryankemper awesome! I noticed there's already an open fork of terraform itself ( https://github.com/libre-devops/open-terraform )
[19:04:21] hopefully everyone gravitates towards a single fork
[19:54:07] i don't know how dumb this is, and some of the args are probably wrong, but here is how i think the streaming updater helmfile could work out: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/951960
[19:55:07] It suggests the method of configuring the updater isn't flexible enough. If the updater read its config from a yaml file it would fit a bit more naturally into the helmfile ecosystem
[19:56:11] pain points are, for example, how the kafka servers and the kafka consumer id are serialized into a single argument, so they must always be defined together. The arg is actually a generic map that gets passed to kafka, so we couldn't easily centrally define kafka client config
[19:56:23] if it were a nested map in yaml it would naturally merge
[20:18:45] oooh nice!
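(A minimal sketch of the nested-map idea discussed above; the key names, file names, and broker address are illustrative assumptions, not the actual chart schema or cluster config.)

```yaml
# values.yaml (shared defaults) -- hypothetical keys, placeholder broker
kafka:
  bootstrap.servers: kafka-main1001.eqiad.wmnet:9092
  security.protocol: SSL

# values-eqiad.yaml (per-environment override, merged on top of the defaults)
kafka:
  group.id: cirrus_streaming_updater_eqiad
```

Because helmfile deep-merges values files, a nested kafka map like this would let the consumer group be set per environment without repeating the broker list, whereas a single serialized argument forces both to be defined together.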
[20:35:09] ebernhardson: thank you, that looks good! Regarding the configuration: both consumer and producer support flattened maps. For example, https://people.wikimedia.org/~pfischer/resources/producer.properties specifies multiple config properties for kafka-sink-config, which are merged as a map and forwarded to kafka.
[20:36:22] pfischer: oh neat, i'll see if i can have it emit a properties file then
[20:41:49] Sure, that's yet another alternative for passing the config. You could mount a config map as a properties file and pass its location as the first argument
[20:47:10] looks like CI is broken
[20:47:22] https://grafana.wikimedia.org/d/000000322/zuul-gearman?orgId=1&viewPanel=21
[20:47:41] I pinged in releng but I don't think anyone's around. Anyone else seeing problems with zuul?
[20:51:02] hmm, it worked for me half an hour ago
[20:51:13] ohh, yea that looks unhappy :)
[20:51:46] from the site, it seems like some big mediawiki patch sets were uploaded
[20:54:01] ebernhardson yeah, James-F just posted a mea culpa... that's fine, wanted to make sure it wasn't broken
[21:00:30] Regarding kafka, there's still some work to be done: if we want to leverage flink's parallel execution (each instance automatically picking only a partition of the messages) we have to make sure all messages from all sources are keyed properly. Currently, that's only the case for page_change. That said, currently all topics we intend to subscribe to use only one partition. So I wonder if it's worth the effort to ask all producers to a) use a consistent key and b) partition their topics. It should be possible to start with a single partition per topic and only start partitioning when we run into exploding lags.
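(A rough sketch of the mount-a-ConfigMap approach pfischer describes at 20:41; the resource names, mount path, image, and property keys are assumptions for illustration, not the actual chart or producer.properties contents.)

```yaml
# ConfigMap holding a flattened kafka properties file for the updater.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cirrus-streaming-updater-config
data:
  updater.properties: |
    # hypothetical flattened map; keys under this prefix are merged and
    # forwarded to the kafka client
    kafka-sink-config.bootstrap.servers=kafka-main1001.eqiad.wmnet:9092
    kafka-sink-config.compression.type=snappy
---
# Pod mounting the ConfigMap and passing the file's location as the
# first argument to the updater.
apiVersion: v1
kind: Pod
metadata:
  name: cirrus-streaming-updater
spec:
  containers:
    - name: updater
      image: example/cirrus-streaming-updater:latest   # placeholder image
      args: ["/etc/updater/updater.properties"]
      volumeMounts:
        - name: config
          mountPath: /etc/updater
  volumes:
    - name: config
      configMap:
        name: cirrus-streaming-updater-config
```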