[00:04:01] still need to make these patches more reasonable, and write some tests, but this at least passes the current suite now: https://gitlab.wikimedia.org/repos/search-platform/cirrus-streaming-updater/-/commits/work/ebernhardson/weighted-tags
[09:59:16] lunch
[10:03:34] Lunch 2
[12:13:33] @team: I'll be late for our office hours, conflicting meeting. I'm sure you can handle it without me!
[13:51:59] o/ - it's me again, just wanted to hear what the latest thinking is about moving away from ElasticSearch; are you intending to move cirrus to use opensearch? Is there some ticket I can't find that I should follow?
[13:54:15] Mostly interested because the helm charts we were using for old-ish ElasticSearch are now finally incompatible with the version of kubernetes we run so we're trying to figure out where we move to next
[13:54:54] i.e. how much investment we make in keeping ES 7.10 a permanent thing for us
[14:00:26] tarrow: no decision at this point, the ticket to follow should be T272111 and possibly T280482
[14:00:27] T280482: Validate that OpenSearch is a viable replacement for Elasticsearch for CirrusSearch - https://phabricator.wikimedia.org/T280482
[14:00:27] T272111: Elasticsearch, a CirrusSearch dependency, is switching to SSPL/Custom licence - https://phabricator.wikimedia.org/T272111
[14:00:53] tarrow: if you'll be around in an hour, we'll be having office hours, but like dcausse said while I was typing, no decision yet. https://etherpad.wikimedia.org/p/Search_Platform_Office_Hours
[14:04:08] inflatador: guessing that you're not around, I'm canceling the pairing session but ping me if I'm wrong
[14:04:10] dcausse: cool; I'd dug up those! Sounds like you expect to be on 7.10 for "a while"?
[14:04:50] tarrow: yes, moving to opensearch won't happen in the next few months
[14:04:59] Trey314159: Thanks! I thought it might be today; I'm expecting to be "done with work" by then and seeing some daylight but if not I'll drop by :)
[14:06:44] dcausse: did you have any cases where there were CVEs in 7.10 (this is still what we run, right?) but no published fixes since elastic.co seem to have washed their hands of it?
[14:10:58] tarrow: not really, at least none of the vulnerabilities were actually impacting us (last might have been the log4j issues https://discuss.elastic.co/t/apache-log4j2-remote-code-execution-rce-vulnerability-cve-2021-44228-esa-2021-31/291476/1) but we applied a workaround on our setup
[14:16:54] cool; it was definitely a bit of a wake up call to us to find that the upstream helm charts are now totally defunct.
[14:17:43] but we are also debating trying https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s_learn_more_about_eck.html which offers a 7.x version (under a SSPL :/)
[14:19:34] if you're OK with the license I guess this could work, regarding ES version compatibility we never tested something above 7.10.2 but I'm sure some third-party installs have something more recent
[14:29:41] hm... but using wikibase you might have installed the wmf plugins (search-extra & highlighter) for which I don't think we have a compatible version for more recent ES versions
[14:47:40] how essential are these plugins? We've always been using them but we struggled to really understand the user impact of not having them
[14:52:38] tarrow: WikibaseCirrusSearch would not work without them, building them for a newer es version might be doable tho
[15:31:58] Planned power outage rn, will hop in meeting(s) when it resolves
[16:29:17] ebernhardson: How are you coming along with refactoring the InputEvent? Just would like to avoid major merge efforts, if possible.
[16:36:41] pfischer: talking to david about it now in https://meet.google.com/vgj-bbeb-uyi
[16:40:08] dcausse Sorry about that, I thought I declined already.
[16:40:15] inflatador: no worries!
[17:07:33] ebernhardson: sorry, I have to prepare some food first
[17:08:50] pfischer: no worries. The short answer is i have the existing work up on https://gitlab.wikimedia.org/repos/search-platform/cirrus-streaming-updater/-/commits/work/ebernhardson/weighted-tags but there are still some awkward edges to work out, and a little bit of splitting the patches up to make it more reviewable
[18:46:54] for the data-transfer cookbook, it bails if the file already exists on dest. Is that OK, or should we add a "force" option? I can't decide if it's worth it
[18:53:05] inflatador: my $.02: should add a force option, or possibly even make it overwrite by default (perhaps prompting operator for confirmation before proceeding)
[18:56:32] ryankemper ACK, we can work on that at pairing. It's not native to the transferpy library but still should be easy to implement
[19:10:46] internet is a bit flaky here
[19:47:45] Also kinda wondering about the options for data-transfer...is there a scenario in which we'd want to only xfer categories or blazegraph, but **not** both?
[19:48:00] er....delete that "only"
[20:03:58] inflatador: wcqs does not have categories
[20:05:44] ACK, I was not going to touch the 'commons' part. Maybe default to xferring both categories and blazegraph if the user gives the 'wikidata' option?
[20:07:40] categories and wdqs are two different services so it makes sense to allow transferring them separately, as for the default I guess it's up to you to ponder in what conditions we're going to use the script the most, if for provisioning new machines most of the time then transferring both by default makes sense
[20:08:05] if it's for the transfers following a data-reload then no
[20:08:58] maybe we could allow specifying more than one instance at the command-line or something? Hmm
[20:09:20] mostly thinking about provisioning new machines, y
[20:09:34] easy to forget transferring categories in that scenario
[20:12:49] transferring with encryption is working DC-to-DC, about to try cross-DC
[21:07:20] cool!
[22:05:32] quick break, back in ~20
[22:48:34] sorry, been back
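
Editor's sketch of the data-transfer options discussed between 18:46 and 20:09: a "force" switch for files that already exist on the destination, more than one instance per run, and a default of both blazegraph and categories for 'wikidata'. This is only an illustration under assumptions. The flag names (--dataset, --instance, --force) and every function name are hypothetical; none of it is the actual cookbook code or the transferpy API. Only the 'wikidata' and 'commons' option names and the fact that wcqs has no categories come from the log.

```python
import argparse
import os

# Hypothetical mapping: which instances each dataset option expands to by default.
# Per the log, wcqs ("commons") has no categories, so only "wikidata" gets both.
DEFAULT_INSTANCES = {
    "wikidata": ["blazegraph", "categories"],
    "commons": ["blazegraph"],
}


def build_parser():
    """Build a parser with hypothetical flags for the options under discussion."""
    parser = argparse.ArgumentParser(description="data-transfer option sketch")
    parser.add_argument("--dataset", choices=sorted(DEFAULT_INSTANCES), required=True)
    # One or more instances; leaving the flag out means "use the dataset default".
    parser.add_argument("--instance", nargs="+", choices=["blazegraph", "categories"])
    parser.add_argument("--force", action="store_true",
                        help="overwrite files that already exist on the destination")
    return parser


def resolve_instances(args):
    """Return the instances to transfer, defaulting to everything the dataset has."""
    available = DEFAULT_INSTANCES[args.dataset]
    requested = args.instance or available
    unsupported = [name for name in requested if name not in available]
    if unsupported:
        raise ValueError(f"{args.dataset} has no {', '.join(unsupported)} instance")
    return list(requested)


def should_transfer(dest_path, force, exists=os.path.exists):
    """Keep the current bail-if-present behaviour unless force is set."""
    return force or not exists(dest_path)
```

With this shape, provisioning a new machine with only `--dataset wikidata` picks up categories automatically, while a transfer following a data-reload can pass `--instance blazegraph` to skip them. Prompting the operator before overwriting, as suggested at 18:53, would sit where `should_transfer` returns False.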