[08:55:53] o/ dcausse: Erik ran into yet another issue with the ES connector, this time it’s about the ES client library: the size calculation/estimation of bulk actions does not work as expected for actions involving large script-parameter values, which is what we do when passing the whole document as a parameter value. Now I was wondering what to do: 1) evaluate if we can always use the `upsert` property instead of a script
[08:55:53] parameter (and adapt the extra plugin accordingly) or 2) write an upstream patch that includes the parameters in the size estimation.
[08:59:35] pfischer: o/, yes, was just reading the backscroll... it's a bit unfortunate :/ at first glance it seems like option 2 is the right way, but I haven't dug into the details so can't really judge. Is this estimation problem happening in the flink-connector or in the elastic client?
[09:00:55] That’s part of the elastic client: `org.elasticsearch.action.bulk.BulkRequest#internalAdd(org.elasticsearch.action.update.UpdateRequest)`, line 192
[09:02:40] Another approach like 1) could be to use the script `source` property, which is currently set to the same value as the script `language`: `super_detect_noop`. After all, that is considered in the size estimation.
[09:04:52] So in `org.wikimedia.discovery.cirrus.updater.consumer.graph.UpdateElasticsearchEmitter#createUpdateRequest`, line 133, we could pass the document as the `idOrCode` parameter.
[09:07:13] but won't it double the actual request size, making the estimation still incorrect?
[09:07:49] No, if we no longer pass the doc as a script parameter but only as `idOrCode`, we’re fine.
[09:08:31] oh, you mean changing how the script params are handled
[09:08:47] We already pass the doc twice (in case we allow an upsert), but the `upsert` field is correctly taken into consideration when calculating the size
[09:09:01] yes, exactly
[09:09:58] The extra plugin would have to support both ways of passing the doc, as long as cirrus is writing to it (and has not been adapted)
[09:12:18] But you are right, the `idOrCode.length()` is multiplied by 2, so the ES client systematically over-estimates the script size
[09:13:33] overestimating is perhaps acceptable?
[09:13:54] Presumably, yes.
[09:13:57] looking at the codebase to refresh my mind
[09:14:24] Sure, thanks, sorry for throwing that at you on a Monday morning.
[09:15:22] np! :)
[09:16:36] the *2 is probably there to take into account the overhead of non-ascii chars in the script, perhaps?
[09:17:34] so I don't see why it would not work; one drawback is that it has to be a string, so encoding JSON inside a JSON string
[09:18:06] Yes, that’s true. Looking into the option to intercept BulkRequest creation to override the size estimation, but it doesn’t look like that’s an option
[09:21:50] actually we might just be able to pass an unencoded JSON block
[09:22:53] What do you mean by unencoded?
[09:23:02] https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/script/Script.java#L153
[09:23:53] Ah, so ES parses it as JSON if applicable, cool
[09:23:53] it seems like the Script is able to parse the field as if it were nested JSON
[09:24:44] Hm, the plugin API only gets passed a String `scriptSource`
[09:24:51] ah, but that does not change much since we use elastic classes to create the request
[09:27:55] yes, that's not applicable to us sadly, sorry. Could have worked perhaps to save some bytes in transfer if we had built the update request ourselves
[09:30:51] pfischer: I think your idea is correct, we could still file a bug upstream regarding this estimate
[09:30:54] oops
[09:31:08] pressed enter too early
[09:31:51] Alright, thanks for looking into it. I’ll extend the plugin then.
[09:32:48] I meant that we could try to file a bug upstream to see if they consider this a bug or not, but it feels like, if we change how we handle params, we might not go back to this "source" script param
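A minimal sketch of the three request shapes discussed above, assuming the 7.x-era server classes: `BulkRequest#internalAdd(UpdateRequest)` counts the `doc`/`upsert` source bytes plus `script.getIdOrCode().length() * 2`, but ignores script params entirely. The index name, document id, field names, and the `source` param key are illustrative assumptions, not the exact request built by `UpdateElasticsearchEmitter#createUpdateRequest`:

```java
import java.util.Map;

import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.script.Script;
import org.elasticsearch.script.ScriptType;

public final class BulkSizeEstimationSketch {

    public static void main(String[] args) {
        // Stand-in for the full document shipped with every update;
        // field names are illustrative only.
        Map<String, Object> doc = Map.of("title", "Example", "text", "...");
        String docJson = "{\"title\":\"Example\",\"text\":\"...\"}";

        // Status quo: the document travels as a script *parameter*.
        // internalAdd never looks at params, so the estimate stays far
        // below the real payload size and the connector flushes too late.
        UpdateRequest viaParams = new UpdateRequest("idx", "42")
                .script(new Script(ScriptType.INLINE, "super_detect_noop",
                        "super_detect_noop", Map.of("source", doc)));

        // Alternative discussed above: embed the JSON document in the
        // script source (idOrCode). idOrCode *is* counted -- twice, in
        // fact, so the estimate over-shoots instead.
        UpdateRequest viaSource = new UpdateRequest("idx", "42")
                .script(new Script(ScriptType.INLINE, "super_detect_noop",
                        docJson, Map.of()));

        // Option 1: additionally pass the doc as upsert, whose source
        // length is included in the estimate as well.
        UpdateRequest viaUpsert = new UpdateRequest("idx", "42")
                .script(new Script(ScriptType.INLINE, "super_detect_noop",
                        "super_detect_noop", Map.of("source", doc)))
                .upsert(doc)
                .scriptedUpsert(true);

        BulkRequest bulk = new BulkRequest();
        bulk.add(viaParams);
        bulk.add(viaSource);
        bulk.add(viaUpsert);
        // Only the second and third actions contribute their document
        // bytes to this estimate.
        System.out.println(bulk.estimatedSizeInBytes());
    }
}
```

Over-estimation from the doubled `idOrCode` only makes the connector flush earlier than strictly needed, which is the safe direction, consistent with "overestimating is perhaps acceptable" above.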
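And a purely hypothetical sketch of the "plugin would have to support both ways" point: apart from the `ScriptEngine#compile` signature itself (which is why the plugin only ever sees the script source as a `String`), the class name, detection heuristic, and fall-back below are assumptions about the extra plugin's internals, not its actual code:

```java
import java.util.Map;

import org.elasticsearch.common.bytes.BytesArray;
import org.elasticsearch.common.xcontent.XContentHelper;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.script.ScriptContext;
import org.elasticsearch.script.ScriptEngine;

// Hypothetical transition shim for the extra plugin's script engine.
public class DualSourceNoopEngine implements ScriptEngine {

    @Override
    public String getType() {
        return "super_detect_noop";
    }

    @Override
    public <T> T compile(String name, String code, ScriptContext<T> context,
                         Map<String, String> options) {
        // Historically `code` is just the constant "super_detect_noop";
        // if the updater starts embedding the document JSON here instead,
        // detect and parse it so both callers keep working while
        // CirrusSearch still sends the doc via script params.
        Map<String, Object> fromSource = null;
        if (code != null && code.trim().startsWith("{")) {
            fromSource = XContentHelper
                    .convertToMap(new BytesArray(code), false, XContentType.JSON)
                    .v2();
        }
        // ... fall back to the script params when fromSource is null,
        // then build the noop-detection factory as before (omitted).
        throw new UnsupportedOperationException("sketch only");
    }
}
```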
[10:21:12] dcausse: I'm lost with T249989. I vaguely remember that you looked into it and that the same issue was happening on regular wikis, but that's not in the task (or at least not clearly).
[10:21:13] T249989: Disable native OpenSearch suggestions in Wikibase wikis - https://phabricator.wikimedia.org/T249989
[10:21:46] dcausse: could you have another look and see if we can add enough context so that it is clearer where to route this task?
[10:23:46] gehel: looking
[11:15:09] lunch
[14:12:15] dcausse: thank you for the review, I followed your comments
[14:12:41] pfischer: thanks! will take another look shortly
[14:45:43] pfischer: hello :) You got added as a reviewer to my change to the parent-pom, "Move checkstyle:check from test to validate phase" | https://gerrit.wikimedia.org/r/c/wikimedia/discovery/discovery-parent-pom/+/979322 :)
[14:46:12] that should cause builds to fail earlier when there is a style issue
[14:46:53] given you got added last, I guess you are tasked with approving that change
[16:01:22] pfischer: triage: https://meet.google.com/eki-rafx-cxi
[16:35:15] dcausse: do we have a place for the super_detect_noop API spec? The extra plugin’s README doesn’t contain usage instructions so far.
[16:36:15] Ah, found it.
[16:45:48] .dongers
[16:46:03] sorry, macro misfire ;)
[16:48:10] o/
[16:57:49] o/
[16:58:56] I’m out for today.
[17:00:32] regarding certs, flink and cloudelastic: since we use the service mesh, we might have to bump the flink-app chart with the new vendored mesh config template from https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/981341
[17:03:33] dcausse ACK, I guess we need to wait for that one to get merged first?
[17:03:52] inflatador: yes
[17:08:50] I wonder if there's a kink of debmonitor for images in our docker-registry
[17:08:57] s/kink/kind
[17:09:09] dcausse: debmonitor also has docker images
[17:09:15] oh?
[17:09:51] https://debmonitor.wikimedia.org/images/
[17:11:06] nice
[17:11:20] volans: nice, thanks!!
[17:11:27] no prob :)
[17:13:38] !issync
[17:13:38] Syncing #wikimedia-search (requested by gehel)
[17:13:40] Set /cs flags #wikimedia-search zpapierski -AVfiortv
[17:13:42] Set /cs flags #wikimedia-search dr0ptp4kt +AVfiortv
[17:15:17] so the flink image itself does not have wmf-certificates, so it might not allow accessing cloudelastic directly, but if we use envoy we might not need that
[17:22:29] hm... can't find https://docker-registry.wikimedia.org/repos/search-platform/cirrus-streaming-updater/producer/tags/ in debmonitor, perhaps it only indexes the "latest" tag?
[17:28:31] updated T352335
[17:28:32] T352335: Deploy the new Cirrus Updater to update select wikis in cloudelastic - https://phabricator.wikimedia.org/T352335
[17:28:35] dinner
[18:39:07] lunch, back in ~40
[19:07:49] back
[19:31:51] inflatador: do you need me for any more ISS syncs?
[19:34:35] gehel no, I think we're good. Thanks for your help
[19:35:29] Just found this one by accident... looks like the wikidata query web UI stuff is going to k8s sooner or later: https://phabricator.wikimedia.org/T350793
[19:42:48] https://gerrit.wikimedia.org/r/c/operations/puppet/+/982138/ yet another LDF CR for review, if anyone has time
[19:53:33] appointment, back in ~90
[21:28:12] back
[23:04:55] Are we expecting nginx to log on the elastic hosts? The access.log files are empty