[10:55:58] lunch
[13:05:44] \o
[13:21:08] o/
[13:23:17] started working on 2.x versions of the plugins, so far not terrible
[13:36:15] turns out opensearch still does the mildly pointless renames :P like some (all?) of org.opensearch.common moved to org.opensearch.core
[13:36:26] sigh...
[14:05:21] of course opensearch-py uses urllib directly and ignores REQUESTS_CA_BUNDLE :/
[14:05:39] meh, yea that's annoying
[14:06:38] passing ca_certs=os.getenv("REQUESTS_CA_BUNDLE") when creating the client, not particularly great but at least consistent with how we pass the bundle everywhere else
[14:06:52] maybe SSL_CERT_FILE?
[14:07:10] i've never used it, but this might also be possible: https://pypi.org/project/truststore/
[14:08:24] sure, I can try
[14:08:48] truststore sounds like a good solution, but does it work like it seems to?
[14:08:52] hopefully
[14:16:32] setting the env with SSL_CERT_FILE works
[14:17:39] but now realizing that testing in a notebook is not going to work... you can easily get opensearch-py in the driver, but it probably needs a real env to get it deployed to the executors
[14:18:07] dcausse: wmfdata has a ship_python_env option, although if i've ever !pip installed something usually i have to make a new env to get it to pack happily
[14:18:41] i forget what exactly, i think maybe pip installing a new version of something conda installed is what makes it complain
[14:18:53] I can try
[14:19:37] otherwise I remember that back in the day we were building the venv locally and ran it via spark-submit, but I completely forgot how to do that with discolytics
[14:21:28] dcausse: in theory, this is how i used to do it: https://wikitech.wikimedia.org/wiki/User:EBernhardson/pyspark_on_SWAP#Yarn_with_python_dependencies
[14:21:54] but indeed probably easier if ship_python_env works
[14:23:50] looking, thanks!
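The ca_certs / SSL_CERT_FILE workaround discussed above can be sketched as follows. This is a minimal sketch, not the actual code from the log: the `tls_kwargs` helper name and the example hostname are my own, though opensearch-py does accept a `ca_certs` keyword on the client.

```python
import os


def tls_kwargs(env=os.environ):
    """Build TLS options for an opensearch-py client, honoring the same
    CA-bundle environment variables the rest of the tooling uses.
    REQUESTS_CA_BUNDLE is checked first, then SSL_CERT_FILE."""
    kwargs = {"use_ssl": True, "verify_certs": True}
    bundle = env.get("REQUESTS_CA_BUNDLE") or env.get("SSL_CERT_FILE")
    if bundle:
        # opensearch-py ignores these env vars on its own, so pass the
        # bundle path explicitly via ca_certs
        kwargs["ca_certs"] = bundle
    return kwargs


# Usage (requires opensearch-py and a reachable cluster; host is illustrative):
#   from opensearchpy import OpenSearch
#   client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}], **tls_kwargs())
```

The truststore package mentioned at 14:07 takes a different approach (patching `ssl.SSLContext` to use the OS trust store), so it would remove the need for this helper entirely if it works as advertised.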
[14:28:34] meh, of course my env is 4g and takes ages to upload :)
[14:29:16] lol, yea that's a thing
[14:29:30] it also means every executor startup will be slow, not sure if yarn shares that dep between tasks on the same host
[14:33:21] inflatador: just remembered, if you have a chance can you apply the 64kB read-aheads on the eqiad and codfw semsearch clusters?
[14:40:58] slow... but seems to be working :)
[14:41:23] populating ptwiki in codfw for testing
[14:50:08] excellent
[14:52:07] surprisingly, have search-extra, hebrew, and stconvert compiling, passing tests, and loading into a 2.19.5 instance. Have to poke through the plugins we load in prod and see what i'm still missing
[14:52:16] also not 100% sure they actually work... just passing tests and loading.
[14:52:44] nice! tests are usually a good indication for the plugins
[14:53:58] spoke too soon: "[1:86] [aliases] failed to parse field [actions]", I guess that's what happens when you trust an llm for the opensearch alias update payload :)
[14:54:19] yea it likes to guess sometimes
[15:00:38] actually it was probably my fault :P
[15:12:49] sigh.. I think I missed the step about waiting for green after enabling replication...
[15:13:38] switching the alias and dropping the index while the new index is yellow is probably not good
[15:13:56] lol, yea probably best to wait
[15:50:47] shipping frwiki to codfw and will test a diff import after that
[15:51:46] ptwiki is 10g bigger than what I had in relforge... 40g instead of 30g for primaries
[15:54:35] school run
[16:18:23] ebernhardson: did you run the connector setup in codfw? getting "Pipeline qwen3-embedding is not defined"
[16:24:48] dcausse: oh, maybe not. sec
[16:24:55] np!
[16:25:50] dcausse: should all be created now
[16:25:54] thanks!
[16:26:30] sigh... I don't get the 10g size diff...
[16:26:52] certainly curious
[16:27:31] added two new boolean fields but I doubt that could account for 10g...
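For reference on the "[aliases] failed to parse field [actions]" error at 14:53: the `_aliases` endpoint expects a top-level `actions` list where each entry is a single add/remove operation. A sketch of a well-formed payload, also folding in the wait-for-green step regretted at 15:12 (the helper name and the index/alias names are illustrative, not from the log):

```python
def alias_swap_actions(alias, old_index, new_index):
    """Payload for POST /_aliases: a top-level "actions" list where each
    entry holds exactly one operation. A malformed "actions" field is what
    triggers "[aliases] failed to parse field [actions]"."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Usage with an opensearch-py client (names are illustrative). Block until the
# new index is green *before* swapping the alias and dropping the old index:
#   client.cluster.health(index="ptwiki_new", wait_for_status="green", timeout="30m")
#   client.indices.update_aliases(
#       body=alias_swap_actions("ptwiki", "ptwiki_old", "ptwiki_new"))
```

Both actions in one request make the swap atomic, so readers never see the alias missing or pointing at both indices.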
[16:27:38] i sure hope not :)
[16:46:41] twice as many segments in codfw as in relforge...
[16:47:28] used 3 workers to import to codfw vs 1 curl in relforge
[16:57:15] fr is 109g vs 94g (codfw vs eqiad)
[17:04:04] how odd
[17:06:46] very weird... down to 106g after a diff import which has deleted docs (should theoretically be bigger)... guessing it's due to how segments are optimized?
[17:07:42] and timed out calling _refresh, should have tuned the default timeouts
[17:11:56] I suppose i don't even know if the knn storage is linear in size or how that works
[17:12:39] no clue...
[17:31:10] pushing ptwiki to eqiad
[17:52:19] ok done, will push enwiki early next week
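On the `_refresh` timeout at 17:07: opensearch-py applies a short transport-level read timeout by default (around 10s), which a `_refresh` on a large freshly-imported index can easily exceed. It can be overridden per call with `request_timeout`; a minimal sketch (the helper name and index name are mine, not from the log):

```python
def refresh_with_timeout(client, index, seconds=300):
    """Call _refresh with an explicit per-request timeout, instead of the
    short transport default, which is easily exceeded right after a large
    bulk import."""
    return client.indices.refresh(index=index, request_timeout=seconds)


# Usage (opensearch-py; index name illustrative):
#   refresh_with_timeout(client, "ptwiki", seconds=600)
#
# _forcemerge takes request_timeout the same way, which may also help explain
# the segment-count difference: merging down normalizes segment layout across
# the 3-worker codfw import and the single-curl relforge import, e.g.
#   client.indices.forcemerge(index="ptwiki", max_num_segments=8,
#                             request_timeout=3600)
```
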