[14:02:39] \o
[15:03:10] streaming updater is stuck, minimal output. I think it's due to some maintenance being done in prod, so waiting
[15:56:51] (i think stuck means messages not coming in)
[15:58:18] ebernhardson check mediawiki_security if you're in there
[15:59:18] ya, i saw
[16:02:53] 👍
[17:00:41] lunch, back in ~40
[17:27:19] inflatador: when you get back, can we apply the 64kb readahead to opensearch-semantic-search again? Hopefully the last time around, just trying to collect some data on even lower memory:index ratios
[17:29:25] streaming updater came back with the removal of read-only, no action on our part
[17:48:36] * ebernhardson is looking into the deepcat ticket about unexpected results... kinda wish we had a nice way of exposing the raw categories query data.
[18:16:09] i'm almost certain the answer is it's about category depth limits... but not sure how to present that to the user
[18:21:48] back
[18:24:56] Github showed me a "semantic search (beta)" thingie today
[18:27:13] ebernhardson readahead applied
[18:27:18] inflatador: thanks!
[18:28:27] np, let me know how it goes
[18:54:18] i think the deepcat problem was not actually a problem, more a misunderstanding by the user about which categories they were excluding. I've asked claude to write a self-contained .html file that would fetch two deepcat &cirrusDumpQuery dumps and show how they differ. No clue how terrible it will be, but curious what comes out
[18:54:48] no way i would have time to put that together myself, but maybe it comes up with something reasonable
[18:57:55] Sounds like a good use case. I am curious too. How much context (in terms of folders/sources/docs) did you give claude?
[19:00:21] i gave it the json output of one &cirrusDumpQuery.
It did come up with something that shows the diff, the include/exclude for each query, and the intersection, but unfortunately it doesn't really help clarify the problem the user had
[19:13:05] :q
[20:03:37] maybe a bit hyper-specialized to this particular report, but interesting:
[20:03:38] https://people.wikimedia.org/~ebernhardson/deepcat-compare.html?a=-deepcategory%3A%22Maps+of+the+world+by+language%22+deepcategory%3A%222020s+maps+of+the+world%22+-deepcategory%3A%22Wikimania+Map+of+the+world%22&b=-deepcategory%3A%22Maps+of+the+world+by+language%22+deepcategory%3A%222020s+maps+of+the+world%22+-deepcategory%3A%22Wikimania+Map+of+the+world%22+-deepcategory%3A%22SVG+maps+of+
[20:03:40] the+world+by+language%22+-deepcategory%3A%22English-language+SVG+maps%22+-deepcategory%3A%22Our+World+in+Data+food+and+agriculture+maps+of+the+world%22
[20:03:46] link too long :(
[20:07:40] example queries here: https://phabricator.wikimedia.org/T415299#11679002
[20:07:45] !issync
[20:07:45] Syncing #wikimedia-search (requested by JJMC89)
[20:07:46] No updates for #wikimedia-search
[20:09:27] * ebernhardson is actually mildly impressed with that result
[20:44:03] hmm, opensearch-semantic-search is getting EOFs from inference.discovery.wmnet
[20:45:35] Received error from remote service with status code 502, response headers: {content-length=[53], content-type=[text/plain; charset=utf-8], date=[Thu, 05 Mar 2026 20:37:12 GMT], server=[istio-envoy], x-content-type-options=[nosniff], x-envoy-upstream-service-time=[1]}
[20:45:48] opensearch-semantic-search-masters-0] Remote service returned error: Error from remote service: dial tcp 127.0.0.1:8080: connect: connection refused with status: BAD_GATEWAY
[21:40:28] pfischer: semantic search should be deployed, but due to ^^ i can't verify if it's working. The train did roll forward, so in theory if kserve starts responding it should work. Probably.
Filing a ticket for the kserve problem
[22:07:07] filed https://phabricator.wikimedia.org/T419174
[22:55:34] sigh... checking things out, there is a part that didn't occur to me: interwiki search on frwiki means semantic won't work (yet) on Special:Search; it has an issue loading the profile cross-wiki. The API will still be fine though
[22:58:20] sigh, actually it won't. I forgot that the opensearch cluster requires auth to allow ml predictions ...
[23:01:57] * ebernhardson puzzles around in the opensearch security settings...
[23:08:55] ok, created an ml_predict_role in _plugins/_security/api/roles/ml_predict_role, and then a rolemapping at _plugins/_security/api/rolesmapping/ml_predict_role, with the end result of assigning cluster:admin/opensearch/ml/predict to opendistro_security_anonymous_backendrole
[23:11:47] running out of time today, but i'll figure out how to get that into the cirrus-toolbox opensearch_config.py tomorrow
[23:16:47] alternatively, if we care, we should be able to create a proper user, but i guess just out of momentum from how things already work... i'm fine with cirrus not having auth with the cluster.
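(editor's sketch) The role/rolemapping pair described at 23:08:55 maps onto two PUTs against the OpenSearch security plugin's REST API. The endpoint paths, the `cluster:admin/opensearch/ml/predict` permission, and the `opendistro_security_anonymous_backendrole` backend role are taken from the log; the cluster URL and admin credentials are placeholders, and the exact request bodies are an assumption based on the security API's documented shape, not a record of what was actually run.

```shell
# Hypothetical cluster endpoint and admin credentials -- replace with real values.
OS_URL="https://localhost:9200"
AUTH="admin:admin"

# 1. Create a role granting only the ml predict cluster permission.
curl -sk -u "$AUTH" -X PUT \
  "$OS_URL/_plugins/_security/api/roles/ml_predict_role" \
  -H 'Content-Type: application/json' \
  -d '{"cluster_permissions": ["cluster:admin/opensearch/ml/predict"]}'

# 2. Map that role to the anonymous backend role, so unauthenticated
#    requests are allowed to call the ml predict API.
curl -sk -u "$AUTH" -X PUT \
  "$OS_URL/_plugins/_security/api/rolesmapping/ml_predict_role" \
  -H 'Content-Type: application/json' \
  -d '{"backend_roles": ["opendistro_security_anonymous_backendrole"]}'
```

Porting this into cirrus-toolbox's opensearch_config.py would presumably mean issuing the same two PUTs from Python instead of curl.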