[10:14:51] lunch
[13:00:42] \o
[13:00:45] o/
[13:02:21] still seeing a couple of writes failing with "Map failed: MemorySegmentIndexInput this may be caused by lack of enough unfragmented virtual address space or too restrictive virtual memory limits enforced by the operating system, preventing us to map a chunk of 860 bytes"
[13:02:31] :S
[13:04:36] 295992 maps in /proc/2347008/maps (cloudelastic1008)
[13:09:56] ah might be cloudelastic1012, still has "vm.max_map_count = 262144", it was re-imaged but I hacked an annoying override from the opensearch package to overrides system 1M defaults for this sysctl entry
[13:12:04] oh, ok that would make sense. I was poking around but not yet finding anything interesting
[13:12:37] the error message seems odd because virtual address space is like 2^48. Maybe there are restrictions in how linux breaks that into different use cases, but still. 2^48 is huge.
[13:13:16] I think it might fail on the number of maps, here it's getting close to the 262144 limit on cloudelastic1012
[13:13:58] yea that's probably it
[13:15:18] Sorry, I meant to bring that one up upstream and forgot. Will make a ticket for it
[13:15:31] updated manually but it'll strike us again, Guillaume had something in puppet
[13:15:34] inflatador: thanks!
[13:15:35] sounds like we also need a Puppet patch to set that
[13:16:04] yeah, let me double-check. I think he merged the patch, but we might need to set a hiera value somewhere or something
[13:16:16] one is still required: https://gerrit.wikimedia.org/r/c/operations/puppet/+/1282320
[13:16:48] ah, and arrow alignment failure. Finally something within my skill level ;)
[13:18:25] wondering if we should monitor the number of maps, seeing a bunch of maps on deleted files... afraid of a possible leak :(
[13:19:28] Yeah, that's a good idea. I wonder if we have anything in node exporter or our elastic exporters that keeps track of that
[13:19:40] https://github.com/apache/lucene/issues/15068
[13:20:02] inflatador: I don't think opensearch does know that, might have to be at the os level
[13:21:02] we have prometheus-varnishd_mmap_count in puppet that seems to measure that
[13:21:28] dcausse interesting. Is there any downside to setting it to a stupidly high level, like 500 billion or something?
[13:22:22] I have no clue, I guess it's stored in mem in the kernel space so not something that can grow indefinitely
[13:23:34] hmm, not really sure. it feels like something where the kernel thought most things use very little so set a low limit and catch runaways? But certain users like lucene found a way to use a bunch
[13:23:48] but i suppose i've never looked into it really
[13:24:25] googling around, apparently the steam deck (handheld linux gaming machine) sets it to 2^32-1
[13:24:40] wow
[14:43:46] created T425681 to work on the mmap stuff. I'm pretty sure we'll need to fix it in our K8s installs too
[14:43:46] T425681: Ensure`vm.max_map_count` count is set to an adequately high number in our OpenSearch environments - https://phabricator.wikimedia.org/T425681
[14:44:14] thanks!
[15:05:47] silenced the fetch error alerts from cloudelastic for 7 days (it's still backfilling)
[15:06:02] thanks
[18:35:45] may have found a way through the regex problem, but not 100% sure. This is applying "A Simple, Fast Dominance Algorithm" to find the dominant path through a graph. In theory this is a formal "all-paths proof": https://phabricator.wikimedia.org/P92421
[18:36:16] paste is graphviz for the result with the DFA in gray and the dominant path in red
[18:37:39] apparently this is what compilers do (i don't know why :P)
[18:40:08] ebernhardson: sounds good! are you around for our 1:1?
[18:40:13] pfischer: doh! yes
[20:11:07] ebernhardson: Very cool, and it looks much more reasonable. I added some comments in phab. Did you just code it up yourself, or is there an implementation we can play with? (I love playing around with graphviz!)
[20:13:05] Trey314159: there are public implementations, but this is one i worked up (with the help of claude). The most prominent public implementations though are inside compilers, apparently they use this over flow-control graphs (i don't know why exactly, something to do with optimization passes)
[20:13:22] but the same algo is in java, go, and probably more
[20:13:34] (but i can't access the java impl directly, it's buried in the compiler :P)
[20:15:55] the part i'm trying to work out now is alternation, i pasted another graph into the comments for `(abc|def)xyz`
[20:16:02] can easily pull other graphs
[20:17:20] i'm not 100% sure how to know i need to include the `c` and `f` since it's not in the dominance path, but getting there
[20:18:00] i'm hoping the single exit arc is enough to detect
[20:25:24] Very odd about `c` and `f` and the end-run from 0 to 5. Any chance Claude made a mistake there?
[20:27:14] Trey314159: from my reading i'm pretty sure that's correct, since state 5 has multiple incoming paths neither can be the dominant path
[20:27:37] Interesting.. I will have to look at the paper more carefully
[20:37:58] * ebernhardson just hit alt-f4 instead of alt-4 ...
[22:07:27] getting somewhere...generating appropriate expressions for `(abc|def)xyz` and `Clover.*West`. But now i need to write proper tests and not simply two one-off expressions
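For anyone following along without the paste, here is a minimal sketch of the dominance idea discussed above. It uses the iterative immediate-dominator scheme from "A Simple, Fast Dominance Algorithm" and is not the implementation from P92421. The state numbering for `(abc|def)xyz` is an assumption (0 is the start, 1-2 and 3-4 are the two branches, 5 is where they rejoin, 8 is accepting), and the function names are made up for illustration.

```python
def reverse_postorder(succ, entry):
    """Return the nodes reachable from entry in reverse postorder."""
    seen, order = set(), []

    def visit(node):
        seen.add(node)
        for nxt in succ.get(node, ()):
            if nxt not in seen:
                visit(nxt)
        order.append(node)

    visit(entry)
    order.reverse()
    return order


def immediate_dominators(succ, entry):
    """Iteratively compute the immediate dominator of every reachable node."""
    rpo = reverse_postorder(succ, entry)
    index = {node: i for i, node in enumerate(rpo)}
    preds = {node: [] for node in rpo}
    for node in rpo:
        for nxt in succ.get(node, ()):
            preds[nxt].append(node)

    idom = {entry: entry}

    def intersect(a, b):
        # Walk both nodes up the dominator tree until they meet.
        while a != b:
            while index[a] > index[b]:
                a = idom[a]
            while index[b] > index[a]:
                b = idom[b]
        return a

    changed = True
    while changed:
        changed = False
        for node in rpo[1:]:
            # Only predecessors that already have a dominator can contribute.
            done = [p for p in preds[node] if p in idom]
            new = done[0]
            for p in done[1:]:
                new = intersect(new, p)
            if idom.get(node) != new:
                idom[node] = new
                changed = True
    return idom


def dominator_chain(idom, entry, node):
    """States that every path from entry to node must pass through."""
    chain = [node]
    while node != entry:
        node = idom[node]
        chain.append(node)
    chain.reverse()
    return chain


# Hypothetical DFA shape for (abc|def)xyz; the numbering is assumed.
succ = {0: [1, 3], 1: [2], 2: [5], 3: [4], 4: [5],
        5: [6], 6: [7], 7: [8], 8: []}
idom = immediate_dominators(succ, 0)
print(dominator_chain(idom, 0, 8))  # [0, 5, 6, 7, 8]
```

With this numbering the chain jumps straight from 0 to 5: state 5 has two incoming arcs, so neither branch dominates it, which matches the "end-run from 0 to 5" above. How to recover `c` and `f` from the arcs into 5 is left open here, as it is in the discussion.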