[06:50:03] o/
[06:57:50] o/
[10:04:24] lunch
[11:27:45] * cormacparle waves
[11:28:22] back from hols ... I see https://phabricator.wikimedia.org/T401590 is on your current sprint board. Is there anything I need to do?
[11:40:33] cormacparle: o/ no, we'll get to it soon I think
[13:15:44] \o
[13:33:22] o/
[14:09:24] network session re-enabled everywhere
[14:10:09] .o/
[14:10:41] o/
[14:18:38] sigh, but I see the warn message again in the logs: "the session store entry is for an anonymous user, but the session metadata indicates a non-anonymous user"
[14:22:36] :S
[14:23:13] that's also slightly curious, as I thought we didn't have anonymous users anymore (maybe we are still calling them anon?)
[14:24:39] yes... I'm not sure I understand... it's coming from a set of wikis; enwiki does not complain, for instance
[14:25:47] could this be stale sessions somehow?
[14:26:22] hmm, I guess plausible? I'm not clear enough on how all that works to know
[14:34:25] I should probably be able to answer this myself, but is there any reason why we have 6 cloudelastic hosts as opposed to 5? Maybe disk space?
[15:07:44] inflatador: I would have to check the exact history, but I think we started with 3 or 4, and even with only 1 replica the machines seemed to be struggling (I forget which metrics), so we bought a couple more machines
[15:09:59] hmm, https://phabricator.wikimedia.org/T233720 is where we ordered the 2 additional servers
[15:10:26] it mentions "we underestimated the resource requirement" but nothing exact, and no ticket links
[15:19:20] * ebernhardson should really get in the habit of using logging instead of print in python... someday
[15:28:06] ebernhardson: why so? does print lack timestamps?
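[Editor's note: the 15:19–15:28 exchange above is about logging vs. print; the answer given later in the log is timestamps. A minimal sketch of getting timestamped output from Python's standard `logging` module — the logger name and message are illustrative, not from the log:]

```python
import logging

# print() emits only the message; logging can prepend a timestamp and
# level via the format string. %(asctime)s is the record's timestamp.
logging.basicConfig(
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    level=logging.INFO,
)
log = logging.getLogger("reindex")  # hypothetical logger name

log.info("starting reindex")
# Output looks like:
# 2025-01-01 12:00:00,000 INFO reindex: starting reindex
```

[Note that `basicConfig`'s default format has no timestamp, so `%(asctime)s` must be requested explicitly.]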
[15:28:58] dcausse: I forgot about a hard-coded file reference in the blubber spec that still pointed to flink 1.17.1: https://gitlab.wikimedia.org/repos/search-platform/cirrus-streaming-updater/-/merge_requests/188
[15:29:08] pfischer: looking
[15:34:15] pfischer: yeah, I was just curious to have timestamps in the output. but it's not a big deal
[16:18:13] hmm, weird... `node_load1{cluster="$cluster"}` and similar prom queries don't seem to work with the Elastic clusters... no problem with wdqs though
[16:23:06] Don't forget stats homework... chapter 2 tomorrow
[16:23:12] Oh yeah
[16:48:27] ebernhardson: thanks for the context. I was wondering if we could get away with borrowing a cloudelastic host to test stat host upgrades, but relforge or even a VM would probably work as well
[17:00:42] wonder why we give dewiki 3 replicas for _content; even enwiki uses two
[17:02:36] finding history in the repo gets annoying, because we kept moving things between files :P
[17:09:49] it looks like at some point enwiki and dewiki had 3 replicas; in T318270#8251974 we reduced it to 2 replicas but didn't change dewiki
[17:09:50] T318270: Avoid overloading individual Elastic nodes with popular shards - https://phabricator.wikimedia.org/T318270
[17:09:59] going to be bold and change dewiki
[17:11:41] looks like it was set to 3 by Chad in 2014 and we've just kept it
[17:17:52] :)
[17:18:14] dinner
[19:49:55] ryankemper: I was thinking we could work on T402926 today at pairing, but if you wanna work on the Elasticsearch cookbook or anything else I'm game
[19:49:56] T402926: OpenSearch on K8s: implement vm.max_map_count sysctls and any other "important settings" - https://phabricator.wikimedia.org/T402926
[20:46:28] inflatador: yeah, let's do that for the first half and switch to spicerack for the second
[20:59:55] can't make pairing today
[21:00:12] ◉_◉
[21:00:52] usually Tuesdays are fine, but I have to do a school run today. school's out at 2:15
[21:02:04] all good, I'm having to take my son this week too. For some reason, the bus service doesn't start 'til after Labor Day