[07:57:06] inflatador: there are two spikes, one at 16:00 and one at 18:50; the graphite graph is collecting data from both eqiad & codfw but the prometheus one is reacting to the dashboard variable filters
[07:57:34] the two spikes are from two different DCs and you see them by switching DC in prometheus
[07:59:08] in the percentile dashboard we used thanos to get a single view of the metric, we could do the same here and remove the filters if it's annoying
[11:03:04] lunch+errand
[12:43:38] there is a stack trace originating from CirrusSearch that is new to 1.44.0-wmf.5, which is being deployed this week
[12:43:41] I have filed it as https://phabricator.wikimedia.org/T380862
[12:43:45] it is probably not a big deal
[14:01:30] dcausse: no emergency, but I have a counter-proposal on https://gerrit.wikimedia.org/r/c/wikimedia-event-utilities/+/1090885
[14:10:09] o/
[14:13:20] hashar: seems like an issue in the parser?
[14:13:54] * hashar declares himself incompetent
[14:14:19] if that is for the parser, I am happy to hand it off to content transformers and poke them :)
[14:17:50] hashar: yes I think we should ping them; looking at the stack it seems to be some inconsistency in the ParserOutput state, mTemplates vs mTemplateIds
[14:24:37] dcausse: can you comment on the task with some insights? I can't interpret them :)
[14:25:10] hashar: sure, will do :)
[14:25:23] thanks!
[15:11:16] \o
[15:12:19] .o/
[15:12:21] workout, back in ~40
[15:17:23] o/
[15:29:29] was wondering what bits to monitor to wait for the mw job to be started, but I think I'll just copy/paste https://gerrit.wikimedia.org/g/operations/puppet/+/221972775af87c7c118b2bd45b77812f4dc10911/modules/profile/files/kubernetes/deployment_server/mwscript_k8s.py#136
[15:30:40] interesting, i wonder what all can be done with that Watch.stream thing
[15:32:17] yes, I started using it to tail logs but I thought that it was keeping a kind of long-running http request, but maybe not
[15:32:48] it seems kind of like subscribing to an even stream for the namespace?
[15:32:50] event
[15:33:04] that'd be nice
[15:33:52] for kubectl exec it's definitely a long-running websocket, but it's using another stream function, not this Watch thing
[16:08:25] hmm, i keep waffling around on how to support the opensearch and elasticsearch plugin names. I ended up with a static class that contains the set of aliases, and we have to invoke a method on that instead of in_array( ... ), but it feels awkward..
[16:09:35] :/ iirc it's not used in many places so even if not nice it's acceptable?
[16:10:41] yea, it's just a few maintenance scripts and places
[17:55:33] ran some counts to get a better estimate of the number of autocomplete opens per day. Last week was 2.8-4.1M page loads per day submitting at least one autocomplete event. My guess was way higher :S wonder if something is off
[18:32:43] dinner
[18:56:11] lunch, back in ~40
[19:37:49] doh.. just realized i released the plugin as opensearch-analisys-stconvert
[19:42:49] sorry, been back awhile
[19:49:28] as a weighted tag owner, when I want to say "i have no idea if page XYZ has a recommendation.link weighted tag, but if it has, remove it", what is the most appropriate action? is it OK to produce a `mediawiki.cirrussearch.page_weighted_tags_change.rc0` event to remove the tag even if i don't know whether it exists (just that it shouldn't be there)?
[19:49:45] or maybe i'm supposed to check somehow if the weighted tag is there or not, and only produce an event if it is there (and an update needs to happen)?
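A hedged sketch of the kind of "clear the tag whether or not it exists" event being asked about above. The stream name is taken from the log; the `$schema` version, field names, and placeholder page values are illustrative assumptions, not the actual page_weighted_tags_change schema, so check the real schema before relying on this.

```python
# Sketch only: field layout is assumed, not copied from the real schema.
import json

event = {
    "$schema": "/mediawiki/cirrussearch/page_weighted_tags_change/1.0.0",  # assumed version
    "meta": {
        "stream": "mediawiki.cirrussearch.page_weighted_tags_change.rc0",
        "domain": "cs.wikipedia.org",  # placeholder wiki
    },
    "page": {"page_id": 12345, "namespace_id": 0},  # placeholder page
    # Clearing is a no-op on the search index if the tag isn't set, which is
    # what makes "remove it even if I don't know whether it exists" safe.
    "weighted_tags": {"clear": ["recommendation.link"]},
}

print(json.dumps(event, indent=2))
```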
[19:52:50] urbanecm: as long as the volume isn't crazy, it seems reasonable to issue an event to clear the tags even if it ends up being a no-op
[19:54:45] ebernhardson: could you define "crazy"? Context: every edit should invalidate any existing `recommendation.link` weighted tag. Would it be reasonable to issue an event for every single page edited, in case the page happens to have the tag on it?
[19:57:37] (FWIW, this is what Growth is currently doing. I discovered some bugs in that logic, and I'm wondering whether I should fix them, or whether there is a more preferred way to achieve the same end result)
[19:59:48] urbanecm: a few tens of sec should be reasonable, in theory that should also get merged into other edit-related updates and have very minimal cost
[19:59:55] urbanecm: a few tens per sec even
[20:00:36] oh, okay. good to know. in that case, i'll go fix the resetting logic. thanks for the advice!
[20:05:10] in unrelated news, cindy running opensearch just voted V+1: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CirrusSearch/+/1093978
[20:08:18] kinda would like it to run the suite once for each backend. Could spin up a second instance, but voting is awkward
[20:41:29] curiously, the cloudelastic fix-rate fixed itself: https://grafana.wikimedia.org/d/2DIjJ6_nk/cirrussearch-saneitizer-historical-fix-rate?orgId=1
[21:26:13] apparently my memory is failing. The thing i was thinking about that would cause the saneitizer problems w/ cloudelastic, i fixed almost a year ago :P https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CirrusSearch/+/980955
[21:26:21] so this is some other problem
[21:53:54] {◕ ◡ ◕}
[21:54:14] Re: Cindy, that's awesome news
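Going back to the mwscript_k8s / Watch.stream exchange earlier in the log (15:29-15:33): a minimal sketch, assuming the standard Python `kubernetes` client, of waiting for a Job to start by watching job events. The namespace and job name are placeholders, and this is not a copy of the mwscript_k8s.py code linked above.

```python
# Minimal sketch: wait for a Kubernetes Job to report activity using the
# Python kubernetes client's watch API (the Watch.stream mentioned in the log).
from kubernetes import client, config, watch

config.load_kube_config()
batch = client.BatchV1Api()
w = watch.Watch()

# stream() re-issues list_namespaced_job as a watch request and yields
# ADDED/MODIFIED/DELETED events for matching objects until w.stop().
for event in w.stream(
    batch.list_namespaced_job,
    namespace="mw-script",                     # placeholder namespace
    field_selector="metadata.name=my-mw-job",  # placeholder job name
    timeout_seconds=300,
):
    job = event["object"]
    if job.status.active or job.status.succeeded or job.status.failed:
        print(f"job started ({event['type']}): active={job.status.active}")
        w.stop()
```

This matches the "subscribing to an event stream for the namespace" intuition from the log: the client yields each change notification as it arrives rather than requiring repeated polling.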