[10:33:36] pfischer, Joal: connection issues, trying to connect back
[10:49:56] errand + lunch
[11:44:09] lunch
[13:59:43] Just merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/974281 so we might get some alerts. Not exactly sure what 'severity:task' means...maybe it will automatically create a phab task? Will monitor
[14:32:53] o/
[14:44:30] lexemes reload finished on wdqs1022 ... I updated https://phabricator.wikimedia.org/T347504 but LMK if there's anything I can do
[14:46:35] inflatador: thanks!
[14:49:12] np
[15:33:31] inflatador: it should make a task, i configured that here: https://gerrit.wikimedia.org/r/c/operations/puppet/+/769131/4/modules/alertmanager/templates/alertmanager.yml.erb
[15:33:40] i don't remember how or why that particular webhook string works :P
[15:46:05] @team: Have I missed something important in our team update: https://docs.google.com/presentation/d/1Hi3hZCD7BLPRqPPdHvdxShO7iLTIykROf5AJLFnEBkU/edit#slide=id.g29cc179a43e_0_300
[15:47:19] * ebernhardson is shocked to see all the flink bits running with a 4d+ age
[16:12:47] ebernhardson: that’s great news! A post-thanks-giving gift. ;-)
[16:15:14] ;)
[17:15:27] https://gerrit.wikimedia.org/r/c/operations/puppet/+/977704 is up for the LDF endpoint monitoring...previous patch failed due to duplicate resources
[18:14:08] lunch, back in ~40
[18:24:05] Anything i shouldn't forget to include in the elasticsearch deep dive tomorrow? Putting together an outline
[18:30:40] randomly curious (and probably not particularly useful :P) the developer that made most of the early lsearchd contributions now turns up as an author (out of 20 or 30+) of the llama 2 paper from meta
[19:20:55] back
[19:29:17] dinner
[19:34:32] ebernhardson thanks for the Puppet link. I just merged above patch for LDF monitoring...keeping an eye on the discovery-search board to see if we get any alert tasks
[19:45:27] getting some lunch
[20:39:25] back
[21:01:02] spent some time on friday exploring the elastic connector to allow plugging in a "response handler" (https://github.com/nomoa/flink-connector-elasticsearch/tree/bulk-item-tesponse-handler)
[21:02:17] not really tested nor something I'm particularly happy with... tracking the number of individual retries seems particularly tedious and found no way to plug a side-output at this stage :/
[21:03:58] dcausse: that's still pretty cool! I pondered it briefly but wasn't sure how to go about it.
[21:04:22] and yea the way that pending actions is a simple counter seems awkward
[21:04:28] nice
[21:09:36] I think it'd be more correct to have a side-output for individual retries (retrying in the sink might treat events out-of-order, possibly erasing newer data I think?)
[21:10:35] well we still have the super_detect_noop version check so perhaps acceptable...
[21:20:41] Another PR for LDF monitoring if anyone has time to look https://gerrit.wikimedia.org/r/c/operations/puppet/+/977787
[22:32:45] * ebernhardson wonders if the Ltr query still needs a +10000 on its score. I think we put that there because the model could emit negative numbers and we wanted to make sure -5 was still ranked above the other things
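(note on the +10000 question above) A minimal sketch of why a constant offset matters when only a top-N window gets rescored by a model that can emit negative numbers. This is not the actual CirrusSearch or LTR plugin code: the `rank` function, the score dictionaries, and the `window` parameter are all hypothetical, and first-pass scores are assumed to be non-negative and well below the offset. Only the 10000 constant comes from the chat.

```python
OFFSET = 10_000  # constant mentioned in the chat; everything else is illustrative


def rank(base_scores, model_scores, window, offset=OFFSET):
    """Order doc ids by final score (hypothetical helper).

    base_scores:  doc id -> first-pass score (assumed non-negative)
    model_scores: doc id -> model score, may be negative
    window:       how many top first-pass docs get rescored by the model
    """
    first_pass = sorted(base_scores, key=base_scores.get, reverse=True)
    rescored = set(first_pass[:window])
    final = {
        # Rescored docs get model score + offset; the rest keep their base score.
        doc: (model_scores[doc] + offset) if doc in rescored else base_scores[doc]
        for doc in first_pass
    }
    return sorted(final, key=final.get, reverse=True)


base = {"a": 9.0, "b": 8.0, "c": 3.0}
model = {"a": -5.0, "b": 1.5}  # "c" falls outside the rescore window

with_offset = rank(base, model, window=2)
without_offset = rank(base, model, window=2, offset=0)
```

With the offset, the doc scored -5 by the model still lands above the doc that was never rescored; with `offset=0` it drops below it. Whether the offset is still needed depends on how the rescore combines scores today, which this sketch does not try to model.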