[09:05:49] we're discussing a place to put notebooks, for now it's mainly the ones related to the wdqs graph split. Tempted to create a very generic "notebooks" project under gitlab:repos/search-platform, any objections to this?
[09:08:58] No objections as a starting point. If it becomes too messy, we can always reorganize.
[09:39:43] sonar started to complain about the SUP Java code: https://sonarcloud.io/project/issues?resolved=false&severities=BLOCKER%2CCRITICAL%2CMAJOR%2CMINOR&sinceLeakPeriod=true&types=BUG&id=wmftest_cirrus-streaming-updater&open=AY0YOIi340DOfg35nq6V
[09:39:57] seems like a false positive but could be me missing something
[11:42:08] lunch
[13:59:44] FYI this graphite-based alert has been UNKNOWN for the last 22 days https://alerts.wikimedia.org/?q=%40state%3Dactive&q=%40cluster%3Dwikimedia.org&q=alertname%3DNumber%20of%20requests%20triggering%20circuit%20breakers%20due%20to%20excessive%20memory%20usage
[14:00:12] looks like the underlying metric either disappeared or was renamed
[14:01:23] godog thanks for the heads-up, will make a task to investigate
[14:01:46] inflatador: thank you! appreciate it
[14:04:45] np, T355795 is up. Feel free to add/edit if I missed anything
[14:04:45] T355795: Fix "requests triggering circuit breakers" Elastic alert - https://phabricator.wikimedia.org/T355795
[14:05:25] looks good!
[14:15:16] inflatador: should we keep our 1:1 today?
[14:15:42] gehel I don't think we need to
[14:16:28] dcausse: looks like a false positive to me as well. Seems that the synthetic execution is getting thrown off by the exceptions
[14:18:29] I would probably extract the retry logic to its own class, which would make things clearer, both for humans and machines...
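(A minimal sketch of the kind of extraction suggested above: a small, self-contained retry helper, so the retry control flow is isolated and easier for both humans and static analysis to follow. The class and all names here are hypothetical, not taken from the cirrus-streaming-updater code base.)

    import java.util.concurrent.Callable;

    /** Runs a callable, retrying on failure up to a fixed number of attempts. */
    public final class BoundedRetry {
        private final int maxAttempts;

        public BoundedRetry(int maxAttempts) {
            if (maxAttempts < 1) {
                throw new IllegalArgumentException("maxAttempts must be >= 1");
            }
            this.maxAttempts = maxAttempts;
        }

        /** Returns the callable's result, rethrowing the last exception if all attempts fail. */
        public <T> T call(Callable<T> callable) throws Exception {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return callable.call();
                } catch (Exception e) {
                    last = e; // remember the failure and try again
                }
            }
            throw last; // every attempt failed; surface the last exception
        }
    }

(Usage would look something like new BoundedRetry(3).call(() -> client.fetch(uri)), where client.fetch stands in for whatever call is being retried; the caller no longer carries its own retry loop, so an analyzer sees a single bounded control flow.)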
[14:21:15] inflatador: ok, let's cancel for today
[14:22:18] o/
[14:48:12] dcausse: would you be up for presenting the status of the WDQS Graph Split at our next DPE staff meeting? (January 29th)
[14:48:33] gehel: sure
[14:48:34] just a slide and a few minutes, not a large presentation
[16:03:21] OTW, just got into the cowork space building
[16:57:05] workout, back in ~40
[17:50:04] back
[18:05:07] meant to ask, d.causse: could group perms be set on T352538_wdqs_graph_split_eval in hdfs? i'm looking at regexing things for iguana to pre-filter types of queries that aren't useful for analysis purposes (they would skew how it thinks about performance), while also taking into consideration queries that already succeed in both the consolidated graph and the main-side-only graph with the same result count.
[18:07:13] * dr0ptp4kt (btw, am looking at the notebook to see if there are some already-accessible paths that may already have the stuff, plus to avoid dead ends)
[18:41:03] lunch, back in ~40
[18:52:24] dr0ptp4kt: I've run hdfs dfs -chgrp -R analytics-search-users hdfs:///user/dcausse/T352538_wdqs_graph_split_eval, hopefully that's enough?
[18:52:27] dinner
[18:53:45] looks good, thanks!
[19:11:02] back
[19:14:57] hmm, a deploy on SUP will change the consumer envoy from a 1 cpu to a 200m cpu request. Assuming that was done with --set '...' but not sure which values. Thought it would be `mesh.resources.requests.cpu=1` but that doesn't seem to do anything
[19:25:40] weird. I was thinking we were the only ones adding more resources to envoy, but that's not true, is it?
[19:27:41] i think it's trying to change back to the defaults from a custom adjustment. I'm going to put together a patch to align the git repo with what's currently deployed
[19:35:36] ahh, 2 comments above was partially PEBKAC. The reason --set didn't seem to work is that i was diffing both producer and consumer, but only one of them changes, so the diff was actually changing and i just didn't notice what had changed
[19:35:49] so it was working, but the diff was always there because something was changing
[20:42:52] hmm, we got another WDQS alert about the streaming updater using too much object storage space
[21:14:20] that one should definitely create a task instead
[21:46:27] seems sensible
[22:39:27] hmm, cloudelastic red
[22:51:15] ebernhardson Y we merged the master change, having issues with order of operations
[22:51:39] we're in https://meet.google.com/fde-tbpf-wqh if you wanna join