[07:53:04] dcausse: i wont be around this morning for my meeting
[07:53:19] ejoseph: no worries
[09:56:56] lunch
[10:00:57] Lunch 2
[13:06:34] gehel: shari would like me to be at the SD product deep dive meeting today so I won't make planning. Can you handle it today?
[13:06:51] mpham: sure!
[13:06:56] thanks!
[13:07:18] I'll remind everyone during triage, but read my email about t-shirts and add your size in https://docs.google.com/spreadsheets/d/124rtSForgT6u-7CAzvwp8lzhmPc8rjnjcXeBNl-xHrY/edit#gid=0
[13:14:58] greetings
[13:19:21] Hello everyone i just got back home
[13:21:17] welcome back! Were you at the farm?
[13:28:28] Nope
[13:28:47] I went to get my second covid vaccine dose
[13:31:16] Oh nice! How are you feeling?
[13:40:15] o/
[13:41:51] pain around my arms mostly
[13:42:05] That's good! I had a few days of aches/fever after my second dose
[14:12:02] inflatador: we're in the k8s chat if you want to join: https://meet.google.com/wjt-srdx-cgq
[15:01:20] ryankemper, inflatador, Trey314159: triage: https://meet.google.com/eki-rafx-cxi
[15:01:50] on my way
[15:20:42] sorry for being late, something weird happened with my notifications
[15:39:21] inflatador: hello
[15:42:01] mpham: we have a ticket for the Search token on mobile: T308288. I'm not sure how to tag it so that the web team sees it.
[15:42:01] T308288: Implement search token clickthrough tracking on autocomplete for mobile web - https://phabricator.wikimedia.org/T308288
[15:42:44] gehel: i'll get it to Olga
[15:42:49] thanks!
[15:43:00] inflatador
[15:43:40] oops: https://neilmadden.blog/2022/04/19/psychic-signatures-in-java/
[15:44:17] for once it's good to not be up to date!
[15:44:55] (╯°□°)╯︵ ┻━┻
[15:45:06] Mention me again gehel ? I think it's fixed
[15:45:14] inflatador: o/
[15:45:19] OK fixed
[15:45:28] what was it?
[15:45:55] When I turned on highlights for other words, my client stopped highlighting name mentions
[15:46:09] Stupid behavior, but I manually added my handle to highlights
[16:09:59] ejoseph: refactoring session?
[16:15:01] * ebernhardson wonders how hard it will be to get all the appropriate source jars in place to use `jdb` on wdqs1009
[16:32:38] I'd perhaps try to expose the debugging port and use it through a ssh tunnel, so that you keep the source code on your local machine
[16:33:59] ahh, yea port forwarding might work better. I'm trying to figure out why after excluding all the extra logback implementations (we seem to put the implementation in every jar?) now i get 0 logs :)
[16:34:14] i can tell with jdb and no sources that it's still loading logback, but not sure why logback silences everything
[16:35:40] :/
[16:36:12] classloading related issues are always a pain
[16:56:55] I'm sorry guys, I slept off during the meeting
[16:57:00] I just woke up
[16:57:19] ejoseph: no worries, happens to the best of us
[16:57:54] ebernhardson: I'll reschedule if it's fine
[16:57:59] ejoseph: sure
[17:11:35] curious. Stepping through the slf4j initialization i get to ` throw new UnsupportedOperationException("This code should never make it into slf4j-api.jar");`
[17:12:12] maybe jdb is showing something wrong though
[17:39:45] * ebernhardson is not very smart. The answer is root logger in /etc/wdqs/logback-wdqs-blackgraph.xml is at WARN instead of INFO
[17:39:56] wow, s/blackgraph/blazegraph/
[17:49:59] lunch, back in ~45m
[17:52:33] hm... only a couple tests failing before my patch, couple dozen failing after... not there yet, hopefully it's something small
[17:52:35] dinner
[18:28:52] gehel: running 5’ late
[18:31:42] ryankemper: ack
[18:36:20] initial AB test results for wbsearchentities, slightly worse everywhere :S
[18:38:40] back
[19:05:14] FYI, shutting down elastic06 and 07 in deployment-prep, will decom Friday if no screams
[19:05:26] (elastic05 has been shut down since last wk)
[20:24:44] quick break, back in ~15-20
[20:48:04] back
[21:29:10] Trey314159 or ebernhardson , any objections with reimaging relforge?
[21:29:27] none from me
[21:52:20] ebernhardson probably should have asked you beforehand, but we got some puppet alerts on wdqs1009 and ran puppet agent. Apologies if that wiped out something you were working on
[21:57:20] inflatador: oh, i wonder if those alerts are because i just ran `enable-puppet` after having it disabled a bit, i guess i should run puppet manually at the same time?
[21:58:14] yeah, I think you're right. No worries though
[22:20:49] While running this query: https://w.wiki/5At2 on my own copy of Wikidata in Blazegraph, I came across some kind of "out of memory" error: https://gist.github.com/harej/43bdf1478ff95e105dc697f755dee62a
[22:21:42] Which was not expected because (1) I defined the stack size as 128 GB and (2) I have plenty of unused RAM
[22:28:08] hare: that mentions a blazegraph specific MemoryManager, instead of the overall java heap (is that whats set to 128GB, typically via -Xmx?). I'd suspect there is a separate configuration knob for that one
[22:28:29] checking if there is anything obvious
[22:28:33] -Xmx that is right. Here is the full command:
[22:29:03] hare: oh interesting, the MemoryManager class it refers to manages off-heap memory
[22:29:41] What sort of off-heap memory?
[22:31:13] hare: off-heap basically means it has it's own memory independant of the 128GB given to the jvm
[22:32:43] (i don't know whats in it :P)
[22:34:56] Do you know where/how it is configured?
[22:36:15] hmm, this class is used in a few places so without being more familiar it's hard to say what would change the limits there. There looks to be per-query max memory query hint with `analyticMaxMemoryPerQuery` but no clue if that would be the one failing here
[22:37:03] it seems 0 is a valid value as well (which makes the query block until it can get enough memory, assuming other queries currently have it)
[22:37:27] I am pretty sure no other queries are being run on this server
[22:48:20] hare: you could try providing something like -XX:MaxDirectMemorySize=128000m, but i'm not sure what the appropriate values would be. Upstream suggested to someone trying to use our data to put 128GB heap and 128GB direct memory
[22:48:51] I'm not seeing that set on our production instances anywhere though, and jvm docs suggest there is no hard limit if it's not configured so might not do anything useful
[22:53:04] related thread:
[22:53:05] https://sourceforge.net/p/bigdata/discussion/676946/thread/8ad56b4d45/?limit=25
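
Editor's note on the heap vs. off-heap distinction discussed above: the sketch below is a generic JVM illustration, not Blazegraph code. It assumes the off-heap memory in question is allocated as NIO direct buffers (which is what -XX:MaxDirectMemorySize limits); the chat does not confirm that this is how Blazegraph's MemoryManager works. The class name DirectMemoryDemo and the helper directBytesUsed are hypothetical names chosen for this example.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical demo class: shows that direct (off-heap) buffers are tracked
// separately from the Java heap that -Xmx controls.
class DirectMemoryDemo {

    /** Bytes currently held by the JVM's "direct" buffer pool, or -1 if the pool is not reported. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Upper bound of the Java heap (what -Xmx sets).
        long heapMax = Runtime.getRuntime().maxMemory();

        long before = directBytesUsed();
        // Allocate 16 MiB outside the Java heap.
        ByteBuffer buf = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        long after = directBytesUsed();

        System.out.println("max heap bytes:         " + heapMax);
        System.out.println("direct bytes before:    " + before);
        System.out.println("direct bytes after:     " + after);
        System.out.println("buffer is direct:       " + buf.isDirect());
    }
}
```

Running this once with default settings and once with a small limit such as -XX:MaxDirectMemorySize=8m (where the 16 MiB allocation should fail with an OutOfMemoryError even though the heap has room) is one way to see that the two limits are independent.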