[00:01:56] this is all so odd...no answers yet :( strace/tcpdump show that nginx is making two requests to /bigdata/... for every one request to /readiness-probe, one of those doesn't include the query string and nginx just closes it, leaving jetty to EOF
[00:02:21] nginx has debug logging...but it's dense :P
[08:58:24] ejoseph: I'll be 3 minutes late for our meeting...
[09:51:44] zpapierski: when will you be ready
[09:54:09] ejoseph: I need to take care of something first, I should be around 11:45
[10:26:58] dcausse: I was pinged again by Observability on T289077. I think we've done some of that as part of WCQS and streaming updater. cc: ryankemper
[10:26:59] T289077: Migrate Search team's prometheus-based alerts from Icinga to alert-manager - https://phabricator.wikimedia.org/T289077
[10:43:41] gehel: should not be too hard as long as everything is in prometheus
[10:44:41] We've been talking about making this happen since end of August :/
[10:44:51] Let's discuss it on Monday
[10:45:03] sure
[11:43:22] I need to fix my Generator
[11:43:41] I’ll be back by 2
[11:46:21] lunch break
[15:21:25] errand, back in a bit
[15:50:52] \o
[15:51:04] o/
[15:51:53] o/
[15:54:46] zpapierski: how do i get blazegraph to output the logs from the mw-oauth-proxy side of things?
My local jetty says things like `Started o.e.j.w.WebAppContext@4566e5bd{org.wikidata.query.rdf.mwoauth,...` and 5xx's go into the error log, but on wcqs* i can't get anything
[15:55:29] i changed logback-wcqs-blazegraph.xml from but still no reference to mw-oauth
[15:55:29] first question would be what are the actual logs there
[15:55:32] let me look
[15:56:17] I don't see any logs in the service itself
[15:56:37] zpapierski: well, curl localhost:9999/oauth/check_auth, get a 5xx, there should be logs about that :P
[15:56:49] then that's jetty, not wcqs
[15:56:58] zpapierski: the command line runs both at the same time
[15:57:02] I meant not mw-oauth proxy
[15:57:05] I know
[15:57:15] let me take a look at that logback config
[15:57:28] to me, if it's on the same command line, its the same process :P The logback config is just `-Dlogback.configuration=....` suggesting it's not limited to blazegraph
[15:57:31] any wcqs instance I assume
[15:57:40] zpapierski: should all be the same, i'm using 1001
[15:57:43] that's generally true
[15:57:49] but each has its own appender
[15:57:54] I might can have
[16:06:54] ebernhardson: just to be sure - are you certain they are from mw-oauth-proxy?
[16:07:22] the logs that are there are logged by jetty and they indeed don't come from mw-oauth-proxy
[16:07:57] zpapierski: the problem is /oauth/check_auth gives a 5xx, and i want a log that says why it's 5xx
[16:07:58] but I see no place where we'd set up 503 status explicitly (those wouldn't show in logs) and the errors in logs are logged by jetty
[16:08:09] nginx?
[16:08:15] no, its :9999
[16:08:18] thats jetty
[16:08:21] ah, not nginx
[16:08:48] wait a sec
[16:08:49] i would expect some sort of debug logging to report all requests, good or bad, and how they got there
[16:08:53] do we have something like that ?
[16:08:53] there is something from mw-oauth
[16:09:40] https://www.irccloud.com/pastebin/VU526dtk/
[16:09:58] mw-oauth hasn't even been registered
[16:10:11] looks like some classpath issue
[16:11:58] our logging is a mess :P There is journalctl for wcqs-blazegraph, and blazegraph specific logs on top. But somehow that ends up in syslog? :P
[16:12:01] the logging is botched though, I found that one through journalctl (so stdout)
[16:12:40] hmm, seems i should have found that yesterday. oh well. What is Janino?
[16:12:50] no clue :)
[16:13:10] :P
[16:13:17] journald forwards to syslog, with some weird config to also send some of that to logstash (via rsyslog if I remember correctly)
[16:13:29] looks like piece of jetty though, I see nothing of ours in Jetty
[16:13:36] I think that janino is the configuration engine of logback
[16:13:44] or that
[16:13:50] (I mean, I have no idea)
[16:14:02] hmm, so do we have to compile logback into the oauth-proxy explicitly?
[16:14:12] (and what changed, because that worked 3 weeks ago)
[16:14:34] we need to have the logback jars in the oauth-proxy
[16:15:04] janino might be an additional optional dependency, because we do some weird filtering in the blazegraph logback config
[16:15:13] oh i remember, gehel and i tried to fix logging a few weeks ago then everything broke and i forgot to finish :)
[16:15:35] ebernhardson: yeah, that's probably related in some way
[16:15:57] I first stop coding in Java before logging in java stops being a mess
[16:16:04] or die of old age (might be the same date)
[16:16:12] s/I/I'll
[16:16:24] logback-wdqs-blazegraph.xml has an section that probably requires janino
[16:17:08] we add this dependency explicitly in building the blazegraph war:
[16:17:11] https://www.irccloud.com/pastebin/n3AkA1K6/
[16:17:41] probably need the same in mw-auth, since we use that same weird logback config file
[16:19:03] ok i can probably figure out how to get the deps moved over
[16:19:34] traced back to T197645
[16:19:35] T197645: Make WDQS logs less verbose - https://phabricator.wikimedia.org/T197645
[16:21:13] honestly, those evaluators are a pretty convoluted way of making log less verbose. A clean up of those logback config files would be nice
[16:23:44] i'm not really sure what else would do? I guess can look into options
[16:24:26] I'm afraid that the alternative is fix Blazegraph to not log gazillion of useless messages
[16:25:32] so adding deps it is :)
[16:50:48] weekend time! Have fun!
[17:05:09] have a nice week-end!
[18:19:45] * ebernhardson somehow never realized before that comments/reviews on gerrit patches are another per-changeid git branch inside the repo with a commit per comment
[19:14:12] meh, mw-oauth-proxy still not happy due to things like
[19:14:19] > — •ebernhardson somehow never realized before that comments/reviews on gerrit patches are another per-changeid git branch inside the repo with a commit per comment
[19:14:20] oh wow
[19:14:25] never knew that
[19:15:08] That's actually brilliant, having it self contained in git. I've played around with using `git-notes` for similar purposes but manually done ofc
[19:15:18] yea, looks like each patch has a .../meta branch with that info
[19:17:27] * ebernhardson keeps pulling more deps into mw-oauth-proxy until it works
[19:23:08] nope, i dont think that's going to work :S Of course, logback compiling `throwable.getCause() instanceof org.openrdf.query.MalformedQueryException` can't compile, because it doesn't exist
[20:54:36] made wcqs1001 mostly work, patches coming up for a few things. Still have to figure out the nginx config, it looks like the auth request is matching the wrong location, but only for the readiness-probe which bypasses auth...
[23:00:53] sigh....commonswiki_content and commonswiki_general on eqiad are missing `coordinates` in its mapping (codfw is fine)
[23:01:20] was looking into T296897
[23:01:20] T296897: Eqiad Geosearch API queries return errors on Commons - https://phabricator.wikimedia.org/T296897
[23:11:08] i guess restoring more indices, but monday. not today :P
[23:11:59] * ebernhardson notes this kinda ties into previous discussions we've had about a way to compare expected schemas to the real thing and know what is unexpected