[08:19:20] gehel: would you have couple mins for a chat? https://meet.google.com/rhz-cswz-mie
[08:32:42] I'll be there in 5'
[08:32:53] thanks!
[10:04:06] lunch
[10:06:03] lunch 2
[12:58:42] greetings
[13:10:28] are the new wdqs nodes still broken?
[13:10:37] o/
[13:10:42] o/
[13:11:12] inflatador: they were rebooted this morning but haven't had a closer look since then
[13:11:58] quickly looked at syslogs but it's all polluted with failed sparql queries
[13:12:01] dcausse welcome back! I'll take a look, I brought those online via the cookbook a couple of weeks ago
[13:12:12] thanks :)
[13:12:51] Kinda strange that all 3 would break at the same time, I wonder if they were never in working condition?
[13:15:14] moritzm I see you wrote "unresponsive via botched wdqs-categories process" in operations, does that mean we didn't bring it online properly or just that it was broken?
[13:17:37] 1013 was totally unresponsive over SSH and mgmt, after I had powercycled it, I moved on to 1014/1015, but they were accessible via SSH again
[13:18:02] maybe whatever was restarted as part of the 1013 powercycle, restored it for 1014/1015
[13:18:23] but it's still spamming the console on 1014/1015
[13:18:44] if you log into them via the serial console you'll see the logspam on the tty
[13:19:03] maybe it needs a restart of the blazegraph processes to resolve this part, not sure
[13:19:46] Ah, thanks...I have a feeling I didn't import the wdqs-categories data when I brought those online
[13:20:11] should be able to fix that shortly. Thanks for helping out
[13:24:53] dcausse when I bring a wdqs host online, do I need to run the data transfer for all instances? ref https://github.com/wikimedia/operations-cookbooks/blob/master/cookbooks/sre/wdqs/data-transfer.py#L28
[13:26:12] inflatador: for "wdqs" I think you need to run the "categories" and "wikidata" transfer
[13:27:05] I think we used to have an "all" that did this but IIRC it was removed because too confusing now that we have commons
[13:27:15] dcausse thanks, that could be the problem then. categories is ~30G on 1004 and ~200M on 2014...loading the data from wdqs1011 now
[13:27:31] thanks!
[13:30:03] this is documented at https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service#Data_reload_procedure , just something I missed ;(
[13:49:36] dcausse OK, categories is synced on wdqs[1014-1016] and downtime removed. I'll be keeping an eye out but LMK if you notice anything
[13:49:52] sure, thanks!
[14:02:19] dcausse: do you have like 10mins
[14:02:27] ejoseph: sure
[14:03:57] https://meet.google.com/eki-rafx-cxi
[14:18:50] aannnnd back
[14:35:52] ryankemper looks like we are double-booked for Search Update Pipeline and SRE mtg, any preference on which one you'd like to attend?
[14:53:24] \o
[14:53:33] o/
[15:01:54] triage starting: https://meet.google.com/eki-rafx-cxi (ebernhardson, ryankemper)
[15:02:31] inflatador: prob rather SRE mtg
[15:03:34] ryankemper ACK, I'll do the pipeline then
[15:48:09] * ebernhardson needs to find a way to stop forgetting to re-enable puppet
[16:48:57] dinner
[16:55:20] wow, someone just submitted a PR to a project i wrote before joining wiki
[16:56:32] last patch i wrote myself dated oct 2012 :)
[17:45:25] ◉_◉
[17:45:34] Lunch, back in ~45-1h
[18:29:51] gehel: finishing up some food, few mins
[18:29:57] ack
[18:30:32] back
[18:34:18] oops. Forgot I need to pick up an order, back in ~15-20
[18:51:35] OK, really back
[21:12:46] * ebernhardson is suspicious of writing fairly meh code all morning, then testing it in a browser and seeing it work. Clearly missed something :P
[21:24:50] school run, back in ~45
[22:02:09] back
[22:20:20] relforge ES7 update per ryankemper and myself: https://phabricator.wikimedia.org/T315604#8176256