[07:10:08] cbogen_: sorry, just saw this message now. I'm not sure I have enough detail to personally lead that meeting, but it sounds like it might be a good idea to at least establish what might be duplicative work
[08:25:32] ejoseph: let me know when you're available - I myself will unfortunately not be between 10AM-1PM (dietitian + need to prepare my garage for the delivery of home appliances)
[08:33:10] cbogen_: a thread was started on slack already so I think it's fine, thanks for the offer!
[09:01:26] errand
[09:12:34] i am available now
[09:33:45] zpapierski
[11:40:13] I realized emmanuel hasn't been added yet to the template on https://www.mediawiki.org/wiki/Wikimedia_Search_Platform, and I actually have no idea how to edit the information in that box
[11:45:34] mpham: added, it's a param to the template (https://www.mediawiki.org/w/index.php?title=Wikimedia_Search_Platform&diff=4952578&oldid=4933347)
[11:46:23] thanks!
[11:49:54] you're welcome!
[11:49:58] (interesting. i had been in visual editor, where it didn't look like i could change it, but switching to source editor makes it look pretty straightforward)
[11:50:26] yes, sometimes VE makes this kind of edit non-trivial
[11:51:51] lunch + errand
[12:35:31] hmm, I'm timing out on meta.wikimedia.org
[12:35:50] maybe this requires some additional proxy settings?
[12:36:32] zpapierski: Over the internet?
[12:36:35] Can you ping it?
[12:36:44] on hadoop
[12:36:57] it's probably the proxy I'm missing
[12:37:16] otoh ping works fine on stat1004
[12:37:59] but it's not working from another server? (sry, I'm not familiar with where we run hadoop)
[12:38:09] but https doesn't, which makes me think it's proxy stuff even more
[12:38:25] there were some routing changes in the past few hours, so I'm just anxious to rule out anything on that side.
[12:39:02] not necessarily related - I've just started testing a new workflow, and I haven't done that for some time
[12:46:43] mpham: we discussed the revised plan for WCQS deployment - since we won't make it in time to have a full workflow ready before the break, the beginning of February sounds feasible for us. that's a 2-week delay, so you can hold off the comms until the new year, but that's up to you
[12:50:54] zpapierski: I ran a bunch more tests there and I don't believe it's a network issue, probably something at a higher level as you said.
[12:51:17] if you get the sense it's not, please ping me. thanks.
[12:51:27] sure, thanks!
[12:52:16] ok, confirmed, it's the proxy
[12:52:28] I thought we already had some default config there, but apparently not
[14:00:27] late lunch
[15:38:34] zpapierski: how are you accessing it? you should be doing it by accessing https://api-ro.discovery.wmnet and setting the Host header to meta.wikimedia.org
[15:38:36] then you don't need a proxy
[15:39:30] ah, ok - I'm probably not doing that
[16:48:45] * ebernhardson wonders how crazy it would be for internal dns to just return api-ro.discovery.wmnet
[16:48:49] via cname or whatever
[17:31:14] is there a reason wikidata and lexeme data reloads are tied together, even though they come from different source inputs (reload_wikidata in sre.wdqs.data-reload)? I suppose it can be special-cased either way, but wcqs integration seems more natural if lexeme reloads are their own thing
[17:32:29] ebernhardson: we should not reload wikidata without lexemes, so that's why they're tied together, I think
[17:32:33] i suppose it's because they live in the same .jnl file?
[17:33:03] mainly because once the reload is done, the data_loaded flag file is touched, and that's the trigger for the updater to start
[17:33:44] ahh, ok, yeah, that makes sense
[17:34:07] everything here is special cases, but i didn't want to write special wcqs reload variants of the same :P
[17:35:52] "special cases" is completely normal in this codebase :)
[17:37:33] I guess the fact that we use the RDF format for all these datasets makes it sound like it could be the same, but they're 3 different extensions (wikibase, lexemes, mediainfo)
[17:37:39] i'm sure i could generalize this out to something that handles it more directly... but it's not clearly worthwhile
[17:37:58] we could do a lot better, I'm sure
[17:38:00] right, but the code is the same too. curl from here, run loadData.sh
[17:38:08] true
[17:38:17] the variance is what to run where
[17:39:09] it's also very common to preload the dumps manually
[17:39:48] because here it takes the latest, but now we need to sync that with the dump we took to bootstrap flink
[17:40:31] right, that's the handling around allowing the file to already exist. I find it dubious that it treats "file exists" as "100% complete and accurate download", but that's for another day :P
[17:40:45] (i've had problems with other things where a 0-byte file emitted on failure breaks things)
[17:41:11] oh yes, this dump process is *very* fragile
[17:41:39] it's not like it can take more than one week to complete, so that's fine :P
[17:41:44] lol
[17:42:46] i guess i should review other cookbooks and see what the tendency toward duplication is here... if everyone expects to copy something and change 4 lines, i guess so be it
[17:43:02] (my current generalization doesn't make anything simpler :P)
[17:43:12] :)
[18:14:36] errand/dinner
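
For reference on the proxy issue discussed between 12:35 and 12:52: outbound HTTPS from the Hadoop/stat hosts generally has to go through the production webproxy. Below is a minimal Python sketch of the configuration zpapierski was missing; the proxy address follows the setup documented at https://wikitech.wikimedia.org/wiki/HTTP_proxy, but treat it (and the example API call) as an illustrative assumption, not part of this conversation.

    import requests

    # Assumed: the standard WMF forward proxy for production/analytics
    # hosts, per the wikitech HTTP_proxy documentation.
    PROXIES = {
        "http": "http://webproxy.eqiad.wmnet:8080",
        "https": "http://webproxy.eqiad.wmnet:8080",
    }

    resp = requests.get(
        "https://meta.wikimedia.org/w/api.php",
        params={"action": "query", "meta": "siteinfo", "format": "json"},
        proxies=PROXIES,
        timeout=10,
    )
    resp.raise_for_status()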
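The 15:38 suggestion to zpapierski avoids the proxy entirely: talk to the internal read-only API endpoint directly and override the Host header so MediaWiki serves the right wiki. A sketch of the same request done that way; the verify= hint is an assumption about the endpoint presenting a certificate from the internal CA rather than a publicly trusted one.

    import requests

    # Same siteinfo query, sent straight to the internal endpoint.
    # The Host header tells MediaWiki which wiki we want.
    resp = requests.get(
        "https://api-ro.discovery.wmnet/w/api.php",
        params={"action": "query", "meta": "siteinfo", "format": "json"},
        headers={"Host": "meta.wikimedia.org"},
        timeout=10,
        # verify="/path/to/internal-ca-bundle.pem",  # assumed: needed if
        # the default CA bundle doesn't trust the internal certificate
    )
    resp.raise_for_status()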
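On the 17:33 point about why the wikidata and lexeme reloads are tied together: the data_loaded flag file acts as the handoff between the reload and the updater, so it can only be touched once everything is in the journal. A rough illustrative sketch of that coupling; the path and the poll loop are hypothetical, not the actual cookbook code.

    import pathlib
    import time

    DATA_LOADED_FLAG = pathlib.Path("/srv/wdqs/data_loaded")  # assumed path

    def finish_reload() -> None:
        # Touched only after *all* datasets (wikidata + lexemes) are in
        # the shared .jnl file - touching it earlier would start the
        # updater against a half-loaded journal.
        DATA_LOADED_FLAG.touch()

    def wait_for_data_loaded(poll_seconds: int = 60) -> None:
        # Updater side: block until the reload signals completion.
        while not DATA_LOADED_FLAG.exists():
            time.sleep(poll_seconds)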
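ebernhardson's 17:40 complaint - treating a pre-existing file as a "100% complete and accurate download" - could be tightened by checking the on-disk size against what the server reports before reusing the file. This is a sketch of that idea only; the function names and the curl invocation are illustrative, not the cookbook's code.

    import os
    import subprocess
    import urllib.request

    def dump_is_complete(dest: str, dump_url: str) -> bool:
        """True only if the local file matches the remote Content-Length."""
        if not os.path.exists(dest) or os.path.getsize(dest) == 0:
            return False
        req = urllib.request.Request(dump_url, method="HEAD")
        with urllib.request.urlopen(req, timeout=30) as head:
            size = head.headers.get("Content-Length")
        if size is None:
            return False
        return os.path.getsize(dest) == int(size)

    def fetch_dump(dest: str, dump_url: str) -> None:
        if dump_is_complete(dest, dump_url):
            return  # safe to reuse; not just "file exists"
        # -f fails on HTTP errors, -C - resumes a partial download
        subprocess.run(
            ["curl", "-fL", "-C", "-", "-o", dest, dump_url], check=True
        )

The size comparison also catches the 0-byte-file-on-failure case mentioned at 17:40:45; verifying against published checksum files, where the dump server provides them, would be stronger still.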