[07:58:40] <addshore>	 Hi all, I just wrote https://phabricator.wikimedia.org/T287231 for moving WDQS munging into wikibase itself, and general RDF output flavours.
[07:58:46] <addshore>	 let me know if you have thoughts on this
[08:04:20] <zpapierski>	 not a fan, honestly :)
[08:04:37] <addshore>	 Would love to collect reasons :)
[08:05:01] <zpapierski>	 I have two main reasons for that
[08:05:53] <zpapierski>	 1) Munging is the process that's tied to the WDQS itself - basically it's the "canonical" model for Blazegraph
[08:06:19] <zpapierski>	 "canonical" is quoted, because of course it's hard to talk about actual model when it comes to RDF :)
[08:07:15] <zpapierski>	 which means that pushing it into wikibase makes it harder to modify, in times when we'll be actively pursuing new soutions for the query engine itself
[08:07:32] <addshore>	 Is it also useful for other triple stores, for the same reason that it is useful for blazegraph?
[08:09:04] <addshore>	 I can imagine a world in which wdqs itself continues to keep "munging" code in its own code base for further rapid changing of the wdqs specific flavour
[08:09:07] <zpapierski>	 2) It makes wikibase more to the monolithic architecture. Munging is a thing that needs to happen on the input date anyway, whether inside wikibase, or on the outside of it, but pushing it into wikibase automatically makes anything dependent on munging more coupled to it 
[08:09:20] <addshore>	 but also then the "stable" version of that flavour is upstreamed to use by others
[08:09:25] <zpapierski>	 I don't disagree with your argument
[08:09:48] <zpapierski>	 but I think we can achieve the same thing without pushing this into wikibase
[08:10:02] <zpapierski>	 if you think about it that way, sure
[08:10:16] <addshore>	 ooooh, interesting, I'd love to hear more about that and how you think that could work well!
[08:10:21] <zpapierski>	 I'm not sure how many use cases theere would be outside self deployments of wdqs though
[08:10:30] <addshore>	 the only way I can think if a microservice & also seperated script for munging dumps etc
[08:10:55] <addshore>	 right now in the wild i think there are the same number of wdqs instances as there are in production
[08:11:10] <addshore>	 and the number of people trying the same but with different backends continues to increase
[08:11:15] <zpapierski>	 but you're already hearing about it all the time - streaming updater will produce a public stream of events, anybody can subscribe to munged triples 
[08:11:36] <zpapierski>	 that last one is interesting - you know more about it?
[08:11:51] <zpapierski>	 maybe someone already fixed some of our issues :) ?
[08:11:58] <addshore>	 In theroy, how easy will the streaming updater be for non wmf folks to run?
[08:12:29] <addshore>	 Yeah, unfortunatly i dont have many details of success, just normaly folk running into issues :P
[08:12:55] <zpapierski>	 tough question, it requires elemental knowledge of Flink
[08:13:10] <zpapierski>	 but in most cases, they won't need to, stream will be public
[08:13:21] <zpapierski>	 (at least that's the idea)
[08:13:30] <addshore>	 well, not for non wikidata wikibases, we still need to cover the updating case there
[08:13:57] <addshore>	 and the question is, is a whole streaming updater the right solution for a wikibase that gets 10 edits a day
[08:14:01] <zpapierski>	 in any case, there needs to be an easy solution - WDQS will be coupled basically the updater - that's a good point
[08:14:51] <zpapierski>	 sure - but if somebody picks up WDQS for their small wikibase, how will you guarantee the compatibility of wikibase munging vs streaming updater processing?
[08:15:11] <zpapierski>	 like I said, imho we still need to make streaming updater easily deployable
[08:15:26] <addshore>	 thats a good question, but putting the munging in wikibase avoids having it in 2 places I figure
[08:15:43] <addshore>	 If that is the path that we have to take and streaming updater is not ideal for small wikibases
[08:15:52] <zpapierski>	 I see where you come from, I really do
[08:16:18] <zpapierski>	 it's just feel like strong coupling in times where it will hurt us with blazegraph transition
[08:16:35] <addshore>	 I'm going to copy this chat log into a comment in the ticket :) I think this was already a good chat, there is no real rush on this at all, just keeping it in thought etc!
[08:16:46] <zpapierski>	 ok :)
[08:19:37] <zpapierski>	 also, I think I'll investigate custom wiki set up - I think we spent too little time thinking about those, good thing that you are much better at it :)
[08:20:54] <RhinosF1>	 WikiBase was a nightmare to setup for our farm
[08:48:10] <gehel>	 looks like I'm not a member of the wikimedia organization on github yet. zpapierski you seem to be in, could you add me (username=gehel)
[09:25:44] <addshore>	 gehel: i just invited you
[09:26:20] <addshore>	 ill invite you to the "search" team too i guess
[09:26:57] <gehel>	 addshore: cool! and thanks!
[09:51:26] <gehel>	 Lunch
[11:43:00] <zpapierski>	 gehel: sorry, weirdly irccloud forgot to notify me on mention, but I see addshore delivered
[11:44:01] <zpapierski>	 lunch break
[13:49:48] <gehel>	 Trey314159: can you ping me when you're around?
[13:55:09] <zpapierski>	 relocating
[14:35:16] <elukey>	 ebernhardson: o/ I saw https://gerrit.wikimedia.org/r/c/operations/puppet/+/706740 about the ROCm package, if you need it we can create a task to upgrade the drivers to latest upstream
[14:35:28] <elukey>	 drivers + tools
[15:03:18] <ebernhardson>	 elukey: hi! It's not super important, i ran the same process on cpu's instead just using 500 cores instead of 1 gpu :) 
[15:04:22] <ebernhardson>	 i was thinking it would be nice to have onnxframework, since that's a somewhat common method of distributing pre-built models outside of training frameworks, but i hadn't noticed how recent the rocm integrations were
[15:09:06] <elukey>	 ebernhardson: on GPU nodes we now have a 5.10 kernel (hadoop and stat100x) so more up to date drivers, the rest of the packages should be relatively easy to test/upgrade (last famous words). I know that Miriam is doing final tests with Aiko for distributed tensorflow on hadoop, so may need to wait a little, but we could try to upgrade a single stat100x node and start testing
[15:09:24] <elukey>	 do you mind to open a task??
[15:09:28] <ebernhardson>	 elukey: sure
[20:57:39] <Trey314159>	 ryankemper or ebernhardson: you guys happen to know what happened to files on mwmaint1002? Everything I had there is gone now. I know I should use mwmaint2002, but I didn't think my files would get nuked.
[21:04:39] <Trey314159>	 nvm. looks like everything got backed up before being nuked, and I'm figuring out how to get some stuff restored.
[21:05:50] <ryankemper>	 Trey314159: well that's good to here. latest email about mwmaint1002 in the ops email is from june 29, `[Ops] Switchover: mwmaint1002.eqiad -> mwmaint2002.codfw`
[21:06:08] <ryankemper>	 email doesn't mention the server actually getting nuked or anything:
[21:06:23] <ryankemper>	 > Per the switchover today, [1] the mwmaint server is now passive in Eqiad and active in Codfw.
[21:06:23] <ryankemper>	 > An MOTD greeing has been enabled on mwmaint1002.eqiad.wmnet to remind you that no scripts are scheduled to run there.
[21:06:23] <ryankemper>	 > The Codfw equivalant is mwmaint2002.codfw.wmnet.
[21:06:59] <Trey314159>	 over in operations rzl says it was reimaged. Not clear to me why that was necessary.
[21:10:01] <RhinosF1>	 Trey314159: a reimage would probably be for a server upgrade
[21:10:10] <RhinosF1>	 Pretty sure it was planned
[21:10:14] <ryankemper>	 Trey314159: I was able to find context by searching the SAL for `mwmaint1002` and then seeing the ticket referenced as part of the re-image log
[21:10:17] <ryankemper>	 https://phabricator.wikimedia.org/T267607
[21:10:36] <ryankemper>	 The mwmaint servers apparently needed to be upgraded to debian buster, so they reimaged the eqiad one after the switchover to codfw...makes sense
[21:10:47] <RhinosF1>	 https://phabricator.wikimedia.org/T267607
[21:10:54] <ryankemper>	 yeah what RhinosF1 said :)
[21:12:02] <Trey314159>	 I guess reindexing is a weird use case (that's what I use that server for). I saw the MOTD that it shouldn't be used after the switchover, but not that it was going to be wiped.
[21:12:35] <RhinosF1>	 Yeah not sure if the reimage was communicated
[21:12:58] <Trey314159>	 anyway.. looks like I'm getting my backup restored. Thanks for the info. I'll make my own backups, too.
[21:14:07] <RhinosF1>	 Backups are normally taken or copies made on other servers
[21:32:34] <Trey314159>	 ryankemper: just FYI, my stuff can't be restored until Monday, so no reindexing for me today! Hopefully on Monday
[21:33:08] <ryankemper>	 Trey314159: ack, sounds good