[07:18:30] good morning :)
[09:05:09] \o
[09:05:23] Currently typing up my interview notes for last week
[09:24:28] o/
[09:24:39] me too!
[09:31:38] :)
[11:53:12] Oh and btw, Thursday is a public holiday here, and I'll take Friday off as well
[12:02:14] Morning
[12:03:24] \o
[12:19:55] Back to early morning meetings for me
[13:47:56] hello hello
[13:51:30] klausman: o/ thanks for the review!
[13:51:35] (https://gerrit.wikimedia.org/r/c/operations/puppet/+/793714)
[13:51:45] I was unsure about the role's name etc., but I guess we can change it as we go
[14:09:27] re-+1'd
[14:09:52] obrigado
[14:10:02] I'll wait for Eric's suggestions etc. before proceeding
[14:10:08] ack
[14:10:11] but the cluster should be ready to be bootstrapped in theory
[14:10:37] and we could start with a simple keyspace for scores
[14:11:04] one thing that may be nice is to save the score alongside the model's version
[14:11:41] it has pros/cons, but it is simple and reliable in my opinion
[14:11:46] And expire old versions along an LRU scheme?
[14:12:13] this may be difficult, I am not sure if there is any LRU algorithm in Cassandra
[14:12:48] maybe we could have a clean up job that periodically drops old things if not needed
[14:16:34] I think we'd want the version number. People will ask for it.
[14:18:20] I think it may also be great to have previous score/version tuples
[14:19:02] so basically not considering it as a score cache, but a score datastore (maybe with a limited amount of history)
[14:19:56] the amount of storage per node is not a lot (1.4T)
[14:20:10] we have 4.2T / cluster basically
[14:20:18] and things get replicated three times
[14:20:29] but it is easy to expand it
[14:20:42] you mean as an output?
[14:21:37] mmm sorry what do you mean by output?
[14:21:59] (for the replication I meant internally, Cassandra replicates by default)
[14:22:07] Ah got it. Now I understand
[14:22:32] I mentioned the size of the disks because it may not be a lot for all our use cases, like online feature store and score datastore etc.
[14:23:07] but for the moment I'd use the cluster only for the score cache/datastore, and I'd see how it goes
[14:23:26] so we'll get some familiarity with Cassandra
[14:23:33] Yeah that sounds right
[14:34:54] klausman: I have just pushed a file called 'machine-learning' to pwstore, when you have a moment can you tell me if you can read the secret?
[14:35:42] it is for https://wikitech.wikimedia.org/wiki/ORES/Deployment#Update_revscoring_in_PyPI
[14:41:37] sec
[14:43:12] Yes, I can read (and decrypt) it
[14:44:17] \o/
[14:44:19] super
[14:44:34] ok so documentation updated, now we should have all the info to publish revscoring versions etc.
[14:44:51] (I know that everybody is super happy about it)
[14:45:03] Nice!
[16:00:54] going afk folks, have a nice evening :)
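
Editor's sketch of the score datastore idea discussed around 14:10-14:23: keep each score together with the model version that produced it, retain a limited amount of history, and let a periodic clean-up job drop older entries. This is only an in-memory Python illustration, not the team's design and not Cassandra code. The class `ScoreStore`, its methods, the `max_versions` limit, the model name "damaging" and the version strings are all hypothetical. The small `usable_capacity_tb` helper restates the storage arithmetic from the log (4.2T per cluster, replicated three times), assuming every row is stored on three nodes.

```python
from collections import defaultdict


def usable_capacity_tb(raw_tb, replication_factor):
    """Unique data that fits when every row is stored replication_factor times."""
    return raw_tb / replication_factor


class ScoreStore:
    """Hypothetical in-memory stand-in for a score datastore with limited history."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # (rev_id, model_name) -> {model_version: score}, oldest save first
        self._rows = defaultdict(dict)

    def save(self, rev_id, model_name, model_version, score):
        history = self._rows[(rev_id, model_name)]
        # Re-saving an existing version moves it to the newest position.
        history.pop(model_version, None)
        history[model_version] = score

    def latest(self, rev_id, model_name):
        """Return (model_version, score) for the newest save, or None."""
        history = self._rows.get((rev_id, model_name))
        if not history:
            return None
        version = next(reversed(history))
        return version, history[version]

    def history(self, rev_id, model_name):
        """Return all retained (model_version, score) tuples, oldest first."""
        return list(self._rows.get((rev_id, model_name), {}).items())

    def cleanup(self):
        """Periodic job: keep only the newest max_versions entries per key."""
        dropped = 0
        for history in self._rows.values():
            excess = max(len(history) - self.max_versions, 0)
            for version in list(history)[:excess]:
                del history[version]
                dropped += 1
        return dropped


if __name__ == "__main__":
    store = ScoreStore(max_versions=2)
    store.save(123, "damaging", "0.5.0", 0.1)
    store.save(123, "damaging", "0.5.1", 0.2)
    store.save(123, "damaging", "0.6.0", 0.3)
    print(store.latest(123, "damaging"))
    print("entries dropped by clean-up:", store.cleanup())
    print(store.history(123, "damaging"))
    print("approx. usable TB:", round(usable_capacity_tb(4.2, 3), 2))
```

The clean-up here is an explicit pass over all keys, which matches the "clean up job" suggestion in the log rather than an LRU policy; how this would map onto an actual Cassandra table is left open, as it is in the conversation.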