[06:22:55] hello
[08:57:37] o/
[09:17:02] Machine-Learning-Team, Release-Engineering-Team: Github's wikimedia/ores not mirroring to Gerrit's scoring /ores /ores - https://phabricator.wikimedia.org/T311390 (elukey)
[09:17:09] lovely --^
[09:19:19] aiko: ok
[09:19:21] err o/
[09:19:22] :)
[09:19:39] I put some thinking on the mwapi/revscoring http async code
[09:19:52] and it may not be very straightforward..
[09:19:57] so in my head, this is the chain
[09:20:23] ORES code (or our kserve tornado event loop) -> revscoring (api extractor) -> mwapi
[09:20:57] so if we add async/await/coroutine support to mwapi, that is relatively easy, then we'll need to decide where to put the event loop
[09:21:17] because if we have it on revscoring (masking it from the rest), then we'd need some code related to that
[09:21:36] if we delegate the event loop to ORES/kserve, we'd need to change the ORES code as well
[09:21:49] so in my head, the more we move towards the left the worse it gets :D
[09:22:07] and we risk compromising ORES' stability
[09:22:28] I hoped for a more confined mwapi-only change, but it doesn't seem to be the case
[09:22:31] does it make sense?
[09:22:35] what are your thoughts?
[09:23:02] (also others!)
[09:43:16] Lift-Wing, Epic, Machine-Learning-Team (Active Tasks), Patch-For-Review: Send score to eventgate when requested - https://phabricator.wikimedia.org/T301878 (elukey) @Ottomata thanks for all the insights! I like a lot the proposal for the page--score-change, +1 I chose to add an async kafk...
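The event loop placement question from the 09:20-09:22 messages can be illustrated with a minimal sketch. It assumes a hypothetical `fetch_async` coroutine standing in for an async mwapi call (no real HTTP, and not the actual mwapi or revscoring API). It contrasts a library that hides the loop behind a sync facade with a caller that owns the loop, and shows that the facade fails when invoked from code that is already running inside an event loop, which is the situation a tornado/kserve-style server would be in.

```python
import asyncio


async def fetch_async(rev_id: int) -> dict:
    """Hypothetical stand-in for an async mwapi request (simulated, no HTTP)."""
    await asyncio.sleep(0.01)  # simulates network latency
    return {"rev_id": rev_id, "text": f"content of {rev_id}"}


def fetch_sync(rev_id: int) -> dict:
    """Option 1: the library masks the event loop behind a sync facade."""
    coro = fetch_async(rev_id)
    try:
        # asyncio.run() creates and closes its own loop; it raises
        # RuntimeError if a loop is already running in this thread.
        return asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        raise


async def handler(rev_id: int) -> dict:
    """Option 2: the caller (ORES/kserve in the chain above) owns the loop."""
    return await fetch_async(rev_id)


async def handler_using_facade(rev_id: int) -> str:
    """Calling the sync facade from inside a running loop does not work."""
    try:
        fetch_sync(rev_id)
        return "ok"
    except RuntimeError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    print(fetch_sync(1))                          # fine from plain sync code
    print(asyncio.run(handler(2)))                # fine when the caller owns the loop
    print(asyncio.run(handler_using_facade(3)))   # reports the RuntimeError
```

Under these assumptions, hiding the loop only works for purely synchronous callers, while exposing coroutines pushes the change up to every caller, which matches the "the more we move towards the left the worse it gets" concern.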
[09:45:16] (CR) AikoChou: outlink: use tornado async http client to fetch outlinks (3 comments) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/807135 (https://phabricator.wikimedia.org/T311043) (owner: AikoChou)
[10:11:32] (CR) Elukey: outlink: use tornado async http client to fetch outlinks (3 comments) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/807135 (https://phabricator.wikimedia.org/T311043) (owner: AikoChou)
[10:21:28] aiko: the alternative that I have in mind is to just use the revscoring code as it is, tuning the number/grouping of pods
[10:22:00] we could apply your research in ray workers and consolidate some small wikis, to reduce the footprint
[10:22:17] or maybe we can just scale up the big ones and that's it
[10:22:47] we know that the performance will not be good for the moment, until we create new and quicker models
[13:00:20] Lift-Wing, Epic, Machine-Learning-Team (Active Tasks), Patch-For-Review: Send score to eventgate when requested - https://phabricator.wikimedia.org/T301878 (Ottomata) > Use the current revision-score schema for the first version of the code, pushing to a test topic Sounds good! > I chose to add...
[13:05:14] Lift-Wing, Epic, Machine-Learning-Team (Active Tasks), Patch-For-Review: Send score to eventgate when requested - https://phabricator.wikimedia.org/T301878 (Ottomata) Example of what [[ https://gerrit.wikimedia.org/r/plugins/gitiles/eventgate-wikimedia/+/refs/heads/master/eventgate-wikimedia.js |...
[13:31:53] I watched a python tutorial about asyncio.. https://www.youtube.com/watch?v=nFn4_nA_yk8
[13:31:57] recommend!
[13:32:22] elukey: what you said does make sense
[13:33:40] what I was thinking is the core of mwapi is this _request function https://github.com/mediawiki-utilities/python-mwapi/blob/37155d45e40065dbd562b7b9f35f332db6fff824/mwapi/session.py#L80
[13:33:48] If we can change it from requests to aiohttp, that would be nice
[13:34:35] and I think we will need to change some code in revscoring as well..
[13:34:48] https://github.com/wikimedia/revscoring/blob/9026e09e0c1069e0f62c20297aa8047b1161b06e/revscoring/extractors/api/extractor.py#L95
[13:35:53] In the _extract_many function, it makes multiple calls at line 126, 152, 184 (if you trace the code, you will end up seeing self.session.get(…))
[13:36:26] aiko: yeah I got the same impression, it looks like an invasive change to revscoring :(
[13:37:01] we will need to put them in asyncio.gather(…) or something like that
[13:37:54] and also make sure that revscoring either runs an event loop, or whoever uses it knows that an event loop is needed
[13:38:01] (for example, ORES)
[13:41:27] aiko: the more I think about it the more I am scared about this work, it is probably not worth it
[13:51:10] elukey: yeah I feel the same :(
[14:13:07] Morning all
[14:15:40] morning!
[14:26:02] I can't imagine people are using the feature injection. That sounds like more of a research feature, but there would be other ways to get that.
[14:27:27] yep yep I think the same
[14:30:06] chrisalbon: our last hope is that the feature injection workflow could be used to inject values (and avoid blocking calls to mwapi)
[14:30:21] if this doesn't happen we'll probably have to live with not great perfs for those models
[14:30:30] moving revscoring to async is a little crazy
[14:30:48] (Aiko investigated too and we reached the same conclusion)
[14:31:32] okay. I cannot imagine that a nice-to-have researcher function like feature injection is somehow the key to high performance models
[14:32:15] chrisalbon: the idea that we have is to get feature values via async http client (offered by kserve), and then pass them to revscoring
[14:32:44] I think the solution is going to be a prediction cache
[14:33:16] both could really help a lo
[14:33:18] *lot
[14:33:28] since these models will stick around for a while
[14:33:59] i am just saying that we should try all possibilities, with time boxed tasks, and then find the best compromise
[14:34:06] and avoid this mess in the future :D
[14:36:25] (even with a prediction cache we'll be really slow in computing scores)
[14:44:57] anyway, I am working on two fronts
[14:45:07] 1) set up our cassandra clusters (almost done)
[14:45:11] 2) reading https://www.mediawiki.org/wiki/Platform_Engineering_Team/Data_Value_Stream/Data_Gateway?tableofcontents=0
[14:45:26] this is the http service that we'll need to use for the prediction cache in theory
[14:45:37] we should come up with a schema, propose it and see how it goes
[14:47:15] even if in our case it is not really a published dataset
[14:48:33] a streaming published dataset could probably fit, more or less what changeprop and ores are doing now
[16:15:16] Lift-Wing, Machine-Learning-Team (Active Tasks), Patch-For-Review: Test async preprocess on kserve - https://phabricator.wikimedia.org/T309623 (elukey) Me and Aiko investigated the possibility of moving `mwapi` to asyncio, but we concluded that it is not a viable road for us. Revscoring would need to...
[16:41:00] Lift-Wing, Epic, Machine-Learning-Team (Active Tasks), Patch-For-Review: Set up the ml-cache clusters - https://phabricator.wikimedia.org/T302232 (elukey) @lbowmaker hi! I reviewed https://www.mediawiki.org/wiki/Platform_Engineering_Team/Data_Value_Stream/Data_Gateway and I am wondering if the sc...
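The two ideas raised between 14:30 and 14:36 (fetch feature values asynchronously and inject them into a synchronous scorer, plus a prediction cache in front) can be combined in a rough sketch. Everything named here is hypothetical: `fetch_feature` simulates an async HTTP lookup with a sleep, `score` is a placeholder rather than revscoring, and the cache is a plain in-memory dict, not the Cassandra-backed service discussed above.

```python
import asyncio
from typing import Dict, List, Tuple


async def fetch_feature(name: str, rev_id: int) -> float:
    """Hypothetical async lookup of one feature value (simulated, no HTTP)."""
    await asyncio.sleep(0.01)  # stands in for a non-blocking API call
    return float(len(name) + rev_id % 10)


async def gather_features(names: List[str], rev_id: int) -> Dict[str, float]:
    """Run all lookups concurrently instead of one blocking call after another."""
    values = await asyncio.gather(*(fetch_feature(n, rev_id) for n in names))
    return dict(zip(names, values))


def score(features: Dict[str, float]) -> float:
    """Placeholder for the synchronous scoring step fed with injected values."""
    return sum(features.values()) / len(features)


async def predict(
    model: str,
    rev_id: int,
    names: List[str],
    cache: Dict[Tuple[str, int], float],
) -> float:
    """Serve from the cache when possible, otherwise fetch, score and store."""
    key = (model, rev_id)
    if key in cache:
        return cache[key]
    features = await gather_features(names, rev_id)
    result = score(features)
    cache[key] = result
    return result


if __name__ == "__main__":
    cache: Dict[Tuple[str, int], float] = {}
    first = asyncio.run(predict("demo-model", 123, ["a", "bb"], cache))
    second = asyncio.run(predict("demo-model", 123, ["a", "bb"], cache))
    print(first, second, len(cache))
```

In this sketch the cache only helps on repeated requests for the same (model, revision) pair; the first score still pays the full fetch and compute cost, in line with the 14:36:25 remark.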
[16:47:19] going afk folks :)
[16:47:22] have a nice rest of the day
[17:20:44] Lift-Wing, Epic, Machine-Learning-Team (Active Tasks), Patch-For-Review: Set up the ml-cache clusters - https://phabricator.wikimedia.org/T302232 (lbowmaker) @elukey - in an ideal world we would like to abstract you from a lot of the underlying details of the data storage. For example, we are cur...
[23:57:05] Lift-Wing, Epic, Machine-Learning-Team (Active Tasks), Patch-For-Review: Set up the ml-cache clusters - https://phabricator.wikimedia.org/T302232 (Eevans) >>! In T302232#8030200, @elukey wrote: > @lbowmaker hi! I reviewed https://www.mediawiki.org/wiki/Platform_Engineering_Team/Data_Value_Stream/...