[06:28:35] <elukey>	 good morning :)
[06:37:40] <wikibugs>	 Machine-Learning-Team, ORES: ORES server does not start due to flask dependency conflicts - https://phabricator.wikimedia.org/T309862 (elukey) Hi! Thanks a lot for your interest in the ML team :)  ORES is our current platform but we are building a new one called "Lift Wing", that will be entirely on Kube...
[06:41:25] <elukey>	 very nice pull request - https://github.com/wikimedia/ores/pull/359
[06:41:34] <elukey>	 added Aiko and Kevin to it --^
[06:43:23] <wikibugs>	 Machine-Learning-Team, ORES: ORES gives internal error on an invalid model_info parameter - https://phabricator.wikimedia.org/T279271 (elukey) Thanks a lot for the pull request! My team is going to review it and get back to you :)
[06:52:33] <elukey>	 kevinbazira: o/ I merged your changes for articlequality isvcs
[06:53:11] <kevinbazira>	 thanks for the review elukey. will deploy soon.
[07:54:17] <elukey>	 ah lovely, I am adding the code to send a revision score event to eventgate in our model.pys
[07:54:34] <elukey>	 and I was convinced that we fetched content from the mw api directly
[07:54:44] <elukey>	 but no, we use revscoring
[07:54:45] <elukey>	 sigh
[07:56:56] <elukey>	 so the main issue is that to construct a revision-score event, I'd need rev-id metadata
[07:57:06] <elukey>	 that afaics is not available from revscoring
[07:57:16] <elukey>	 so the quick solution would be to make another HTTP call to the mw api
[07:57:21] <elukey>	 increasing the latency
[08:01:05] <elukey>	 also, using the async mwapi stuff may not be straightforward
[08:17:01] <elukey>	 the ideal scenario would be to
[08:17:11] <elukey>	 1) get the mw content via async api
[08:17:19] <elukey>	 2) pass it to revscoring for feature extraction etc..
[08:17:27] <elukey>	 3) use metadata to create the revision-score event
[08:26:27] <elukey>	 but from what I can see it seems very difficult to achieve the result
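
A minimal sketch of the three-step flow above, assuming the MW API fetch can be made async. Everything here is a hypothetical stand-in: `fetch_revision` returns canned data instead of calling the MW API, `extract_features` is not revscoring, and the event fields are illustrative rather than the real revision-score schema.

```python
import asyncio


async def fetch_revision(rev_id):
    # Hypothetical stand-in for an async HTTP call to the MW API.
    await asyncio.sleep(0)
    return {"rev_id": rev_id, "page_title": "Example", "text": "Some wikitext."}


def extract_features(text):
    # Hypothetical stand-in for revscoring feature extraction.
    return {"chars": len(text), "words": len(text.split())}


def build_event(revision, score):
    # Illustrative event shape only, not the real revision-score schema.
    return {
        "rev_id": revision["rev_id"],
        "page_title": revision["page_title"],
        "scores": score,
    }


async def preprocess(rev_id):
    revision = await fetch_revision(rev_id)        # 1) fetch content once
    features = extract_features(revision["text"])  # 2) feature extraction
    score = {"prediction": features["words"] > 2}  # placeholder for the model
    event = build_event(revision, score)           # 3) reuse the metadata
    return features, event


features, event = asyncio.run(preprocess(12345))
```

The point of the shape is that the same fetched revision feeds both the feature extraction and the event, so no second HTTP call is needed.
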
[08:47:07] <elukey>	 ok I discovered something interesting
[08:47:33] <elukey>	 mmm no nevermind
[08:51:06] <aiko>	 good morning folks :)
[08:51:28] <elukey>	 hello aiko :)
[08:58:54] <elukey>	 aiko: one question from the editquality model.py code
[08:59:01] <elukey>	 (just to understand if I got it correctly)
[08:59:25] <elukey>	 when we use the revscoring extractor, behind the scenes it makes a call to the mw api
[09:00:22] <elukey>	 from what I can see, we do
[09:00:32] <elukey>	 1) self.extractor.extract(rev_id, self.model.features) to get the list of features for the base use case
[09:00:46] <elukey>	 2) if extended_output is true, we do the same call but with a trimmed list of features
[09:01:00] <elukey>	 so, IIUC, in the latter we call the mw api twice
[09:01:05] <elukey>	 is it the right understanding?
[09:02:33] <elukey>	 I am trying to see if we can call the mw api once (via async/await), and then instruct revscoring to just use that content
[09:14:43] <aiko>	 elukey: yes, that's correct, in the latter case we call mw api twice.
[09:15:14] <aiko>	 not sure if we can add base features list and trimmed features list together and call self.extractor.extract only once
[09:18:06] <aiko>	 https://github.com/wikimedia/revscoring/blob/master/revscoring/extractors/api/extractor.py#L58 seems it is a custom class ~revscoring.dependents.dependent.Dependent
[09:30:43] <elukey>	 aiko: yeah I am wondering if we could use the same list without trimming, not sure if it is needed or not
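
A toy sketch of the "extract only once" idea discussed above. `CountingExtractor` is a hypothetical stand-in that counts one API call per `extract()`; it is not the revscoring extractor, and features are plain strings here rather than revscoring feature objects. Whether the real extractor tolerates a merged list is exactly the open question in the chat.

```python
class CountingExtractor:
    """Toy extractor: pretends each extract() call costs one MW API request."""

    def __init__(self):
        self.api_calls = 0

    def extract(self, rev_id, features):
        self.api_calls += 1
        # Fake feature values, one per requested feature, in order.
        return [f"{name}@{rev_id}" for name in features]


def extract_once(extractor, rev_id, base_features, extra_features):
    # Merge both lists, dropping duplicates but keeping order.
    combined = list(dict.fromkeys([*base_features, *extra_features]))
    values = dict(zip(combined, extractor.extract(rev_id, combined)))
    # Split the single result back into the two views the caller needs.
    base = [values[f] for f in base_features]
    extra = [values[f] for f in extra_features]
    return base, extra


extractor = CountingExtractor()
base, extra = extract_once(extractor, 42, ["a", "b", "c"], ["b", "c"])
```

If the trimmed list is always a subset of the base list, the merge is a no-op and the second `extract()` call can simply be replaced by a lookup.
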
[09:31:10] <elukey>	 the other thing that I am wondering is if we could leverage the "cache" field of the extractor to pass what is retrieved by the mw api
[09:33:16] <elukey>	 something like
[09:33:38] <elukey>	 1) we get the rev-id content/metadata from the mw api from a regular HTTP call, not from revscoring
[09:33:47] <elukey>	 2) we create the cache extractor parameter
[09:34:07] <elukey>	 3) we pass it to the revscoring extractor function, that hopefully will use it (without calling the mwapi)
[09:34:30] <elukey>	 4) data in 1) could be re-used to create the mediawiki-revisionscore event as well
[09:50:51] <wikibugs>	 Machine-Learning-Team, ORES: ORES server does not start due to flask dependency conflicts - https://phabricator.wikimedia.org/T309862 (Gethan) Thanks for all the details on Lift Wing. I will go through it.  Wish you all the best for the migration.  I may review a few more tasks in ORES or revscoring to m...
[09:51:01] <elukey>	 I found something like
[09:51:02] <elukey>	 values = solve(features, cache={revision.text: "I think it is stupid."})
[09:57:04] <elukey>	             caches : `dict`
[09:57:04] <elukey>	                 A rev_id-->cache pairs of call-specific pre-computed values to
[09:57:07] <elukey>	                 inject
[09:57:13] <elukey>	 this is the pydoc for the extractor
[09:57:29] <elukey>	 it doesn't really explain what the cache should look like
[09:57:41] <elukey>	 but IIUC it should be something that replaces the mw api call
[09:58:09] <elukey>	 so, if I am right, we could even have a separate transformer that calls the mw api and encodes its result into a cache dict
[09:58:20] <elukey>	 it would simplify our life a lot
[09:58:23] <elukey>	 does it make any sense?
[10:22:12] <aiko>	 yep, it makes sense. that would be nice if we can use the cache field
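
A toy illustration of the cache-injection idea, under the assumption that a cache maps a datasource to a pre-computed value and short-circuits the fetch. The `solve` below is a hypothetical, simplified stand-in written for this sketch; it is not revscoring's `solve` and says nothing about how the extractor's `caches` parameter is actually keyed.

```python
api_calls = []


def revision_text():
    # Toy datasource: pretend this performs the MW API request.
    api_calls.append("fetch")
    return "fetched from the API"


def word_count(text):
    # Toy feature computed from the datasource value.
    return len(text.split())


def solve(feature, datasource, cache=None):
    # Use the injected value when present, otherwise fall back to fetching.
    cache = {} if cache is None else cache
    if datasource in cache:
        text = cache[datasource]
    else:
        text = datasource()
    return feature(text)


cached_value = solve(word_count, revision_text,
                     cache={revision_text: "I think it is stupid."})
calls_after_cached = list(api_calls)

uncached_value = solve(word_count, revision_text)
calls_after_uncached = list(api_calls)
```

With the cache supplied, the toy datasource is never invoked, which is the behaviour hoped for from the real extractor.
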
[10:31:19] <elukey>	 I wish there was an example
[10:38:54] * elukey lunch!
[13:08:34] <aiko>	 elukey: one question: you mentioned we currently have some restrictions for memory/cpu in production. I wonder what are they specifically? how many cpu and how large memory for each pod?
[13:11:41] <elukey>	 aiko: so in k8s we don't assign cpus directly, see https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
[13:12:19] <elukey>	 I am still not 100% familiar with how cpus are assigned to pods, but basically it is cpu time (shared among pods)
[13:12:32] <elukey>	 there is a request (when the pod is created) and a limit (maximum amount)
[13:12:49] <elukey>	 IIRC we use kserve's defaults, lemme see
[13:15:25] <elukey>	 so the kserve container 
[13:15:28] <elukey>	     Limits:
[13:15:28] <elukey>	       cpu:     1
[13:15:28] <elukey>	       memory:  2Gi
[13:15:28] <elukey>	     Requests:
[13:15:28] <elukey>	       cpu:     1
[13:15:30] <elukey>	       memory:  2Gi
[13:15:33] <elukey>	 aiko: --^
[13:15:43] <elukey>	 these can be changed if needed
[13:17:19] <aiko>	 elukey: I see.. thanks! :)
[14:52:16] <kevinbazira>	 starting deployment for svwiki & trwiki articlequality isvcs
[14:53:24] <elukey>	 ack!
[14:53:36] <elukey>	 the work on revscoring's cache for the extractor is a bit of a mess
[14:53:38] <elukey>	 sigh
[15:00:07] <kevinbazira>	 yep ... we are bound to run into hairy challenges as we work towards feature parity
[15:00:10] <kevinbazira>	 both eqiad and codfw deployments have been completed successfully.
[15:00:24] <elukey>	 super
[15:00:42] <kevinbazira>	 checking pods now ...
[15:00:49] <elukey>	 kevinbazira: the main issue with revscoring atm is that we can't use http async conns to the mw api
[15:00:57] <elukey>	 unless we change something in it
[15:03:07] <elukey>	 the code that runs on the pod now is totally blocking, and it doesn't play well with the kserve architecture
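
One possible stopgap, sketched under the assumption that request handlers run on a single asyncio event loop: push the blocking call onto a worker thread with `asyncio.to_thread` (Python 3.9+) so the loop stays free for other requests. `blocking_fetch` is a hypothetical stand-in for the blocking revscoring/MW API path; this does not make revscoring itself async.

```python
import asyncio
import time


def blocking_fetch(rev_id):
    # Hypothetical stand-in for a blocking HTTP call made inside revscoring.
    time.sleep(0.05)
    return f"content-{rev_id}"


async def handle(rev_id):
    # Run the blocking call in a worker thread instead of on the event loop.
    return await asyncio.to_thread(blocking_fetch, rev_id)


async def main():
    # Several requests in flight at once; gather keeps the input order.
    return await asyncio.gather(*(handle(r) for r in (1, 2, 3)))


results = asyncio.run(main())
```

Threads only hide the blocking; a real async HTTP client would still be the cleaner fix discussed above.
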
[15:06:35] <wikibugs>	 Lift-Wing, Epic, Machine-Learning-Team (Active Tasks), Patch-For-Review: Create the ml-serve-staging k8s cluster - https://phabricator.wikimedia.org/T302195 (elukey) Completed the basic networking work (calico, eventrouter, coredns) + BGP config.  Next step: https://wikitech.wikimedia.org/wiki/Ku...
[15:08:05] <kevinbazira>	 OMW ... we'll find ourselves changing so many little things on revscoring. 
[15:08:05] <kevinbazira>	 all new pods are up and running. 
[15:08:05] <kevinbazira>	 NAME                                                              READY   STATUS    RESTARTS   AGE
[15:08:05] <kevinbazira>	 svwiki-articlequality-predictor-default-k8rwc-deployment-5rgfp4   3/3     Running   0          5m52s
[15:08:05] <kevinbazira>	 trwiki-articlequality-predictor-default-lkmb5-deployment-5jg9vn   3/3     Running   0          5m50s
[15:08:15] <wikibugs>	 Lift-Wing, Machine-Learning-Team (Active Tasks): Test async preprocess on kserve - https://phabricator.wikimedia.org/T309623 (elukey)
[15:12:00] <wikibugs>	 Lift-Wing, Machine-Learning-Team (Active Tasks): Test async preprocess on kserve - https://phabricator.wikimedia.org/T309623 (elukey) While working on another task, I had to check some details of how revscoring is currently handling http connections to the mw api. For T301878 it would be nice to make a s...
[15:12:08] <elukey>	 kevinbazira: yeah but this one seems very big, I added a note to the related task
[15:16:12] <wikibugs>	 Lift-Wing, Epic, Machine-Learning-Team (Active Tasks): Send score to eventgate when requested - https://phabricator.wikimedia.org/T301878 (elukey) @Ottomata Hi! I am getting back to this task, and after some preliminary checks it seems that calling the eventgate endpoint is the best way to go with ks...
[15:22:32] <elukey>	 all right logging off for today o/
[15:22:33] <elukey>	 away afk!