[06:27:12] (CR) Hashar: [C: -1] Set up production and test images for the recommendation-api migration (5 comments) [research/recommendation-api] - https://gerrit.wikimedia.org/r/932810 (https://phabricator.wikimedia.org/T339890) (owner: Kevin Bazira)
[08:50:33] Machine-Learning-Team: Define SLI/SLO for Lift Wing - https://phabricator.wikimedia.org/T327620 (elukey) >>! In T327620#9015842, @klausman wrote: > https://grafana.wikimedia.org/goto/x7S0HpjVk?orgId=1 I've started an SLO dashboard here. It only has one metric (Latency) so far, but it's a start. Please also...
[08:55:10] (PS44) Kevin Bazira: Set up production and test images for the recommendation-api migration [research/recommendation-api] - https://gerrit.wikimedia.org/r/932810 (https://phabricator.wikimedia.org/T339890)
[09:05:50] (CR) Kevin Bazira: Set up production and test images for the recommendation-api migration (5 comments) [research/recommendation-api] - https://gerrit.wikimedia.org/r/932810 (https://phabricator.wikimedia.org/T339890) (owner: Kevin Bazira)
[09:15:12] Machine-Learning-Team: Define SLI/SLO for Lift Wing - https://phabricator.wikimedia.org/T327620 (klausman) I am working on Grafana/Thanos directly for now because it's a shorter change-try loop for finding the right metrics than doing it with Grizzly. Even with templating, we still need specific metrics...
[09:37:25] o/
[09:41:49] we'll need to either deploy simplewiki or point it to enwiki models (if that is what is being used). Until now I haven't found models for simplewiki anywhere; my assumption/conclusion is that it uses the en models (although I don't think that would make much sense)
[09:48:07] (CR) Hashar: [C: +1] "Excellent Kevin :]" [research/recommendation-api] - https://gerrit.wikimedia.org/r/932810 (https://phabricator.wikimedia.org/T339890) (owner: Kevin Bazira)
[09:55:27] (CR) Kevin Bazira: [C: +2] "Great! Thanks a lot to everyone for the reviews :)" [research/recommendation-api] - https://gerrit.wikimedia.org/r/932810 (https://phabricator.wikimedia.org/T339890) (owner: Kevin Bazira)
[09:58:14] isaranto: +1 to use the enwiki model again
[09:58:26] (Merged) jenkins-bot: Set up production and test images for the recommendation-api migration [research/recommendation-api] - https://gerrit.wikimedia.org/r/932810 (https://phabricator.wikimedia.org/T339890) (owner: Kevin Bazira)
[09:58:42] but do we need it? I mean, is it already used in the mw extension?
[10:07:36] yes, it is enabled in simplewiki https://simple.wikipedia.org/wiki/Special:RecentChanges?hidebots=1&hidecategorization=1&hideWikibase=1&limit=50&days=7&urlversion=2
[10:09:40] Machine-Learning-Team, Patch-For-Review: [ores-legacy] Clienterror is returned in some responses - https://phabricator.wikimedia.org/T341479 (elukey) After some tries I figured out what the issue is:
`
- name: inference
  port: 6031
  service: inference
  timeout: "10s"
`
The ten seconds are clea...
[10:10:12] actually one can see it here https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/ext-ORES.php#L36
[10:12:35] Machine-Learning-Team, Patch-For-Review: [ores-legacy] Clienterror is returned in some responses - https://phabricator.wikimedia.org/T341479 (isarantopoulos) Ouch! Nice catch, I couldn't figure it out. It makes sense because big responses may take even 20s...
[10:12:45] isaranto: ok for the enwiki model then!
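To make the timeout fix concrete: a minimal sketch of the kind of mesh values change being discussed, assuming the fix simply raises the per-service proxy timeout (the actual change is the puppet patch linked just below; the value here is illustrative, not the deployed one):

```
# Hypothetical sketch of the mesh/tls-proxy values block quoted in T341479.
# The structure matches the snippet above; the new timeout value is an
# illustrative assumption, not the one in the real patch.
- name: inference
  port: 6031
  service: inference
  timeout: "60s"  # was "10s", which large batch requests exceeded
```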
[10:12:48] also, https://gerrit.wikimedia.org/r/c/operations/puppet/+/938815 [10:12:52] * elukey cries in a corner
[10:13:03] spent an hour trying to figure out where the 10s timeout was
[10:13:11] and I set it in puppet myself
[10:14:12] isaranto: the tls proxy is the source of those text/plain 50x, we need to take it into account as well
[10:14:24] thanks Luca! saved the day!
[10:15:14] let's see if it fixes things; it is sad that we have to set a higher timeout, but it is also true that we send a lot of concurrent requests at once in this use case
[10:18:38] there is also another piece of the puzzle to figure out
[10:18:50] knative has its own way of load balancing, through the activator pods
[10:19:00] https://knative.dev/docs/serving/autoscaling/concurrency/
[10:19:23] kserve offers a way to set it, via "container_concurrency"
[10:19:25] see https://github.com/kserve/kserve/blob/61c9bd334ae9766ffd1f2bf020764bf453cab54c/python/kserve/docs/V1beta1PredictorSpec.md?plain=1#L12C173-L12C229
[10:19:31] that of course we don't set :D
[10:19:50] and by default it is like 100, so super high
[10:20:48] so IIUC we risk overloading a pod since, by default, knative considers that it can handle 100 concurrent requests
[10:22:22] mm but from https://github.com/kserve/kserve/issues/338 it seems that kserve has other defaults
[10:23:14] https://github.com/kserve/kserve/blob/61c9bd334ae9766ffd1f2bf020764bf453cab54c/docs/samples/v1beta1/torchserve/autoscaling/README.md?plain=1#L22
[10:23:19] some explanations in here
[10:25:57] elukey: How would that interact with the replicas number?
[10:26:30] 1 concurrent req per replica seems to be the default, if I understand the issue/feature request correctly
[10:28:36] not sure, I don't see the autoscaling.knative.dev/target annotation in any pod, it may be old
[10:29:36] knative is responsible for increasing/decreasing the pods with its autoscaler component, which IIUC uses "concurrency" as the metric to judge how to scale up/down
[10:29:59] the activator sits between the istio gateway and the pod, we can disable it as well
[10:30:21] I see. I'm a bit confused about .../metric vs .../target
[10:30:54] Ah, target is the utilization goal
[10:32:24] and there is also https://knative.dev/docs/serving/load-balancing/target-burst-capacity/#setting-the-target-burst-capacity
[10:32:28] Would default annotation values be visible even if we don't set them?
[10:32:34] the activator acts as a load balancer and buffers requests, if needed
[10:33:04] klausman: if you don't set anything the default is applied, in theory
[10:33:42] I just wonder if that value (70) would be visible however you query the cluster/pods
[10:33:59] 70?
[10:34:06] The target value
[10:34:13] where did you see it?
[10:34:20] https://knative.dev/docs/serving/autoscaling/concurrency/#target-utilization
[10:35:01] dang, that's the per-container utilization target
[10:35:12] ctrl-f led me astray once again
[10:35:15] it is another thing, related though
[10:35:41] The target burst capacity of 200 seems high for ORES/revscoring services as well
[10:35:44] ah interesting, autoscaling.knative.dev/target is unlimited by default
[10:36:06] even worse :D
[10:36:38] So many performance-critical knobs to adjust
[10:37:45] I think that we have two in this case (rough sketch below):
[10:38:04] - autoscaling.knative.dev/target (soft limit), that tells the autoscaler when to raise the number of instances etc..
[10:38:19] - container_concurrency (hard limit), that indicates when to buffer requests
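Where those two knobs live on a plain Knative Service, per the autoscaling docs linked above (a minimal sketch; the service name and the numeric values are illustrative, not a recommendation):

```
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-model              # illustrative name
spec:
  template:
    metadata:
      annotations:
        # soft limit: per-replica concurrency the autoscaler aims for
        autoscaling.knative.dev/metric: "concurrency"
        autoscaling.knative.dev/target: "5"
    spec:
      # hard limit: requests beyond this are buffered (queue-proxy/activator)
      containerConcurrency: 10
```

On Lift Wing these would be set through the KServe InferenceService spec rather than on a raw Knative Service; see the isvc sketch further down.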
[10:38:41] not super easy to find the best ranges for all our model servers
[10:38:55] but it is something that we'll need to add/test before reaching production
[10:39:06] especially with the still somewhat limited amount of traffic we've seen so far.
[10:40:39] we have some data from Aiko's load tests, but we can start conservative and allow more pods to scale (if we have capacity)
[10:40:42] then we tune
[10:41:07] for RR we could use something like 5/10 in theory
[10:44:33] or maybe 3/5 as a starter
[10:44:47] (well at least for ORES pods)
[10:48:47] * isaranto lunch
[11:09:30] * elukey lunch!
[11:31:51] * klausman lunch (and errand) as well
[12:46:08] (CR) Elukey: [C: +1] ores-legacy: add error response for v1 requests (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/937960 (https://phabricator.wikimedia.org/T341486) (owner: Ilias Sarantopoulos)
[12:51:57] klausman, isaranto - as FYI the tls-proxy settings for lift wing (https://gerrit.wikimedia.org/r/c/operations/puppet/+/938815) are set in puppet since they are used for the mw appservers as well
[12:52:23] ack
[12:52:23] the change propagates to Lift Wing's ores-legacy pods since some helmfile values are deployed on deploy1002
[12:52:32] and we pick them up with a helmfile sync
[12:52:34] (doing it now)
[12:53:34] (please review it when you have time so more people know etc..)
[12:53:53] for clarity, the tls-proxy mentioned above is the one created by the "mesh" module that serviceops offers
[13:05:09] Good morning all!
[13:08:03] morning!
[13:08:20] isaranto: posted https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/938820/3
[13:08:25] hopefully it will improve things a bit
[13:08:38] I still see the client error timeout for ores-legacy
[13:08:57] taking a look
[13:08:59] but it kinda makes sense, the request is really huge
[13:14:11] Machine-Learning-Team, Patch-For-Review: [ores-legacy] Clienterror is returned in some responses - https://phabricator.wikimedia.org/T341479 (elukey) The timeouts improved, but the original request (stated in the task's description) is still huge and leads to timeouts.
[13:36:34] I also filed another change to add more scaling options to the various isvcs
[13:36:42] in theory we should have enough capacity
[13:38:07] * elukey afk for a quick errand
[13:43:17] I added this https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/938856
[13:43:17] The idea is to not copy the same files into different directories in swift and also to explicitly define that the server is using another model (see the sketch below). Alternatively I was thinking that we could redirect requests for testwiki/simplewiki towards enwiki, but I'm not sure how we would deal with host headers in that case
[13:46:50] Machine-Learning-Team: Define SLI/SLO for Lift Wing - https://phabricator.wikimedia.org/T327620 (klausman) I made some progress on the experimental dashboard (https://grafana.wikimedia.org/goto/VSolQfj4k?orgId=1). I now have a better mental model/grasp of request count (and 200 vs non-200). The current setup...
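A heavily hedged sketch of the "explicitly define the model" approach from change 938856. The key names and the storage path below are assumptions for illustration only; the real chart schema and values are in the linked diff. The idea is that the simplewiki server points its storage initializer at the existing enwiki model artifact instead of a copied file:

```
# Hypothetical values sketch; key names and the exact swift/s3 path are
# assumptions, not the real deployment-charts schema.
inference_services:
  - name: simplewiki-damaging
    custom_env:
      - name: STORAGE_URI                       # KServe storage-initializer convention
        value: s3://wmf-ml-models/damaging/enwiki/  # reuse the enwiki model artifact
```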
[13:58:07] (CR) Ilias Sarantopoulos: [C: +2] ores-legacy: add error response for v1 requests [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/937960 (https://phabricator.wikimedia.org/T341486) (owner: Ilias Sarantopoulos)
[14:01:03] (Merged) jenkins-bot: ores-legacy: add error response for v1 requests [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/937960 (https://phabricator.wikimedia.org/T341486) (owner: Ilias Sarantopoulos)
[14:09:35] Machine-Learning-Team, ops-codfw: ManagementSSHDown - https://phabricator.wikimedia.org/T341648 (Jhancock.wm) Open→Resolved Replaced the iDRAC card and CMOS battery. Updated the iDRAC IP info. The BAT0002 alert has cleared and the server is reachable by ssh
[14:13:20] elukey: I can confirm the alert for 2003 is gone from AM
[14:15:34] klausman: ack, can you repool the node?
[14:19:39] will do
[14:21:44] elukey: does it have to go inactive->no->yes or would going direct work (I am going to go through "no" this time, to be sure, but I wondered)
[14:22:38] you can go to yes directly
[14:22:47] it is a flag in confd basically
[14:24:24] Alright, ack. I know it's a bit more complex when building a new pybal service, but I wasn't sure in this case.
[14:25:03] No alerts firing with pooled=no, so switched to =yes just now
[14:38:01] Machine-Learning-Team: [ores-legacy] Clienterror is returned in some responses - https://phabricator.wikimedia.org/T341479 (isarantopoulos) As I am checking now, the issue has been resolved. However some of the underlying requests are getting errors related to the `mwapi` as stated in ht...
[14:39:15] (PS2) Ilias Sarantopoulos: ores-legacy: fix error due to response content type [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/938266 (https://phabricator.wikimedia.org/T341479)
[14:43:09] (CR) Elukey: ores-legacy: fix error due to response content type (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/938266 (https://phabricator.wikimedia.org/T341479) (owner: Ilias Sarantopoulos)
[14:55:36] elukey: I was able to get a response for the big request for ores-legacy that we have in the task
[14:55:46] (actually tested a couple of times)
[15:03:31] (CR) Ilias Sarantopoulos: ores-legacy: fix error due to response content type (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/938266 (https://phabricator.wikimedia.org/T341479) (owner: Ilias Sarantopoulos)
[15:06:38] isaranto: I tried again as well, now it works, yes!
[15:07:15] (CR) Elukey: [C: +1] ores-legacy: fix error due to response content type (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/938266 (https://phabricator.wikimedia.org/T341479) (owner: Ilias Sarantopoulos)
[15:07:23] it seems that if we get errors from mwapi this will happen again
[15:07:40] I'll deploy the new changes and we can run some more tests (load tests)
[15:08:11] (CR) Ilias Sarantopoulos: [C: +2] ores-legacy: fix error due to response content type [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/938266 (https://phabricator.wikimedia.org/T341479) (owner: Ilias Sarantopoulos)
[15:09:03] (Merged) jenkins-bot: ores-legacy: fix error due to response content type [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/938266 (https://phabricator.wikimedia.org/T341479) (owner: Ilias Sarantopoulos)
[15:11:43] elukey: regarding autoscaling in knative: IIUC from the docs, setting `autoscaling.knative.dev/target` to 3 would mean that an average of 3 concurrent requests per replica would be the target at any given time. correct?
[15:12:20] isaranto: this is my understanding yes, it is a soft limit (mostly for the autoscaler)
[15:12:36] the concurrency setting is a hard limit, after that the activator knative pods start queueing
[15:15:35] yes, however the concurrency setting refers to concurrent requests https://kserve.github.io/website/0.10/sdk_docs/docs/V1beta1ComponentExtensionSpec/ where concurrent requests != rps (?)
[15:15:58] I am actually discussing/thinking out loud to understand better
[15:21:49] yes yes definitely
[15:22:11] I think concurrent requests reflects more how many clients we want at the same time for each model server
[15:22:42] in this case, no more than 5 for each pod, and after that queueing
[15:22:48] same thing for revscoring
[15:23:19] with rps it may be more intuitive; I think that these metrics will need to be included in our tests in staging before moving to prod
[15:23:48] thanks for the reviews, testing the values in staging :)
[15:24:32] isaranto: I am still not 100% sure if this will solve our problems, but the current settings tend to overload a pod in my opinion
[15:28:37] also target utilization seems nice https://knative.dev/docs/serving/autoscaling/concurrency/#target-utilization
[15:30:00] yep yep, a lot of options, we need to start experimenting with those
[15:31:47] starting with
[15:31:47] unknown field "container_concurrency" in io.kserve.serving.v1beta1.InferenceService.spec.predictor
[15:31:50] sigh
[15:33:08] I think it was containerConcurrency
[15:33:29] all I found was this https://kserve.github.io/website/0.10/sdk_docs/docs/V1beta1ComponentExtensionSpec/
[15:33:47] yes yes camelCase
[15:33:53] * elukey cries in a corner
[15:33:57] sorry folks, fixing
[15:35:06] IIUC if you set it through kserve it is with an underscore, but if you set it directly on knative it is camelCase
[15:35:29] or it is just an error in the docs
[15:38:32] I tested it now, we need to use the isvc spec, so camelCase
[15:40:49] ack
[15:41:15] elukey: if you find some time let me know if you agree/disagree with this https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/938856
[15:41:41] isaranto: 2 min and I'll check it!
[15:41:58] can even be tomorrow. I was just thinking if we should redirect requests or replicate model servers (with the downside of using more resources :( )
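To pin down the spelling question above: `container_concurrency` (snake_case) is the attribute name in the Python SDK docs linked, while the isvc YAML uses camelCase, which is the form that was found to work. A minimal hedged sketch of the isvc shape being discussed (service name, values, and scaling bounds are illustrative; model-specific predictor fields omitted):

```
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: enwiki-damaging                      # illustrative name
  annotations:
    autoscaling.knative.dev/target: "3"      # soft limit for the autoscaler
spec:
  predictor:
    containerConcurrency: 5                  # hard limit; must be camelCase in YAML
    minReplicas: 1
    maxReplicas: 4                           # illustrative scaling bounds
```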
[15:43:44] * isaranto feels helpless and sad and angry about forest fires (once again)
[15:43:52] 😄
[15:47:23] :(
[15:47:52] Yeah, we've had forest fire warnings for all of the last 8 weeks, but fortunately no major outbreaks
[15:52:03] things started getting ugly in Greece the last couple of days, and today close to Athens
[15:53:05] https://ores-legacy.wikimedia.org/scores/enwiki
[15:53:05] https://ores-legacy.wikimedia.org/v1/scores/enwiki
[15:53:41] the message works, however I just thought a link to some docs would be even better
[15:53:57] stay safe, Ilias, as much as it is feasible.
[15:54:47] yep --^
[15:54:52] isaranto: nice!
[15:54:56] it is a good start
[15:55:18] buuuut I get another error, from the other patch, which I should have caught
[15:55:43] `UnboundLocalError: local variable 'response_json' referenced before assignment`. On it!
[15:56:50] ah snap
[15:56:54] missed it in the code review
[16:02:42] but still I'm not getting why it fails, as I'm running it fine on statbox
[16:03:01] I mean why the underlying request fails.. 🤔
[16:05:37] (PS1) Ilias Sarantopoulos: ores-legacy: log error message instead of response_json [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/938885 (https://phabricator.wikimedia.org/T341479)
[16:06:48] (CR) Elukey: [C: +1] ores-legacy: log error message instead of response_json [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/938885 (https://phabricator.wikimedia.org/T341479) (owner: Ilias Sarantopoulos)
[16:10:44] (CR) Ilias Sarantopoulos: [C: +2] ores-legacy: log error message instead of response_json [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/938885 (https://phabricator.wikimedia.org/T341479) (owner: Ilias Sarantopoulos)
[16:11:23] merging and deploying; this will not resolve the underlying issue but just enable correct logging again
[16:11:35] (Merged) jenkins-bot: ores-legacy: log error message instead of response_json [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/938885 (https://phabricator.wikimedia.org/T341479) (owner: Ilias Sarantopoulos)
[16:14:43] isaranto: reviewed, looks good, I just asked for a .fixture test so we see a diff etc..
[16:18:53] going afk for today folks!
[16:19:00] have a nice rest of the day
[16:20:14] ack, done!
[16:20:21] ciao Luca, cu tomorrow!
[16:29:36] Machine-Learning-Team, API Platform, Anti-Harassment, Content-Transform-Team, and 19 others: Migrate PipelineLib repos to GitLab - https://phabricator.wikimedia.org/T332953 (TBurmeister)
[16:31:58] ok I fixed ores-legacy, but there are a lot of transient errors related to the mediawiki api coming from lift wing
[16:38:54] Machine-Learning-Team: Define SLI/SLO for Lift Wing - https://phabricator.wikimedia.org/T327620 (klausman) I've now also managed to add some latency bucketing stuff. Not 100% sure yet if it is what we want, but in any case, it's progress.
[16:39:08] Now heading out as well \o
[16:42:58] Machine-Learning-Team: [ores-legacy] Clienterror is returned in some responses - https://phabricator.wikimedia.org/T341479 (isarantopoulos) The issue with the request mentioned in the task description is not always happening, as sometimes we get a response with no errors. However, now that we correctly read r...
[16:44:48] bye Tobias! heading out as well!
o/ [18:32:18] 10Machine-Learning-Team, 10ORES, 10MW-1.41-notes (1.41.0-wmf.11; 2023-05-30), 10Wikimedia-production-error: PHP Notice: Trying to access array offset on value of type null (in SpecialORESModels) - https://phabricator.wikimedia.org/T329304 (10Umherirrender) 05Open→03Resolved [19:23:46] 10Machine-Learning-Team, 10Goal: Goal: Lift Wing announced at MVP to the public - https://phabricator.wikimedia.org/T341703 (10calbon) [19:23:55] 10Machine-Learning-Team, 10Goal: Goal: Zero traffic on bare metal ORES servers - https://phabricator.wikimedia.org/T341696 (10calbon) [19:24:06] 10Machine-Learning-Team, 10Goal: Goal: Defined and measured SLO for every production service - https://phabricator.wikimedia.org/T341693 (10calbon) [19:24:14] 10Machine-Learning-Team, 10Goal: Goal: Support WME migration to Lift Wing - https://phabricator.wikimedia.org/T341698 (10calbon) [19:24:24] 10Machine-Learning-Team, 10Goal: Goal: Content Recommendation API migration completed - https://phabricator.wikimedia.org/T341704 (10calbon) [19:24:30] 10Machine-Learning-Team, 10Goal: Goal: Order 2-4 GPU for Lift Wing and Statbox - https://phabricator.wikimedia.org/T341699 (10calbon) [19:24:41] 10Machine-Learning-Team, 10Goal: Stretch Goal: Swagger UI implemented for every production inference service - https://phabricator.wikimedia.org/T341701 (10calbon) [19:24:51] 10Machine-Learning-Team, 10Goal: Stretch Goal: Inference batching is tested to our satisfaction - https://phabricator.wikimedia.org/T341702 (10calbon) [19:25:02] 10Machine-Learning-Team, 10Goal: Stretch Goal: Hosting a production ready version of an LLM - https://phabricator.wikimedia.org/T341695 (10calbon) [19:27:53] 10Machine-Learning-Team, 10Goal: Lift Wing announced at MVP to the public - https://phabricator.wikimedia.org/T341703 (10calbon) [19:27:59] 10Machine-Learning-Team, 10Goal: Zero traffic on bare metal ORES servers - https://phabricator.wikimedia.org/T341696 (10calbon) [19:28:04] 10Machine-Learning-Team, 10Goal: Defined and measured SLO for every production service - https://phabricator.wikimedia.org/T341693 (10calbon) [19:28:11] 10Machine-Learning-Team, 10Goal: Content Recommendation API migration completed - https://phabricator.wikimedia.org/T341704 (10calbon) [19:28:18] 10Machine-Learning-Team, 10Goal: Support WME migration to Lift Wing - https://phabricator.wikimedia.org/T341698 (10calbon) [19:28:22] 10Machine-Learning-Team, 10Goal: Order 2-4 GPU for Lift Wing and Statbox - https://phabricator.wikimedia.org/T341699 (10calbon) [19:28:29] 10Machine-Learning-Team, 10Goal: Stretch: Swagger UI implemented for every production inference service - https://phabricator.wikimedia.org/T341701 (10calbon) [19:28:35] 10Machine-Learning-Team, 10Goal: Stretch: Inference batching is tested to our satisfaction - https://phabricator.wikimedia.org/T341702 (10calbon) [19:28:50] 10Machine-Learning-Team, 10Goal: Stretch: Hosting a production ready version of an LLM - https://phabricator.wikimedia.org/T341695 (10calbon)