[08:06:49] Lift-Wing, Machine-Learning-Team (Active Tasks), Patch-For-Review: Editquality Transformer - https://phabricator.wikimedia.org/T298943 (elukey) @hashar do we need to reload CI or similar for https://gerrit.wikimedia.org/r/c/machinelearning/liftwing/inference-services/+/755954 ? I see the new job name...
[08:23:06] Machine-Learning-Team, CFSSL-PKI, Infrastructure-Foundations, serviceops, Patch-For-Review: Extend cfssl-issuer to return the Root CA certificate - https://phabricator.wikimedia.org/T299906 (JMeybohm) Open→Resolved Updated cfssl-issuer is deployed to all clusters where it is currently...
[08:46:17] * elukey afk for a bit! bbl
[10:11:44] * elukey back
[11:31:58] * elukey lunch!
[12:55:21] new ORES dashboard: https://logstash.wikimedia.org/app/dashboards#/view/ORES
[12:55:36] it seems to contain more info, lemme know if you like it
[14:48:57] I am reading https://tech.olx.com/demystifying-istio-circuit-breaking-27a69cac2ce4 and with egress gw it seems very easy to add constraints
[14:49:18] we can vary them per destination (like mw api etc..)
[14:49:21] really nice
[14:53:21] Would this also be an option for ingress? Like the abusive-bot case.
[14:56:06] I think it is more for DestinationRules, so istio gw -> something (pods, endpoints, etc..)
[14:56:23] in theory the destination rules on ingress for the ml pods are added by knative
[14:56:41] there is surely a way to limit incoming traffic though
[14:57:56] Ack
[14:59:21] do you want to take a look at it as part of the api-gw integration task?
[15:05:53] klausman: --^
[15:06:19] It's definitely something that either the API GW or our ingress will have to do, so yes
[15:06:27] (possibly both)
[15:13:37] makes sense, let's open a task to track it
[15:21:46] done
[15:21:48] Lift-Wing, Machine-Learning-Team (Active Tasks): Explore ingress filtering for Lift Wing - https://phabricator.wikimedia.org/T300259 (klausman)
[15:51:12] o/
[15:51:41] o/
[15:54:13] https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/757675 should be enough to change the egress gw's settings
[15:56:56] Lift-Wing, Machine-Learning-Team (Active Tasks), Patch-For-Review: Add an envoy proxy sidecar to Kserve inference pods - https://phabricator.wikimedia.org/T294414 (elukey) Really nice document to read: https://tech.olx.com/demystifying-istio-circuit-breaking-27a69cac2ce4
[15:58:57] accraze: https://logstash.wikimedia.org/app/dashboards#/view/ORES
[15:59:52] we can now filter for UA/URI/etc..
[15:59:58] and there are breakdowns
[16:00:03] woooow!!
[16:00:24] you were right, most calls are from changeprop
[16:01:04] a lot of UAs are generic though, and not indicating what they are doing
[16:01:04] nice work elukey!
[16:01:07] <3
[16:01:26] without change-prop the traffic would decrease a lot
[16:01:33] but of course we wouldn't have precache
[16:01:43] good point
[16:05:04] I am still not entirely convinced that we need it
[16:05:10] we'll see
[16:05:46] I have updated the ORES logging PR with Aaron's suggestions
[16:06:04] after that we should have more info about what ORES does (in theory)
[16:34:56] forgot to mention, since the mirroring is broken from github -> gerrit, we'll need to manually mirror all the model repos before doing the next ORES deploy
[16:35:46] (code changes + lfs)
[16:36:11] i can't wait not to use git lfs lol
[16:36:17] uffff
[16:36:46] i'm gonna work on it today and see if i can mirror articlequality to get the new nlwiki model on gerrit
[16:39:05] thanks!
[16:41:24] i think we just need to do `git push gerrit master`
[16:41:48] and then `git lfs push gerrit master`
[16:42:03] for each of the model repos
[16:43:22] * elukey nods
[16:49:39] Lift-Wing, Machine-Learning-Team: Return meaningful HTTP responses in Lift Wing's revscoring backends - https://phabricator.wikimedia.org/T300270 (elukey)
[16:49:57] * elukey afk for a bit
[17:08:45] Lift-Wing, Machine-Learning-Team: Return meaningful HTTP responses in Lift Wing's revscoring backends - https://phabricator.wikimedia.org/T300270 (ACraze) It looks like we can do this using `tornado.web.HTTPError` and add the status code and message, similar to how it's done here: https://github.com/kser...
[17:59:36] elukey that UA/URI info is amazing
[18:03:21] chrisalbon: \o/
[18:11:37] going afk, have a nice (rest-of-the) day folks!
[18:12:44] Have a great evening!
[18:13:06] see ya elukey
[20:30:53] ORES, artificial-intelligence, articlequality-modeling, Machine-Learning-Team (Active Tasks): ORES deployment - Winter 2022 - nlwiki articlequality model - https://phabricator.wikimedia.org/T300195 (ACraze)
[20:35:40] ORES, artificial-intelligence, articlequality-modeling, Machine-Learning-Team (Active Tasks): ORES deployment - Winter 2022 - nlwiki articlequality model - https://phabricator.wikimedia.org/T300195 (ACraze) Open→In progress Starting the initial work to deploy the article quality model for...
[20:49:30] cool, i was able to get the new nlwiki articlequality mirrored over to gerrit
[20:49:42] just need to make sure the other repos are still in sync now
[21:45:59] all model repos are in sync, should be able to build the deploy repo now
[21:51:20] ORES, artificial-intelligence, articlequality-modeling, Machine-Learning-Team (Active Tasks): ORES deployment - Winter 2022 - nlwiki articlequality model - https://phabricator.wikimedia.org/T300195 (ACraze) Confirming the model repos have all been manually mirrored. The nlwiki articlequality mod...
[22:18:52] ORES, artificial-intelligence, articlequality-modeling, Machine-Learning-Team (Active Tasks): ORES deployment - Winter 2022 - nlwiki articlequality model - https://phabricator.wikimedia.org/T300195 (Halfak) Great! I'll get everything updated in the deploy patchset and ready for you tomorrow. O...
[22:28:48] ORES, artificial-intelligence, articlequality-modeling, Machine-Learning-Team (Active Tasks): ORES deployment - Winter 2022 - nlwiki articlequality model - https://phabricator.wikimedia.org/T300195 (ACraze) I think there is still some more work to be done for observability, so maybe leave it for...
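
Note on the manual mirroring discussed at 16:41-16:42: the two commands mentioned there (`git push gerrit master`, then `git lfs push gerrit master`, for each model repo) could be wrapped in a small loop. This is only a rough sketch, not the script the team used. The remote name `gerrit` and the branch `master` come from the chat; the `DRY_RUN` switch, the function names, and the repo paths passed in are hypothetical. It defaults to printing the commands rather than running them.

```shell
#!/bin/sh
# Sketch: mirror each model repo to the "gerrit" remote, code first, then LFS objects.
# DRY_RUN, run(), mirror_repo() and mirror_all() are illustrative names, not from the log.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Push the git history, and only if that succeeds push the LFS objects.
mirror_repo() {
    run git -C "$1" push gerrit master &&
        run git -C "$1" lfs push gerrit master
}

# Mirror every repo directory given as an argument; stop at the first failure.
mirror_all() {
    for dir in "$@"; do
        mirror_repo "$dir" || return 1
    done
}
```

For example, `mirror_all articlequality editquality` (hypothetical local checkout paths) prints the four commands in order; setting `DRY_RUN=0` would run them for real, assuming each checkout already has a `gerrit` remote configured.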