[06:16:21] Machine-Learning-Team, OKR-Work, Patch-For-Review: Enable EmptyDir (/dev/shm) support for KServe InferenceServices to unblock NCCL-based tensor parallelism - https://phabricator.wikimedia.org/T421105#11752440 (kevinbazira) Open→Resolved a:klausman
[08:33:52] Machine-Learning-Team, Patch-For-Review: Experiment with new kserve version on stagin - https://phabricator.wikimedia.org/T419722#11752614 (DPogorzelski-WMF) Latest knative supported by kserve 0.17 seems to require a more recent kubernetes version: ` {"severity":"EMERGENCY","timestamp":"2026-03-25T15:53:...
[08:38:35] Machine-Learning-Team, Data-Platform-SRE, Prod-Kubernetes, ServiceOps new, and 2 others: Update kserve to v0.15.2* on ML clusters - https://phabricator.wikimedia.org/T380722#11752618 (isarantopoulos) Open→Declined I'm closing this task as it is outdated. There is work being done in {T...
[08:39:17] Machine-Learning-Team, Data-Platform-SRE, Prod-Kubernetes, ServiceOps new, and 3 others: Update knative-serving+net-istio to v1.12.x on ML clusters - https://phabricator.wikimedia.org/T380723#11752625 (isarantopoulos) Open→Declined I'm closing this task as it is outdated. There is wor...
[10:37:50] Machine-Learning-Team, Patch-For-Review: Experiment with new kserve version on stagin - https://phabricator.wikimedia.org/T419722#11752936 (elukey) @DPogorzelski-WMF my understanding from reading upstream's commits is that they bump the k8s version every now and then and they set it to the version that i...
[13:09:11] Machine-Learning-Team, Data-Platform-SRE, Infrastructure-Foundations, Prod-Kubernetes, and 3 others: Set cert-manager leader election namespace to cert-manager - https://phabricator.wikimedia.org/T383553#11753580 (brouberol) This has now been done for dse-k8s-eqiad.
[13:33:36] Machine-Learning-Team, Patch-For-Review: Experiment with new kserve version on stagin - https://phabricator.wikimedia.org/T419722#11753735 (DPogorzelski-WMF) I will skip that for now as it's getting more complex than i initially anticipated. all services on staging work in the current setup and i'll ship...
[15:40:59] (CR) Ilias Sarantopoulos: [V:+2 C:+2] "LGTM, thanks Tricia!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1260048 (https://phabricator.wikimedia.org/T406369) (owner: Triciaburmeister)
[15:42:20] (Merged) jenkins-bot: ores-legacy: update doc links [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1260048 (https://phabricator.wikimedia.org/T406369) (owner: Triciaburmeister)
[19:01:28] Machine-Learning-Team, Commons, Moderator-Tools-Team: Create a revert risk model for Wikimedia Commons - https://phabricator.wikimedia.org/T421425 (GPSLeo) NEW