[07:27:56] hello!
[08:42:32] Morning!
[08:57:33] o/
[08:57:45] klausman: could you merge this patch plz https://gerrit.wikimedia.org/r/c/operations/puppet/+/1063213?
[09:06:55] on it
[09:11:31] Danke Schön!
[10:29:41] * klausman lunch
[11:04:00] (PS2) Ilias Sarantopoulos: langid: bump kserve to 0.13.1 [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1078605 (https://phabricator.wikimedia.org/T367048)
[11:05:12] (CR) Ilias Sarantopoulos: "I will have to do a load test to verify that everything is working fine with the upgrade. Running it locally worked fine for the moment." [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1078605 (https://phabricator.wikimedia.org/T367048) (owner: Ilias Sarantopoulos)
[12:08:19] (CR) Kevin Bazira: "Thank you for your work on this, Ilias. I faced a dependency conflict when running this patch locally, and resolved it by updating `fastte" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1078605 (https://phabricator.wikimedia.org/T367048) (owner: Ilias Sarantopoulos)
[12:34:44] FIRING: [2x] ErrorBudgetBurn: - https://alerts.wikimedia.org/?q=alertname%3DErrorBudgetBurn
[12:37:55] added another silence for that one ^^^
[12:45:48] Machine-Learning-Team, DC-Ops, ops-codfw, SRE: hw troubleshooting: Likely memory issue on ml-serv2001.codfw.wmnet - https://phabricator.wikimedia.org/T376706#10217459 (klausman) Open→Resolved a:klausman Thanks! Machine is back in service.
[12:50:08] (CR) Ilias Sarantopoulos: "No I didn't face such an issue, I ran this through docker as shown in https://phabricator.wikimedia.org/P69600#279099" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1078605 (https://phabricator.wikimedia.org/T367048) (owner: Ilias Sarantopoulos)
[12:53:24] klausman: I'm thinking of changing the target on this SLO, but I need to take a look at the actual latencies to get a number for it
[12:53:58] ack.
[13:54:15] Good
[13:54:18] Morning all
[13:59:34] morning o/
[14:42:14] Heyo Chris
[14:58:04] hi Chris o/
[15:04:22] (CR) Ilias Sarantopoulos: "There is a 0.9.3 release in pypi made in June 24 https://pypi.org/project/fasttext/0.9.3/#history" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1078605 (https://phabricator.wikimedia.org/T367048) (owner: Ilias Sarantopoulos)
[18:11:17] (PS6) AikoChou: locust: entry for reference-risk model [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1077310 (https://phabricator.wikimedia.org/T372405)
[18:17:22] (CR) Ilias Sarantopoulos: [C:+1] locust: entry for reference-risk model (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1077310 (https://phabricator.wikimedia.org/T372405) (owner: AikoChou)
[18:17:44] nighttyyy o/
[18:25:44] have a nice evening o/
[18:27:51] (CR) AikoChou: [C:+2] "Thanks for the review!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1077310 (https://phabricator.wikimedia.org/T372405) (owner: AikoChou)
[18:28:31] (CR) AikoChou: [V:+2 C:+2] locust: entry for reference-risk model [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1077310 (https://phabricator.wikimedia.org/T372405) (owner: AikoChou)
[22:58:42] Machine-Learning-Team: ml-lab can't install rocm torch - https://phabricator.wikimedia.org/T376967 (calbon) NEW