[06:59:25] good morning folks!
[07:51:39] morning! :)
[08:01:28] o/
[08:12:37] good morning!
[08:44:28] for some reason I couldn't install kserve==0.13.1 in python 3.10
[08:45:20] I'm trying the ref-need with 3.11 now
[08:45:32] I'm just pasting this in case anybody bumped into it as well https://phabricator.wikimedia.org/P68730
[09:00:24] nevermind about the above, I reinstalled everything and it works in python 3.10. There is the issue with torch 1.13 and macos that we were discussing so I'll follow up with research about that
[09:17:28] (CR) Ilias Sarantopoulos: [C:+1] "Nice work. Works like a charm!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1070941 (https://phabricator.wikimedia.org/T371902) (owner: Kevin Bazira)
[09:42:56] Machine-Learning-Team, DC-Ops, ops-codfw: hw troubleshooting: spinning disk failure for ml-serve2005.codfw.wmnet - https://phabricator.wikimedia.org/T374207 (klausman) NEW
[09:46:58] Machine-Learning-Team, DC-Ops, ops-codfw: hw troubleshooting: spinning disk failure for ml-serve2005.codfw.wmnet - https://phabricator.wikimedia.org/T374207#10124631 (klausman) I already tried a reboot and a complete powercycle to revive the disk, to no avail.
[10:10:58] * isaranto afk lunch!
[11:24:02] isaranto: o/ I discussed this a bit with Muniza yesterday. We thought it would be a good topic for ml:research meeting to discuss how knowledge integrity should evolve. As it contains more and more models and becomes like a model repository, maintenance of dependencies will become more complex
[11:25:00] sgtm! I'll add it as a topic in the agenda, I also added the article-country model
[11:27:32] ack!
[11:27:44] elukey: good afternoon :)
[11:36:51] (CR) AikoChou: [C:+1] "Thanks for working on this! I ran it and it works without any issue :)" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1070941 (https://phabricator.wikimedia.org/T371902) (owner: Kevin Bazira)
[12:25:39] (PS5) Ilias Sarantopoulos: articlequality: update output schema [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1070228 (https://phabricator.wikimedia.org/T360455)
[12:27:08] (PS6) Ilias Sarantopoulos: articlequality: update output schema [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1070228 (https://phabricator.wikimedia.org/T360455)
[12:27:59] Machine-Learning-Team, Data-Platform-SRE, serviceops, Security: Migrate the ownership of DPE-Owned Docker images in production-images repo to mailing lists - https://phabricator.wikimedia.org/T373534#10125430 (Gehel) p:Triage→Medium
[12:28:59] I updated the schema for the articlequality lang agnostic model. this is related to what I wrote in https://phabricator.wikimedia.org/T360455#10120161
[12:29:21] let me know if you think we should use another parameter name or do it in another way
[12:34:12] Machine-Learning-Team, Infrastructure-Foundations, serviceops: Migrate the ownership of ML-Owned Docker images in production-images repo to mailing lists - https://phabricator.wikimedia.org/T374233 (isarantopoulos) NEW
[12:34:36] I created this --^ to track the work related to image ownership
[12:38:13] Machine-Learning-Team, Infrastructure-Foundations, serviceops: Migrate the ownership of ML-Owned Docker images in production-images repo to mailing lists - https://phabricator.wikimedia.org/T374233#10125543 (isarantopoulos)
[12:40:20] (CR) Isaac Johnson: [C:+1] "just noting that I'm onboard with this change and standardizing of outputs. thanks ilias!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1070228 (https://phabricator.wikimedia.org/T360455) (owner: Ilias Sarantopoulos)
[12:41:52] I see Tobias has already changed the ownership for kserve/* images, nice!
[12:48:11] (CR) AikoChou: [C:+1] "LGTM!" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1070228 (https://phabricator.wikimedia.org/T360455) (owner: Ilias Sarantopoulos)
[13:27:57] (CR) Kevin Bazira: [C:+2] "Thanks for the reviews :)" [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1070941 (https://phabricator.wikimedia.org/T371902) (owner: Kevin Bazira)
[14:09:21] (CR) Ilias Sarantopoulos: [V:+2 C:+1] Makefile: add support for reference-need [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1070941 (https://phabricator.wikimedia.org/T371902) (owner: Kevin Bazira)
[14:10:15] (PS7) Ilias Sarantopoulos: articlequality: update output schema [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1070228 (https://phabricator.wikimedia.org/T360455)
[14:10:34] (CR) Ilias Sarantopoulos: [C:+2] articlequality: update output schema [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1070228 (https://phabricator.wikimedia.org/T360455) (owner: Ilias Sarantopoulos)
[14:14:54] (Merged) jenkins-bot: articlequality: update output schema [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1070228 (https://phabricator.wikimedia.org/T360455) (owner: Ilias Sarantopoulos)
[14:48:30] Machine-Learning-Team, Infrastructure-Foundations, serviceops: Migrate the ownership of ML-Owned Docker images in production-images repo to mailing lists - https://phabricator.wikimedia.org/T374233#10126062 (isarantopoulos)
[15:25:00] Machine-Learning-Team, DC-Ops, ops-codfw, SRE: hw troubleshooting: spinning disk failure for ml-serve2005.codfw.wmnet - https://phabricator.wikimedia.org/T374207#10126245 (Jhancock.wm) @klausman this one isn't under warranty and I don't have an exact match for the drive. will a 1.92Tb drive work...
[15:35:15] Machine-Learning-Team, DC-Ops, ops-codfw, SRE: hw troubleshooting: spinning disk failure for ml-serve2005.codfw.wmnet - https://phabricator.wikimedia.org/T374207#10126317 (klausman) >>! In T374207#10126245, @Jhancock.wm wrote: > @klausman this one isn't under warranty and I don't have an exact m...
[15:57:46] Lift-Wing, Machine-Learning-Team: Request to host the Reference Risk Model on LiftWing - https://phabricator.wikimedia.org/T372405#10126427 (isarantopoulos)
[16:28:57] going afk, have a nice weekend folks!
[17:56:49] Machine-Learning-Team, DC-Ops, ops-codfw, SRE: hw troubleshooting: spinning disk failure for ml-serve2005.codfw.wmnet - https://phabricator.wikimedia.org/T374207#10126769 (Jhancock.wm) Open→Resolved cool. I'll do that. thanks!