[07:21:50] <ozge_>	 Good morning
[08:16:30] <isaranto>	 Gunaydin!
[08:46:21] <georgekyz>	 morning
[09:00:13] <wikibugs>	 Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154 (isarantopoulos) NEW
[11:56:34] <wikibugs>	 Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785789 (gkyziridis)
[11:57:38] <wikibugs>	 Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785803 (gkyziridis)
[11:59:31] <isaranto>	 thank you George for updating the task description <3
[12:00:58] <wikibugs>	 Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785811 (gkyziridis)
[12:01:40] <wikibugs>	 Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785815 (gkyziridis)
[12:04:37] <isaranto>	 georgekyz: the local cpu version is using a different image based on bookworm, but the ones deployed on liftwing staging (both cpu and gpu pods) use the same image https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-edit-check/tags/
[12:04:39] <wikibugs>	 Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785829 (gkyziridis)
[12:04:49] <isaranto>	 so their environments would be the same, correct?
[12:08:22] <georgekyz>	 yes I think this is correct. I stated that they are using different images in their blubber, but I do not think that environment is the issue. I think that somehow the calculation of the gradients is different on the gpu side.
[12:08:56] <georgekyz>	 Although, the model itself acts strange on non-peacock examples (even the cpu one).
[12:10:21] <georgekyz>	 this can be observed in the first Localhost example in this paste: https://phabricator.wikimedia.org/P75736
[12:17:15] <georgekyz>	 hmmmm.... The model probability is the confidence score for the outcome class of the corresponding text. 
[12:23:06] <isaranto>	 Ack, thanks for clarifying
[12:50:25] <wikibugs>	 Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785942 (gkyziridis)
[13:06:46] <georgekyz>	 isaranto: Are we using the same pytorch version in localhost and staging ?
[13:08:56] <isaranto>	 georgekyz: yes.
[13:09:03] <georgekyz>	 ok
[13:09:14] <georgekyz>	 thnx
[13:09:19] <isaranto>	 staging image -> https://gerrit.wikimedia.org/r/plugins/gitiles/machinelearning/liftwing/inference-services/+/refs/heads/main/.pipeline/edit_check/blubber.yaml#3
[13:09:20] <isaranto>	 cpu requirements -> https://gerrit.wikimedia.org/r/plugins/gitiles/machinelearning/liftwing/inference-services/+/refs/heads/main/src/models/edit_check/model_server/requirements_cpu.txt#2
[13:16:30] <georgekyz>	 Is there any possibility that this is a floating point precision issue on the gpu itself?
[14:30:13] <isaranto>	 ¯\_(ツ)_/¯
[14:30:58] <isaranto>	 seems wild if the difference is that big but this is just an assumption ofc
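Editor's note: the floating point hypothesis above can be sanity-checked with a small sketch. It shows that changing the order of additions (as a parallel reduction might) shifts a result only by a tiny amount, and that a tolerance check separates that kind of noise from a genuinely different prediction. The probability values and the helper name `outputs_match` are hypothetical, chosen for illustration; they are not taken from the task or the paste.

```python
import math


def outputs_match(a, b, rel_tol=1e-3, abs_tol=1e-4):
    """Return True if two probability vectors agree within the given tolerances."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol) for x, y in zip(a, b)
    )


# Floating point addition is not associative: regrouping the same terms
# changes the result, but only in the last bits.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
reorder_gap = abs(left - right)

# Hypothetical outputs for the same input text (illustrative values only).
cpu_probs = [0.9871, 0.0129]
gpu_close = [0.98709, 0.01291]  # differs by rounding-sized noise
gpu_far = [0.5512, 0.4488]      # differs far beyond rounding noise

print(reorder_gap)
print(outputs_match(cpu_probs, gpu_close))
print(outputs_match(cpu_probs, gpu_far))
```

If two deployments disagree by much more than the chosen tolerance, precision alone is an unlikely explanation, which fits the doubt expressed above.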
[15:11:21] <jinxer-wm>	 FIRING: SLOMetricAbsent: linkrecommendation-requests <no value> - https://slo.wikimedia.org/?search=linkrecommendation-requests   - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[15:26:04] <wikibugs>	 Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10786444 (gkyziridis) I tested the same model on `ml-lab1001` using gpu and I am getting exactly the same results as in the cpu version. Please check the results in the print...
[15:26:21] <jinxer-wm>	 RESOLVED: SLOMetricAbsent: linkrecommendation-requests <no value> - https://slo.wikimedia.org/?search=linkrecommendation-requests   - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[15:46:52] <wikibugs>	 Machine-Learning-Team, Editing-team (Tracking): Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10786540 (ppelberg)
[15:47:07] <wikibugs>	 Machine-Learning-Team, Editing-team (Tracking): Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10786541 (ppelberg)
[19:18:57] <wikibugs>	 Machine-Learning-Team, EditCheck, VisualEditor, Chinese-Sites, Editing-team (Tracking): Prepare annotool for Peacock Check model evaluation - https://phabricator.wikimedia.org/T392324#10787152 (SSalgaonkar-WMF)