[07:21:50] Good morning
[08:16:30] Gunaydin!
[08:46:21] morning
[09:00:13] Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154 (isarantopoulos) NEW
[11:56:34] Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785789 (gkyziridis)
[11:57:38] Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785803 (gkyziridis)
[11:59:31] thank you George for updating the task description <3
[12:00:58] Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785811 (gkyziridis)
[12:01:40] Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785815 (gkyziridis)
[12:04:37] georgekyz: the local cpu version is using a different image based on bookworm but the ones deployed on liftwing staging (both cpu and gpu pods) use the same image https://docker-registry.wikimedia.org/wikimedia/machinelearning-liftwing-inference-services-edit-check/tags/
[12:04:39] Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785829 (gkyziridis)
[12:04:49] so their environments would be the same, correct?
[12:08:22] yes I think this is correct. I stated that they are using different images in their blubber, but I do not think that environment is the issue. I think that somehow the calculation of the gradients is different on the gpu side.
[12:08:56] Although, the model itself acts strange on non-peacock examples (even the cpu one).
[12:10:21] this can be observed in the first Localhost example in this paste: https://phabricator.wikimedia.org/P75736
[12:17:15] hmmmm....
The model probability is the confidence score for the outcome class of the corresponding text.
[12:23:06] Ack, thanks for clarifying
[12:50:25] Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10785942 (gkyziridis)
[13:06:46] isaranto: Are we using the same pytorch version in localhost and staging?
[13:08:56] georgekyz: yes.
[13:09:03] ok
[13:09:14] thnx
[13:09:19] staging image -> https://gerrit.wikimedia.org/r/plugins/gitiles/machinelearning/liftwing/inference-services/+/refs/heads/main/.pipeline/edit_check/blubber.yaml#3
[13:09:20] cpu requirements -> https://gerrit.wikimedia.org/r/plugins/gitiles/machinelearning/liftwing/inference-services/+/refs/heads/main/src/models/edit_check/model_server/requirements_cpu.txt#2
[13:16:30] Is there any possibility that this is a floating point precision issue on the gpu itself?
[14:30:13] ¯\_(ツ)_/¯
[14:30:58] seems wild if the difference is that big but this is just an assumption ofc
[15:11:21] FIRING: SLOMetricAbsent: linkrecommendation-requests - https://slo.wikimedia.org/?search=linkrecommendation-requests - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[15:26:04] Machine-Learning-Team: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10786444 (gkyziridis) I tested the same model on `ml-lab1001` using gpu and I am getting exactly the same results as in the cpu version. Please check the results in the print...
[15:26:21] RESOLVED: SLOMetricAbsent: linkrecommendation-requests - https://slo.wikimedia.org/?search=linkrecommendation-requests - https://alerts.wikimedia.org/?q=alertname%3DSLOMetricAbsent
[15:46:52] Machine-Learning-Team, Editing-team (Tracking): Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10786540 (ppelberg)
[15:47:07] Machine-Learning-Team, Editing-team (Tracking): Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10786541 (ppelberg)
[19:18:57] Machine-Learning-Team, EditCheck, VisualEditor, Chinese-Sites, Editing-team (Tracking): Prepare annotool for Peacock Check model evaluation - https://phabricator.wikimedia.org/T392324#10787152 (SSalgaonkar-WMF)