[08:36:56] (PS1) Kevin Bazira: test: update outlink predictor test image to support latest ci tests [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/1091595 (https://phabricator.wikimedia.org/T360120)
[09:52:44] FIRING: LiftWingServiceErrorRate: ...
[09:52:44] LiftWing service has a high rate of non 2/3/400 error code responses - https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Alerts#LiftWingServiceErrorRate - https://grafana.wikimedia.org/d/G7yj84Vnk/istio?orgId=1&refresh=30s&var-cluster=eqiad%20prometheus/k8s-mlserve&var-namespace=revscoring-editquality-damaging&var-backend=ptwiki-damaging-predictor-default.%2A - https://alerts.wikimedia.org/?q=alertname%3DLiftWingServiceErrorRate
[10:07:44] RESOLVED: LiftWingServiceErrorRate: ...
[10:07:44] LiftWing service has a high rate of non 2/3/400 error code responses - https://wikitech.wikimedia.org/wiki/Machine_Learning/LiftWing/Alerts#LiftWingServiceErrorRate - https://grafana.wikimedia.org/d/G7yj84Vnk/istio?orgId=1&refresh=30s&var-cluster=eqiad%20prometheus/k8s-mlserve&var-namespace=revscoring-editquality-damaging&var-backend=ptwiki-damaging-predictor-default.%2A - https://alerts.wikimedia.org/?q=alertname%3DLiftWingServiceErrorRate
[10:48:28] Machine-Learning-Team, sre-alert-triage: Alert in need of triage: HelmfileAdminNGPendingChanges (instance deploy1003:9100) - https://phabricator.wikimedia.org/T380024#10326003 (LSobanski)