[06:30:46] Good morning
[06:45:38] good morning! ☀️
[06:46:45] good morning o/
[07:33:55] good morning
[07:47:38] Machine-Learning-Team: Simplify pre-commit hooks within inference-services repository. - https://phabricator.wikimedia.org/T393865 (BWojtowicz-WMF) NEW
[08:02:01] (CR) Nik Gkountas: [C:+2] Support for articlecountry [research/recommendation-api] - https://gerrit.wikimedia.org/r/1137342 (https://phabricator.wikimedia.org/T391230) (owner: Sbisson)
[08:03:38] (Merged) jenkins-bot: Support for articlecountry [research/recommendation-api] - https://gerrit.wikimedia.org/r/1137342 (https://phabricator.wikimedia.org/T391230) (owner: Sbisson)
[09:49:08] Machine-Learning-Team: [Fix]: Documentation for ORES and Mediawiki Docker - https://phabricator.wikimedia.org/T393876 (gkyziridis) NEW
[10:17:21] Machine-Learning-Team: [Fix]: Documentation for ORES and Mediawiki Docker - https://phabricator.wikimedia.org/T393876#10810806 (gkyziridis)
[11:40:16] Good afternoon! Team, any updates on https://phabricator.wikimedia.org/T391958 ?
[11:47:07] isaranto: If I add the edit-check configuration under `experimental/values.yaml` and push it, can we then deploy edit-check under experimental but on prod?
[11:49:24] yes!
[11:51:12] thanks
[12:11:03] patch is ready for review: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1144521
[12:11:13] whenever you have time, cast an eye over it, folks
[12:18:28] ack!
[12:19:10] georgekyz: I opened a WIP patch that demonstrates how to enable rrla model in the ores extension for idwiki https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1144526
[12:20:08] isaranto: that's very helpful, thank you
[12:21:27] perhaps we need to break this patch into multiple ones -- I'll look into the steps we had discussed and I'll ask Jason as well
[12:26:32] thnx for your time
[12:27:28] Machine-Learning-Team: [Fix]: Documentation for ORES and MediaWiki Docker - https://phabricator.wikimedia.org/T393876#10811226 (Reedy)
[12:27:35] Can I build blubber on ml-labs ?
[12:29:05] Machine-Learning-Team: [Fix]: Documentation for ORES and MediaWiki Docker - https://phabricator.wikimedia.org/T393876#10811229 (Reedy) https://gerrit.wikimedia.org/g/mediawiki/core/+/HEAD/DEVELOPERS.md#running-commands Have you run `update.php` first? `docker compose exec mediawiki php maintenance/run.php...
[12:29:13] Machine-Learning-Team, Documentation: [Fix]: Documentation for ORES and MediaWiki Docker - https://phabricator.wikimedia.org/T393876#10811230 (Reedy)
[12:33:51] Machine-Learning-Team, Documentation: [Fix]: Documentation for ORES and MediaWiki Docker - https://phabricator.wikimedia.org/T393876#10811285 (gkyziridis) >>! In T393876#10811226, @Reedy wrote: > https://gerrit.wikimedia.org/g/mediawiki/core/+/HEAD/DEVELOPERS.md#running-commands > > Have you run `update...
[12:40:46] Machine-Learning-Team, Documentation: [Fix]: Documentation for ORES and MediaWiki Docker - https://phabricator.wikimedia.org/T393876#10811320 (Reedy) I've updated the docs for the database update - https://www.mediawiki.org/w/index.php?title=MediaWiki-Docker%2FExtension%2FORES&diff=7627260&oldid=7616700...
[12:41:02] I guess you should; wherever there is docker it should work
[12:53:02] Machine-Learning-Team, Add-Link, Growth-Team: Make airflow-dag for addalink training pipeline output compatible with deployed model - https://phabricator.wikimedia.org/T388258#10811376 (DMburugu) @fkaelin Is it possible to know if this work has a hypothesis planned for it any time soon? I remember th...
[13:15:57] It expects a dockerfile
[13:24:15] oh sorry my bad
[13:57:43] (PS3) Sbisson: Popular/search recommander: use domain code in lllang parameter [research/recommendation-api] - https://gerrit.wikimedia.org/r/1143605 (https://phabricator.wikimedia.org/T306508)
[13:58:24] (CR) CI reject: [V:-1] Popular/search recommander: use domain code in lllang parameter [research/recommendation-api] - https://gerrit.wikimedia.org/r/1143605 (https://phabricator.wikimedia.org/T306508) (owner: Sbisson)
[14:05:34] (CR) Sbisson: "recheck" [research/recommendation-api] - https://gerrit.wikimedia.org/r/1143605 (https://phabricator.wikimedia.org/T306508) (owner: Sbisson)
[14:43:57] klausman: I am getting a `failed to solve with frontend dockerfile.v0: failed to solve with frontend gateway.v0: exit code: 2` when I am trying to build a blubber image on ml-lab. Is there something that I haven't set up correctly?
[14:44:33] What command are you running? Also, 1001 or 1002?
[14:47:10] `ml-lab1001`: `gkyziridis@ml-lab1001:~/inference-services$ docker build -f .pipeline/edit_check/blubber.yaml --target production --platform=linux/amd64 --progress=plain .`
[14:47:57] Taking a look....
[14:48:06] thnx
[14:54:03] Machine-Learning-Team, EditCheck, VisualEditor, Editing-team (Tracking): Compile list of templates, jargon and policies relevant to NPOV - https://phabricator.wikimedia.org/T389445#10812008 (achou)
[15:12:53] georgekyz: So I think the failure is due to buildx not being available on ml-lab (and Bookworm in general). What are you trying to do?
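Editor's note on the buildx diagnosis above: a minimal sketch, in Python, of a pre-flight check that could be run before attempting the `docker build -f .pipeline/edit_check/blubber.yaml ...` command. It only asks the Docker CLI whether `docker buildx version` succeeds. The helper name `buildx_available` and the printed messages are hypothetical and are not part of inference-services or any ml-lab tooling, and a passing check does not guarantee that the Blubber build itself will succeed.

```python
import shutil
import subprocess


def buildx_available(docker_bin: str = "docker") -> bool:
    """Return True if `<docker_bin> buildx version` exits with status 0.

    Hypothetical helper for illustration only.
    """
    # No docker CLI on PATH means buildx cannot be available either.
    if shutil.which(docker_bin) is None:
        return False
    try:
        result = subprocess.run(
            [docker_bin, "buildx", "version"],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,  # inspect the return code ourselves
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if buildx_available():
        print("buildx found: the blubber.yaml build can be attempted")
    else:
        print("buildx not found: expect the build to fail on this host")
```

If the check reports that buildx is missing, the build is expected to fail for the reason given in the chat, so the image would have to be built on a host where buildx is installed.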
[15:13:46] klausman: I am trying to build a different image for the edit-check model using older pytorch-rocm images
[15:14:51] I am doing it on ml-lab1001 because on that machine I got the correct results using the GPU
[15:18:34] Yeah, I don't have an immediate solution for that.
[15:47:40] isaranto: Two load tests with different configs for the current edit-check model on staging: https://phabricator.wikimedia.org/P75923
[15:48:06] isaranto: Let me know if we need something more specific on load tests.
[15:50:42] georgekyz: wow that is great, thanks!
[15:51:06] isaranto: I also shared it with Sucheta
[15:51:08] actually that is really low latency
[15:51:46] it seems like we won't be needing the GPU for this after all
[15:51:47] yeap
[15:51:55] let's see
[15:51:59] 🤞
[15:52:08] I ran these locust tests from statbox10
[15:52:37] can you add a comment on the GPU task with these load tests? I'll make sure to flag it to Peter that we are set
[15:53:58] thanks for jumping on that so quickly!
[15:58:15] yeap I will
[16:02:12] Machine-Learning-Team, Editing-team (Tracking), Patch-For-Review: Peacock detection model GPU deployment returns inconsistent results - https://phabricator.wikimedia.org/T393154#10812566 (gkyziridis) == Edit-check Loading Tests on Staging (CPU) == This is the current version of the edit-check service...
[16:04:56] thank you!
[16:22:31] Machine-Learning-Team, Documentation: [Fix]: Documentation for ORES and MediaWiki Docker - https://phabricator.wikimedia.org/T393876#10812701 (isarantopoulos) The [[ https://www.mediawiki.org/wiki/MediaWiki-Docker/Extension/ORES | ORES extension docker documentation ]] mentions > We run the following co...
[16:25:00] going afk folks! have a nice evening
[16:27:19] Machine-Learning-Team, EditCheck, VisualEditor, Editing-team (Tracking): Compile list of templates, jargon and policies relevant to NPOV - https://phabricator.wikimedia.org/T389445#10812728 (Trizek-WMF) >>! In T389445#10812005, @achou wrote: > Hi, I noticed a discrepancy in terminology in the tas...
[16:29:19] Machine-Learning-Team, EditCheck, VisualEditor, Editing-team (Tracking): Compile list of templates, jargon and policies relevant to NPOV - https://phabricator.wikimedia.org/T389445#10812729 (Trizek-WMF)
[16:30:27] Machine-Learning-Team, EditCheck, VisualEditor, Editing-team (Tracking): Compile list of templates, jargon and policies relevant to NPOV - https://phabricator.wikimedia.org/T389445#10812734 (Trizek-WMF) I see that some languages (like French) have banners that are used for both article-level and...
[16:30:50] Machine-Learning-Team, Documentation: [Fix]: Documentation for ORES and MediaWiki Docker - https://phabricator.wikimedia.org/T393876#10812740 (Reedy) >>! In T393876#10812701, @isarantopoulos wrote: > The [[ https://www.mediawiki.org/wiki/MediaWiki-Docker/Extension/ORES | ORES extension docker documentati...
[16:33:25] Machine-Learning-Team, MediaWiki-extensions-ORES, Documentation: Consistently document $wg - https://phabricator.wikimedia.org/T393929 (Reedy) NEW
[16:34:03] Machine-Learning-Team, MediaWiki-extensions-ORES, Documentation: Consistently document $wg - https://phabricator.wikimedia.org/T393929#10812772 (Reedy)
[17:57:37] Machine-Learning-Team, DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install ml-serve101[23] - https://phabricator.wikimedia.org/T393948 (RobH) NEW
[17:59:03] Machine-Learning-Team, DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install ml-serve101[23] - https://phabricator.wikimedia.org/T393948#10813337 (RobH) a:klausman Please update the site.pp file with the insetup role for your team (detailed on https://wikitech.wikimedia.org/wiki/SRE/Dc-operations) an...
[17:59:24] Machine-Learning-Team, DC-Ops, ops-eqiad, SRE: Q4:rack/setup/install ml-serve101[23] - https://phabricator.wikimedia.org/T393948#10813341 (RobH)
[19:07:49] FIRING: KubernetesDeploymentUnavailableReplicas: ...
[19:07:49] Deployment reference-need-predictor-00010-deployment in revision-models at eqiad has persistently unavailable replicas - https://wikitech.wikimedia.org/wiki/Kubernetes/Troubleshooting#Troubleshooting_a_deployment - https://grafana.wikimedia.org/d/a260da06-259a-4ee4-9540-5cab01a246c8/kubernetes-deployment-details?var-site=eqiad&var-cluster=k8s-mlserve&var-namespace=revision-models&var-deployment=reference-need-predictor-00010-deployment - ...
[19:07:49] https://alerts.wikimedia.org/?q=alertname%3DKubernetesDeploymentUnavailableReplicas
[19:20:21] Machine-Learning-Team, LDAP-Access-Requests, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users & Kerberos identity & deployment POSIX group & ml-team-admins for Bartosz Wójtowicz - https://phabricator.wikimedia.org/T393595#10813646 (BCornwall)
[19:22:49] Machine-Learning-Team, LDAP-Access-Requests, SRE, SRE-Access-Requests: Requesting access to analytics-privatedata-users & Kerberos identity & deployment POSIX group & ml-team-admins for Bartosz Wójtowicz - https://phabricator.wikimedia.org/T393595#10813654 (BCornwall) a:BCornwall
[21:13:07] Machine-Learning-Team, Add-Link, Growth-Team: Make airflow-dag for addalink training pipeline output compatible with deployed model - https://phabricator.wikimedia.org/T388258#10813995 (SSalgaonkar-WMF) Hi @DMburugu! Responsibility for this request has been transferred from the Research team to the M...
[23:07:49] FIRING: KubernetesDeploymentUnavailableReplicas: ...
[23:07:49] Deployment reference-need-predictor-00010-deployment in revision-models at eqiad has persistently unavailable replicas - https://wikitech.wikimedia.org/wiki/Kubernetes/Troubleshooting#Troubleshooting_a_deployment - https://grafana.wikimedia.org/d/a260da06-259a-4ee4-9540-5cab01a246c8/kubernetes-deployment-details?var-site=eqiad&var-cluster=k8s-mlserve&var-namespace=revision-models&var-deployment=reference-need-predictor-00010-deployment - ...
[23:07:49] https://alerts.wikimedia.org/?q=alertname%3DKubernetesDeploymentUnavailableReplicas
[23:12:49] RESOLVED: KubernetesDeploymentUnavailableReplicas: ...
[23:12:49] Deployment reference-need-predictor-00010-deployment in revision-models at eqiad has persistently unavailable replicas - https://wikitech.wikimedia.org/wiki/Kubernetes/Troubleshooting#Troubleshooting_a_deployment - https://grafana.wikimedia.org/d/a260da06-259a-4ee4-9540-5cab01a246c8/kubernetes-deployment-details?var-site=eqiad&var-cluster=k8s-mlserve&var-namespace=revision-models&var-deployment=reference-need-predictor-00010-deployment - ...
[23:12:49] https://alerts.wikimedia.org/?q=alertname%3DKubernetesDeploymentUnavailableReplicas