[00:05:21] Machine-Learning-Team: Improving error message for Revertrisk models - https://phabricator.wikimedia.org/T351278 (achou)
[00:09:46] Lift-Wing, Machine-Learning-Team: Revertrisk models are unable to provide scores for single-revision pages - https://phabricator.wikimedia.org/T351021 (achou) Open→Resolved
[00:25:36] Lift-Wing, Machine-Learning-Team: Revertrisk models are unable to provide scores for single-revision pages - https://phabricator.wikimedia.org/T351021 (achou) Closed this task and opened a task to improve the error message. I also updated the model cards to be more explicit about this. ([[ https://meta.w...
[01:05:22] (PS5) Kevin Bazira: article-descriptions: add article-descriptions model server [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/970831 (https://phabricator.wikimedia.org/T343123)
[01:22:13] (CR) Kevin Bazira: article-descriptions: add article-descriptions model server (1 comment) [machinelearning/liftwing/inference-services] - https://gerrit.wikimedia.org/r/970831 (https://phabricator.wikimedia.org/T343123) (owner: Kevin Bazira)
[07:49:37] morning!
[07:55:43] Machine-Learning-Team: Increased latencies with Kserve 0.11.1 (cgroups v2) - https://phabricator.wikimedia.org/T349844 (elukey) Next steps: * Work with Research on T350389 to move KI to xgboost 2.0.1 * Keep working with Yandex upstream on https://github.com/catboost/catboost/issues/2518, and hopefully get a...
[08:52:10] Good morning good people! o/
[08:53:51] kalimera!
[10:06:01] catching up on stuff! is there anything you folks want me to tackle right now?
[10:06:43] tackle = try to do 😛
[10:07:15] isaranto: yes take it easy :D
[10:10:47] <3
[10:12:18] Welcome back, Ilias. Hope the road trip was fun, and jetlag isn't too bad
[10:45:02] everything was nice :) jetlag is ok at the moment, I'm fully functional at the moment \o/
[10:45:31] ah, the shoe drops 2-3 days later :D
[10:47:52] Machine-Learning-Team, Growth-Team, GrowthExperiments: importOresTopics script fails to import topics - https://phabricator.wikimedia.org/T350137 (klausman) For technical/simplicity matters, the model_info query was simplified and doesn't support all the parameters anymore. I am not all that familiar...
[10:53:57] elukey: regarding https://phabricator.wikimedia.org/T350762, I actually think the current state of the docs is better than having the script in there: less maintenance, and we can always help someone who can't figure it out themselves. Wdyt?
[10:56:10] sure
[10:56:22] if it is already provided and easy to use it seems ok
[10:56:35] but please comment in the talk page as well whatever you decide
[10:57:52] ack, will do
[11:02:43] Machine-Learning-Team: Fix the Lift Wing documentation about how to decode the ACCESS TOKEN - https://phabricator.wikimedia.org/T350762 (klausman) Open→Resolved Added to the talk page: "I think the current state of the page (only mention that the token can be parsed with a JWT library, but no Python...
[11:09:59] Machine-Learning-Team, Product-Analytics, User-Iflorez: Transient error while running lift wing topic model - https://phabricator.wikimedia.org/T351114 (klausman) >>! In T351114#9331004, @mpopov wrote: > @klausman: would introducing a slight delay (e.g. 0.5s) between each of the 34K requests help? T...
[11:21:13] Machine-Learning-Team, Product-Analytics, User-Iflorez: Transient error while running lift wing topic model - https://phabricator.wikimedia.org/T351114 (klausman) Ah, correction: Mokhail mentions that the actual querying of LW is only done one request at a time (the Spark/Large bits are for getting t...
[11:23:13] klausman: I checked the task and this bit is very suspicious
[11:23:14] numer_of_workers_querying_liftwing = 40
[11:23:29] yes, I am entirely unsure how that code works
[11:24:43] also the numbers provided are multiple ones and they are a bit confusing in my opinion
[11:24:49] it is mentioned
[11:24:52] 1) 75K
[11:24:57] 2) ~33k
[11:25:09] 3) 1.5M (description)
[11:25:21] 4) 7M (description)
[11:26:15] I think they are running a smaller query set first, but the final one will be ~7M
[11:26:37] so if we keep 240 qps as reliable value, 1.5M / 240 = 6250 seconds, that split by 3600 gives 1.7 hours, more or less what Irene reported
[11:26:46] As for the logs, logstash is giving me zero log lines for a-o in the last 15h, so I am probably using it wrong
[11:27:07] have you checked the istio gateway dashboard in logstash?
[11:27:28] yeah, that's what I am using, but there is so much _other_ stuff, that I can't even find a-o queries
[11:28:11] Oh hang on, I was filtering on the wrong label
[11:28:28] klausman: have you filtered for the right cluster in the top-left block?
[11:29:03] https://logstash.wikimedia.org/app/dashboards#/view/138271f0-40ce-11ed-bb3e-0bc9ce387d88 Top left block?
[11:29:20] "Dashboard panel: All Kubernetes Istio Ingress Gateaway Logs by Cluster
[11:29:21] All Kubernetes Istio Ingress Gateaway Logs by Cluster"
[11:29:22] ah, got it now
[11:31:09] I presume `authority` is the way to filter for specific namespaces/pods?
[11:31:26] or maybe path
[11:32:59] or upstream_cluster
[11:34:59] but anyway, there is no way that they are doing 240 qps from a single node :D
[11:35:07] so that "40" is surely the parallelism
[11:35:42] they said that the last try worked in 1.6 hours with, I guess, the 1.5M records
[11:35:44] https://logstash.wikimedia.org/app/dashboards#/view/138271f0-40ce-11ed-bb3e-0bc9ce387d88?_a=h@bf6db5c Still zero queries in the last 15h
[11:36:35] you need to use the permalinks, otherwise the link is not working (top right -> share -> etc..)
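(Editor's note: the duration estimate given at 11:26:37 can be reproduced with a few lines of Python. This is only a rough sketch, assuming the 240 requests per second quoted in the chat is a steady, sustained rate; the function name `batch_duration_hours` is made up for illustration and does not come from the task.)

```python
# Back-of-the-envelope check of the batch-duration estimate from the chat.
# Assumption: 240 requests/second is sustained for the whole run (the
# "reliable value" quoted above); record counts are those listed in the task.

SECONDS_PER_HOUR = 3600


def batch_duration_hours(num_requests: int, qps: float) -> float:
    """Return the hours needed to send num_requests at a steady qps."""
    if qps <= 0:
        raise ValueError("qps must be positive")
    return num_requests / qps / SECONDS_PER_HOUR


if __name__ == "__main__":
    for label, count in [("1.5M", 1_500_000), ("7M", 7_000_000)]:
        print(f"{label}: {batch_duration_hours(count, 240):.1f} h")
```

(At that rate the 1.5M set comes out at about 1.7 hours, in line with the run time reported in the chat, and the ~7M set at roughly 8 hours.)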
[11:36:54] https://logstash.wikimedia.org/goto/15cdfa69170b90fda157b28c02f685f3
[11:38:19] going via kubectl on deploy2002, I do see plenty of queries in the logs with "path": "/v1/models/outlink-topic-model:predict"
[11:38:46] two problems:
[11:38:53] 1) the permalink shows 15 mins and not hours
[11:39:18] 2) you selected the codfw ml-serve cluster, that is not getting Irene's traffic (since it comes from eqiad, it gets mapped to eqiad)
[11:39:45] if you set the correct options you'll see a UA with Irene's email
[11:40:32] *sigh* the codfw thing was me being clever about "active DC"
[11:40:54] as for the 15: the webpage shows "15h" to me in the time selector, not sure why the permalink doesn't capture that
[11:42:19] Still puzzled why path doesn't work
[11:44:36] https://logstash.wikimedia.org/goto/883f28dd06da2f1cc2f1ff27d05f1a68 I don't know what I am still doing wrong
[11:47:24] going for lunch now. maybe my brain will work better when I am less grumpy at Logstash
[11:48:33] klausman: https://logstash.wikimedia.org/goto/9c0e793d452ec6734d77d3f35613671d
[11:48:37] probably the time range
[11:48:47] with 2 days I can clearly see the results
[11:49:58] https://phabricator.wikimedia.org/F41507591
[11:50:47] And even with two days time frame, nothing
[11:52:53] ah. the UA is not just her email address
[11:53:11] it also has a -- outlink example suffix
[11:56:09] I never said that :)
[11:57:18] * elukey lunch!
[12:03:29] Good morning all
[12:04:14] morning Chris!
[12:07:46] Isaranto! How was the road trip
[12:08:16] Greeeaaaat!
[12:11:17] Nice!
[12:15:24] and it is great to be back as well!
[12:16:05] trying to dodge covid at the moment!
[12:17:27] I'm trying to dodge the cold both my kids have
[12:20:17] sounds like a really difficult thing to do. hope you can make it 🤞
[12:26:06] Me too
[13:20:12] Everyone good?
[13:27:19] yep from my side!
[13:29:22] kevinbazira: I can also review the article-descriptions patch if you still want me to. I just don't want to cause any friction cause I'm late to the party :)
[14:16:45] sure isaranto, no problem :)
[14:27:37] Machine-Learning-Team, Add-Link, Growth-Team (Sprint 3 (Growth Team)), User-notice: Deploy "add a link" to 15th round of wikis - https://phabricator.wikimedia.org/T308141 (Sgs) All wikis present results now. **Notes** `rnwiki` and `sgwiki` present a very low number of suggestions, 4 and 1 respe...
[14:27:44] Machine-Learning-Team, Add-Link, Growth-Team (Sprint 3 (Growth Team)), User-notice: Deploy "add a link" to 15th round of wikis - https://phabricator.wikimedia.org/T308141 (Sgs)
[14:35:37] aaand to celebrate isaranto's return
[14:35:38] https://github.com/kserve/kserve/releases/tag/v0.11.2
[14:35:39] :P
[14:36:04] dangit :D
[14:36:43] morning folks!
[14:36:44] it is not extremely urgent but it contains a fix for the HTTP/2 CVE that hit cloudflare and others a while ago
[14:36:47] hello aiko :)
[14:52:23] Machine-Learning-Team, ORES: Add deprecation warnings to ORES-related repositories on Github - https://phabricator.wikimedia.org/T349632 (klausman) p:Triage→Medium
[15:34:13] Machine-Learning-Team, Add-Link, Growth-Team (Sprint 3 (Growth Team)), User-notice: Deploy "add a link" to 15th round of wikis - https://phabricator.wikimedia.org/T308141 (Trizek-WMF) >>! In T308141#9334257, @Sgs wrote: >I think we can enable their frontends anyways wishing for the model to gene...
[17:00:02] Machine-Learning-Team, Product-Analytics, User-Iflorez: Transient error while running lift wing topic model - https://phabricator.wikimedia.org/T351114 (Iflorez) @klausman yes, that's the agent string used
[17:04:13] o/
[17:04:23] have a nice rest of the day folks!
[17:08:04] o/ bye luca!
[17:10:27] I'm logging off as well folks, cu tomorrow!
[17:34:41] Machine-Learning-Team, Growth-Team, GrowthExperiments: importOresTopics script fails to import topics - https://phabricator.wikimedia.org/T350137 (Sgs) >>! In T350137#9333625, @klausman wrote: > For technical/simplicity matters, the model_info query was simplified and doesn't support all the parameter...
[17:42:23] Machine-Learning-Team, GrowthExperiments, Growth-Team (Sprint 3 (Growth Team)), Patch-For-Review: importOresTopics script fails to import topics - https://phabricator.wikimedia.org/T350137 (Sgs)
[17:42:25] Machine-Learning-Team, GrowthExperiments, Growth-Team (Sprint 3 (Growth Team)), Patch-For-Review: importOresTopics script fails to import topics - https://phabricator.wikimedia.org/T350137 (Sgs) a:klausman→Sgs
[17:42:42] Machine-Learning-Team, GrowthExperiments, Growth-Team (Sprint 3 (Growth Team)), Patch-For-Review: importOresTopics script fails to import topics - https://phabricator.wikimedia.org/T350137 (Sgs) p:Triage→Lowest
[22:31:08] Machine-Learning-Team, Add-Link, Growth-Team (Sprint 3 (Growth Team)), User-notice: Deploy "add a link" to 15th round of wikis - https://phabricator.wikimedia.org/T308141 (Quiddity) Just to confirm for Tech News purposes: Is this releasing next week //even though there isn't a deployment train//...