[10:16:55] Machine-Learning-Team, ORES, MediaWiki-Core-Preferences, Moderator-Tools-Team (Kanban), Patch-For-Review: When ORES quality filters are selected in mobile web, entries should be highlighted - https://phabricator.wikimedia.org/T314026 (Samwalton9)
[10:19:21] Machine-Learning-Team, ORES, MediaWiki-Core-Preferences, Moderator-Tools-Team (Kanban), Patch-For-Review: When ORES quality filters are selected in mobile web, entries should be highlighted - https://phabricator.wikimedia.org/T314026 (Samwalton9)
[10:19:47] Machine-Learning-Team, ORES, Growth-Team, MediaWiki-Recent-changes, Moderator-Tools-Team (Kanban): 'Highlight likely problem edits' preference doesn't select any filters in mobile web - https://phabricator.wikimedia.org/T318683 (Samwalton9)
[10:35:40] Morning everyone (sort of)
[12:15:42] Syncing articletopic in serve-eqiad (for the ndots config) now
[12:18:04] New pods are up.
[12:18:48] I'll proceed with the remaining pods/sets in alphabetical order.
[12:46:37] All pods in serve-codfw synced. DNS error rate is dropping, but not quite stable yet. Will keep an eye on it.
[12:51:37] Heads up: my ISP's local PoP is having trouble. I am unaffected, but might drop out of the meeting(s) suddenly.
[12:55:00] Hi all!
[12:55:07] \o
[12:55:53] https://grafana.wikimedia.org/d/-sq5te5Wk/kubernetes-dns?orgId=1&var-dc=codfw%20prometheus%2Fk8s-mlserve&refresh=30s&viewPanel=36&from=now-1h&to=now looks very nice
[12:57:06] (quick check-in since I was very curious; the graphs look really nice :)
[12:57:44] Some pods in rs-eq-* were not restarted, I presume because their charts were not affected.
[12:58:19] I tested wrk and now I see similar rps on ml-serve-codfw too for articlequality
[12:58:23] ~50 rps
[12:58:44] So serve and staging in codfw are comparable now?
[13:00:09] Seems so, yes
[13:00:18] The pressure on the coredns pods went down as well
[13:01:53] We can probably do a little more
[13:03:38] Yeah, agreed. I think the TTL might be a good angle if we can find a sweet spot that is better than 5s
[13:05:18] gooood
[13:05:22] going afk again, ttl!
[13:05:26] \o
[13:06:23] The error rate is now well below the no-error rate, which in turn has increased only mildly.
[13:06:32] About 3k no-error qps, and <2.3k errors/s.
[13:08:35] https://thanos.wikimedia.org/graph?g0.expr=irate(coredns_dns_request_count_total%7Bprometheus%3D%22k8s-mlserve%22%2C%20site%3D%22codfw%22%7D%5B5m%5D)&g0.tab=0&g0.stacked=0&g0.range_input=6h&g0.max_source_resolution=0s&g0.deduplicate=1&g0.partial_response=0&g0.store_matches=%5B%5D is interesting as well
[13:08:40] (now I go away for real :D)
[13:08:53] shoo! shoo!
[13:10:02] Yes, better diversity in DNS queries. We'll have to see if that holds.
[14:46:46] "serve and staging in codfw are comparable now" > woohoo! that's really great! \o/
[15:20:07] Lift-Wing, Machine-Learning-Team, Epic, Research (FY2022-23-Research-July-September): Create a language agnostic model to predict reverts on Wikipedia - https://phabricator.wikimedia.org/T314385 (diego) **Updates** * @MunizaA had written the model to be hosted in Liftwing and shared with @achou....
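
Context on the ndots change discussed above: with a high `ndots` value (Kubernetes defaults to 5), a resolver tries every search domain before the literal name for most lookups, so each external hostname fans out into several DNS queries, most of them NXDOMAIN. A minimal sketch of that expansion, assuming glibc-style behaviour and illustrative Kubernetes search domains (not the actual resolv.conf of the ml-serve clusters):

```python
# Sketch of search-domain expansion driven by ndots.
# The search domains below are illustrative Kubernetes defaults,
# not the real configuration of the ml-serve pods.
SEARCH_DOMAINS = [
    "default.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
]

def candidate_queries(name: str, ndots: int = 5) -> list[str]:
    """Return the DNS names tried, in order, for a given lookup."""
    if name.endswith("."):                 # already fully qualified
        return [name]
    if name.count(".") >= ndots:           # enough dots: literal name first
        return [name] + [f"{name}.{d}" for d in SEARCH_DOMAINS]
    # fewer dots than ndots: search domains first, literal name last
    return [f"{name}.{d}" for d in SEARCH_DOMAINS] + [name]

# An external hostname with two dots, resolved with ndots=5, triggers four
# lookups, three of which can only return NXDOMAIN; lowering ndots (or using
# FQDNs) is what cuts the error rate seen by the coredns pods.
print(candidate_queries("thanos.wikimedia.org"))
print(candidate_queries("thanos.wikimedia.org", ndots=1))
```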
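
The ~50 rps figure above came from wrk. A rough Python equivalent of that kind of throughput check is sketched below; the endpoint URL and payload are placeholders, since the real articlequality predictor is only reachable from inside the ml-serve clusters:

```python
import concurrent.futures
import time

import requests

# Placeholder endpoint and body: not the actual Lift Wing route or schema.
URL = "http://example.internal/v1/models/enwiki-articlequality:predict"
PAYLOAD = {"rev_id": 123456}
CONCURRENCY = 8
DURATION_S = 30

def worker(deadline: float) -> int:
    """Send requests back-to-back until the deadline; return how many completed."""
    done = 0
    while time.monotonic() < deadline:
        resp = requests.post(URL, json=PAYLOAD, timeout=5)
        resp.raise_for_status()
        done += 1
    return done

deadline = time.monotonic() + DURATION_S
with concurrent.futures.ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    totals = list(pool.map(worker, [deadline] * CONCURRENCY))

print(f"{sum(totals) / DURATION_S:.1f} req/s over {DURATION_S}s with {CONCURRENCY} workers")
```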
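
For reference, the Thanos link above encodes the query `irate(coredns_dns_request_count_total{prometheus="k8s-mlserve", site="codfw"}[5m])`. A sketch of pulling the same data through the Prometheus-compatible HTTP API follows; the base URL and unauthenticated access are assumptions, as the internal query endpoints may differ:

```python
import requests

# Assumed query endpoint; the public frontend may not expose the API directly.
THANOS_API = "https://thanos.wikimedia.org/api/v1/query"

QUERY = 'irate(coredns_dns_request_count_total{prometheus="k8s-mlserve", site="codfw"}[5m])'

resp = requests.get(THANOS_API, params={"query": QUERY}, timeout=10)
resp.raise_for_status()

# Print each series in the instant vector: label set plus current rate,
# which is the per-pod/per-query-type breakdown behind the graph.
for series in resp.json()["data"]["result"]:
    print(series["metric"], series["value"][1])
```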