[15:33:02] 17:30:57 1. 46510ms to run UploadFromUrlTest::testSyncDownload
[15:33:10] 46 seconds for one test case seems a bit long
[15:46:19] Krinkle: JWT session cookies are ready to roll out, and they are a bit bulky (around 0.9K) so I'm trying to figure out what performance signals to watch
[15:46:33] okay
[15:46:58] it seems like NavigationTiming records an isAnon flag, but it gets discarded during the eventlogging -> prometheus transformation
[15:47:20] broadly two things: rum and synthetic. rum is real-user with less detail but higher coverage (everyone is included), synthetic has more detail but is specific to certain contexts
[15:47:21] what do you think about turning that into a prometheus label?
[15:48:22] https://gerrit.wikimedia.org/g/performance/navtiming/+/94fa387fa9d96fc6c4ce1d391bbaadb5865e5a74/navtiming/__init__.py#749
[15:48:40] hm.. yeah we lost some detail in the graphite>grafana transition
[15:49:15] I would have expected the opposite since these were useless in graphite due to low sampling but in Prometheus, low sampling is not an issue since you can accurately aggregate over larger timespans
[15:49:34] but we had to slim down a lot due to cardinality cost being high in Prometheus scraping
[15:49:49] `auth` used to be a label
[15:49:59] now we simplified it to mw_context with just three variants
[15:50:10] we can add e.g. auth_mainspace_view?
[15:50:56] yeah that would be very useful during various session work
[15:52:09] the synthetic metrics are all anonymous, right? https://grafana.wikimedia.org/d/IvAfnmLMk/synthetic-testing-page-drilldown is very cool, but other than the login user journey, I don't see anything authentication related
[15:52:46] and login is probably too slow to be worth looking at
[15:53:49] what's interesting is time to first byte changes on article views, especially edge cached article views, I think
[15:54:47] I'll try to make a patch for the navtiming proxy then
[16:00:46] no, we have a login journey with several pageviews
[16:00:46] https://grafana-rw.wikimedia.org/d/d-pdqGBGdse/wikipedia-login-user-journey?var-function=median&orgId=1&from=now-7d&to=now&timezone=utc&var-browser=chrome&var-connectivity=4g
[16:00:51] https://grafana-rw.wikimedia.org/d/IvAfnmLMk/synthetic-testing-page-drilldown?var-function=median&orgId=1&from=now-2d&to=now&timezone=browser&var-path=desktop&var-testtype=userJourneyLogin&var-group=en_wikipedia_org&var-page=_wiki_Barack_Obama&var-browser=chrome&var-connectivity=4g
[16:01:18] LoginPage > login (submit/Main_Page) > Obama > Facebook
[16:02:21] so we'll have TTFB, render times, html response size (inc headers) etc
[16:02:39] the detail is in the drilldown one, with various collapsed sections
[16:03:25] maybe we could add a beta cluster or group0 wiki to that. right now that one does not have variants afaik
[16:03:39] oh, nice
[16:03:54] well, it has variants like mobile/desktop, firefox/chrome
[16:04:00] but not different wikis
[16:09:09] yeah, something in group 0 would be nice
[16:09:28] beta differs in so many ways, I'm not sure I'd trust anything I see there
[16:22:49] it seems the per-platform metrics still include authentication information, maybe that's good enough
[16:22:52] https://gerrit.wikimedia.org/r/plugins/gitiles/performance/navtiming/+/94fa387fa9d96fc6c4ce1d391bbaadb5865e5a74/navtiming/__init__.py#770
[16:29:48] I believe that's statsd/graphite not Prometheus, no longer written I think. Unless that part is still read-write in graphite
[16:30:21] Kind of like how copyToStatsd is a no-op
[20:38:34] o/
[20:39:04] I have decided to come back to #mediawiki-core after *cough* a long time :]
[20:50:51] :o hi
[21:19:06] welcome back :)
[22:20:22] RoanKattouw: I believe you shared this with me originally, might come in handy. https://github.com/krinkle/dotfiles/commit/0b0ea5ac7d ref T404739
[22:20:22] T404739: "kube-env: command not found" when in GNU screen - https://phabricator.wikimedia.org/T404739
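
A minimal sketch of the label idea discussed from 15:46:58 to 15:50:56: folding the isAnon flag into a small, fixed set of mw_context values so the extra label costs only one more series per metric. This is not the navtiming code. Only isAnon, mw_context and auth_mainspace_view come from the log; the namespaceId and action fields and the other two label values are assumptions made for illustration.

```python
# Hypothetical sketch, not taken from navtiming/__init__.py.
# Assumed: events are dicts with "namespaceId", "action" and "isAnon" keys,
# and the label values "anon_mainspace_view" and "other" are made-up names.

ALLOWED_CONTEXTS = ("anon_mainspace_view", "auth_mainspace_view", "other")


def mw_context(event: dict) -> str:
    """Map an event to exactly one value from a fixed, small label set."""
    is_mainspace_view = (
        event.get("namespaceId") == 0 and event.get("action") == "view"
    )
    if not is_mainspace_view:
        return "other"

    is_anon = event.get("isAnon")
    if is_anon is True:
        return "anon_mainspace_view"
    if is_anon is False:
        return "auth_mainspace_view"
    # Missing or malformed flag: fall back instead of guessing.
    return "other"
```

Because every event maps to one of a closed set of values, the label cannot grow unbounded, which is the cardinality concern raised at 15:49:34.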