[13:01:53] Hey! I am trying to migrate this query: https://grafana.wikimedia.org/goto/ARzG0tAHg?orgId=1 to use prometheus
[13:02:15] This is what I came up with: https://grafana.wikimedia.org/goto/YHwGApAHR?orgId=1 but the numbers are very off
[13:03:04] Any ideas on how to fix it?
[13:03:38] The Graphite version is in RPM, while the Prometheus one is in RPS. Multiply the Prometheus value by 60 to convert it.
[13:03:42] nemo-yiannis:
[13:03:59] ha, ok thanks, checking
[13:09:44] it's getting closer but they still don't match exactly
[13:10:19] https://usercontent.irccloud-cdn.com/file/jsLrMZyM/image.png
[13:15:06] try to replace 2m0s with $__rate_interval
[13:15:22] nemo-yiannis:
[13:15:45] i am using rate internal
[13:16:29] *interval
[13:16:56] You're right, sorry
[13:59:55] metrics are computed differently: StatsD exports already aggregated metrics, while Prometheus computes the rate using the counter's value. So, I think there could be a discrepancy. cwhite herron godog denisse: do you have any other ideas? The prom expression provided by Yiannis seems fine to me.
[14:22:53] tappof: hi! your commit 70a0617e4d2e1ccd70719e10dc5d700f0ae2d7ac on the operations/alerts repo broke the documentation for me :)
[14:23:32] docker: Error response from daemon: create .: volume name is too short, names should be at least two alphanumeric characters
[14:23:32] vgutierrez: hey, I'll take a look
[14:23:43] that's the error I'm getting on bookworm using docker.io
[14:23:57] version 20.10.24+dfsg1-1+deb12u1
[14:27:59] FWIW replacing "." as the path with $(pwd) works for me
[14:30:19] vgutierrez: yes, it is what I was testing in the background
[15:27:38] nemo-yiannis: I checked the code and you'll need name="rcache" in the prometheus metric since you are comparing with MediaWiki.RevisionOutputCache.rcache.hit.rate
[15:27:58] with sum(rate(mediawiki_RevisionOutputCache_operation_total{status="hit",name="rcache"}[$__rate_interval])) * 60 I get much more similar numbers
[15:28:37] name="parsoid_rcache" being the other big hitter
[15:30:43] thanks godog
[15:30:46] looking
[16:39:17] BTW, I'd like some help debugging why this alert test is failing: https://gerrit.wikimedia.org/r/c/operations/alerts/+/1135050
[16:39:26] * vgutierrez seriously puzzled
[16:43:10] vgutierrez: Looking.
[16:45:46] vgutierrez: I think I know what's going on: in your alert you have `sum by (key, instance)`, so Prometheus requires those 2 labels, but in the test the `exp_labels:` is missing the key label.
[16:52:40] I'd suggest modifying your test like this so it includes the mandatory labels (key, instance). https://www.irccloud.com/pastebin/sFXU0GX3/
[17:25:03] will do. thx denisse
[21:32:03] thx for taking care of that CR yourself denisse <3
[21:33:03] vgutierrez: Sure thing, anytime! I originally wanted to push to a different branch but I ended up pushing to yours, I'll write why I changed some stuff to get the tests working in the CR. :)
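
Note: the RPS-to-RPM conversion and the effect of the name="rcache" filter from the 13:03 and 15:27 messages can be sketched roughly as follows. This is only an illustration under stated assumptions: the counter values are made up, and `per_second_rate` is a simplified stand-in for PromQL `rate()` (the real function also handles counter resets and extrapolates to the window edges), so it is not how Prometheus itself computes the number.

```python
from typing import Dict, List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)
Labels = Tuple[str, str]      # (status, name)


def per_second_rate(samples: List[Sample]) -> float:
    """Simplified rate: counter increase divided by elapsed seconds.

    Assumption: no counter resets and no window-edge extrapolation.
    """
    if len(samples) < 2:
        return 0.0
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical counter samples, keyed by (status, name).
series: Dict[Labels, List[Sample]] = {
    ("hit", "rcache"): [(0.0, 1000.0), (60.0, 1300.0), (120.0, 1600.0)],
    ("hit", "parsoid_rcache"): [(0.0, 500.0), (60.0, 620.0), (120.0, 740.0)],
    ("miss", "rcache"): [(0.0, 50.0), (60.0, 80.0), (120.0, 110.0)],
}


def hits_per_minute(data: Dict[Labels, List[Sample]],
                    name: Optional[str] = None) -> float:
    """Sum the per-second hit rates, then multiply by 60 to get per-minute."""
    per_second = sum(
        per_second_rate(samples)
        for (status, series_name), samples in data.items()
        if status == "hit" and (name is None or series_name == name)
    )
    return per_second * 60


if __name__ == "__main__":
    # Without the name filter, every cache's hits are summed together.
    print("all hits per minute:", hits_per_minute(series))
    # With name="rcache", only the series matching the Graphite metric counts.
    print("rcache hits per minute:", hits_per_minute(series, name="rcache"))
```

With these made-up samples, leaving out the name filter inflates the total by the parsoid_rcache hits, which is the kind of mismatch the filter removes.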