[15:50:50] ryankemper: do you have a sense of the timeline to have the SLO in place? it may be the case that a cut over could be prioritized. I think in any event it'll be good to coordinate via a task as well
[15:51:37] I'm having a look at the calendar to schedule a meeting and between TZ and ooo it's looking like starting async is the way to go
[16:04:12] herron: o/
[16:04:19] https://thanos.wikimedia.org/rules#liftwing-requests-revscoring looks really good afaics :)
[16:04:54] for some reason pyrra shows the old revscoring slos as well in https://slo.wikimedia.org/
[16:05:01] but it may be temporary
[16:05:28] so if we restrict the istio metrics a lot they seem to collaborate with thanos rules
[16:05:56] if everything looks good I can file another change to add more lift wing services
[16:06:16] or better, I should try also the latency SLO
[16:08:21] mmm no wait
[16:08:29] the rules in https://thanos.wikimedia.org/rules#liftwing-requests-revscoring are the old ones
[16:08:32] sgh
[16:11:21] took the liberty to restart pyrra-filesystem on thanos nodes
[16:32:37] herron: \o we've got https://phabricator.wikimedia.org/T338009 to cover the work in general but I can make a ticket that's more tailored to your guys' context
[16:33:02] ryankemper: ok sounds good, I'll have a look at that as well thank you
[16:33:20] elukey: hmm seeing an error in pyrra-filesystem having a look
[16:34:10] as far as timeline, we were hoping to have a set of grizzly dashboards up in the next couple weeks but we'll see how realistic that is :P
[16:34:39] herron: ah snap I see it
[16:34:42] lemme check the patch
[16:34:46] wrt cutover, is the intent to eventually have `mw.track` ultimately feeding into prometheus instead of graphite basically?
[16:35:02] herron: fix incoming
[16:35:06] elukey: thank you!
[16:36:52] ryankemper: yeah the intent is to turn down graphite altogether in favor of prometheus, after migrating all our dependencies over
[16:37:41] herron: sigh https://gerrit.wikimedia.org/r/c/operations/puppet/+/978633 my fault
[16:38:18] elukey: no worries it needs improved tests
[17:04:05] herron: ok now it works!
[17:04:06] https://thanos.wikimedia.org/rules#liftwing-requests-revscoring
[17:04:39] the only outlier is istio_requests:increase12w, that takes ~6 seconds
[17:12:21] elukey: nice!
[17:13:40] not sure if 6s is ok though