[09:35:53] sorry for the slightly naive question, but how do I link a change-id in a commit message? I tried with the first N (7) chars but it seems that gerrit does not perceive it as a link... ex: https://gerrit.wikimedia.org/r/c/operations/puppet/+/938807
[09:36:24] fabfur: all of it
[09:36:29] fabfur: just put all of it
[09:36:39] ok thanks!
[09:36:41] not sure if it accepts a shorter version
[09:45:34] I'm seeking a kind soul for +1 on https://gerrit.wikimedia.org/r/c/operations/puppet/+/938810
[09:46:46] godog: what's the possible downside?
[09:47:59] XioNoX: prometheus gets a little unhappy with the new influx of metrics, though I doubt it
[09:48:09] and other issues we've already ironed out at this point
[09:49:08] godog: maybe check with the team in charge of Prometheus? :)
[09:49:22] * godog grabs mirror
[09:49:34] we're all good!
[09:49:43] I don't know the tool itself, but as long as the rollback is easy I'd say +1
[09:50:04] 20 -> 45 seems like a fine jump too
[09:51:05] ack, yeah "rollback" is essentially to set profile::prometheus::cadvisor::ensure: absent where we know it is causing problems
[09:51:21] thank you for taking a look XioNoX
[09:53:19] for the curious, one way to check deployment progress is https://thanos.wikimedia.org/graph?g0.expr=count%20by%20(site)%20(cadvisor_version_info%7Bsite%3D~%22(eqiad%7Ccodfw)%22%7D)&g0.tab=0&g0.stacked=0&g0.range_input=30m&g0.max_source_resolution=0s&g0.deduplicate=1&g0.partial_response=0&g0.store_matches=%5B%5D
[09:54:20] also the only mention of cadvisor in Wikitech is https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes#Monitoring with "This section needs a large update"
[09:56:11] hah, thank you I've added an action item to T108027
[09:56:12] T108027: Collect per-cgroup cpu/mem and other system level metrics - https://phabricator.wikimedia.org/T108027
[09:57:09] <3
[09:57:47] wow, a task from 2015, must be satisfying to make progress on it!
[10:00:58] yeah it does, one of those things that constantly fell off the prioritization
[16:02:36] herron: nothing to report from today
[16:03:05] jbond: ack thanks!
[16:10:55] hi oncallers!
[16:11:10] as FYI today I started some kafka topic partition moves for https://phabricator.wikimedia.org/T341558 in kafka main-codfw
[16:11:30] so far all good, changeprop and job-queues look fine
[16:11:38] but if you see anything weird, ping me
[16:12:26] one thing that we may need to do is roll restart the pods in k8s codfw for changeprop and jobqueues, if there is any report of weird behavior (in the past changeprop's kafka client didn't like too many changes)