[10:19:46] hmm, getting a bit confused about the clouddb-services project
[10:20:26] yes?
[10:20:40] is that project still used?
[10:20:49] there's only a puppetmaster, and one instance left
[10:21:17] clouddb-wikireplicas-query-1, created by brooke
[10:22:06] certainly less than previously. I'm not sure how useful the query sampler is atm, aiui d.r0ptp4kt used that during their wiki replica analysis work and the previous time it was used was during the last wiki replica rearchitecture
[10:23:20] that's ok, as long as someone still uses it, I'll try to clear the puppet ca cert alert
[10:23:58] hmm, there's some leftover hiera entries in the cloud-instance git repo for that project too
[10:24:43] so that project was historically used for both toolsdb and the wiki replica proxies, both of those have been moved to other places since
[10:26:26] good to know :), what does the query sampler do?
[10:28:29] my understanding is limited to "it samples the queries"
[10:28:47] so it saves data about the various types of queries ran on the replicas to help understand usage patterns
[10:30:29] that's interesting, how does it get the samples? Does it run an agent in the replicas of some sort?
[10:34:53] i think it just does `show processlist;` with an account that can see data for all users with that
[10:36:18] makes sense
[10:42:15] hmm, our puppetmaster VMs, carry over some ca certs from other (previous) instances, ex /var/lib/puppet/server/ssl/ca/ca_crt.pem on tools-puppetmaster-02 and on clouddb-services-puppetmaster-2 actually belong to the -1 instances
[10:44:06] https://www.irccloud.com/pastebin/GUqSR6KO/
[10:45:11] hmm, are we using those at all?
[10:49:31] yes we are, our puppetmasters use the CA generated by the old puppetmaster -01 instances :/
[10:59:17] if anyone has more info on how (if any) puppet ca certs are renewed/generated please add here T355410
[10:59:18] T355410: [clouddb-service-puppetmaster-2] Renew puppet CA certificates - https://phabricator.wikimedia.org/T355410
[13:46:25] Do we have any running examples of a k8s cluster having data scraped by prometheus?
[13:46:49] toolforge?
[13:47:29] So that collects various usage data on the cluster and how many pods are running things like that?
[13:48:34] yeah, we run https://github.com/kubernetes/kube-state-metrics for that
[13:49:21] Is that different than the metrics server?
[13:52:23] yes. iirc metrics-server collects data for kubernetes itself to use for autoscaling and some kubectl commands like `kubectl top node`, it does not integrate with prometheus at all
[13:53:58] that's my recollection also
[13:54:08] Sounds good. Do we have any documentation on how prometheus is scraping from toolforge (Both in what is setup on the prometheus side and the toolforge side)?
[13:54:11] we deploy both https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/blob/main/components/wmcs-k8s-metrics/helmfile.yaml?ref_type=heads
[14:00:06] we have some very limited docs https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes#Monitoring that I just expanded a bit too
[14:01:00] Thanks
[14:01:00] yep, I think puppet might be the best source of examples on how to do it
[14:03:17] https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/profile/manifests/toolforge/prometheus.pp#263
[14:03:38] and https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/profile/manifests/toolforge/prometheus.pp#348
[15:37:17] taavi: can I merge https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/180 ?
[15:37:29] dcaro: yes, thanks
[16:16:15] btw taavi, I pinged John about T345610 -- he's probably out today due to weather but will move the server on Monday. And he didn't notice the task update before, so next time a nag on irc is probably warranted.
[16:16:15] T345610: cloudrabbit: connect them via cloudsw and cloud-private - https://phabricator.wikimedia.org/T345610
[16:18:33] ok! thanks
[16:47:59] * dcaro off
[22:41:10] anyone available to +1 T355453 ?
[22:41:11] T355453: Quota increase for reading-web-staging - https://phabricator.wikimedia.org/T355453
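
Note on the 10:34:53 message: the log only guesses that the query sampler runs `show processlist;` from a privileged account, and the sampler's real code is not shown here. The sketch below is therefore an illustration of that idea, not the actual implementation. It assumes rows shaped like the `Command` and `Info` columns of a MySQL/MariaDB `SHOW PROCESSLIST` result; the function names `normalize` and `sample` are hypothetical.

```python
import re
from collections import Counter


def normalize(sql):
    """Replace literals with '?' so similar queries share one fingerprint."""
    sql = re.sub(r"'(?:[^'\\]|\\.)*'", "?", sql)  # single-quoted strings
    sql = re.sub(r"\b\d+\b", "?", sql)            # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()


def sample(rows):
    """Count query fingerprints in one processlist snapshot.

    `rows` is an iterable of dicts with (assumed) keys "Command" and "Info",
    mirroring the SHOW PROCESSLIST columns of the same names.
    """
    counts = Counter()
    for row in rows:
        # Skip idle connections and rows without query text.
        if row.get("Command") != "Query" or not row.get("Info"):
            continue
        counts[normalize(row["Info"])] += 1
    return counts


if __name__ == "__main__":
    snapshot = [
        {"Command": "Query", "Info": "SELECT * FROM page WHERE page_id = 42"},
        {"Command": "Query", "Info": "select *  from page where page_id = 7"},
        {"Command": "Sleep", "Info": None},
    ]
    for fingerprint, n in sample(snapshot).items():
        print(n, fingerprint)
```

Running such a snapshot periodically and summing the counters would give a rough picture of usage patterns without any agent on the replicas, which matches the description at 10:28:47.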