[08:06:08] Traffic, Data Pipelines, Data-Engineering-Planning, Foundational Technology Requests, and 2 others: Add a webrequest sampled topic and ingest into druid/turnilo - https://phabricator.wikimedia.org/T314981 (elukey) Open→Resolved a:elukey Closing the task since nobody opposed to my earl...
[10:29:33] godog: any idea why https://grafana.wikimedia.org/goto/vIDVSGKVk?orgId=1 doesn't work on thanos but works as expected on the global eqiad instance?
[10:38:06] checking
[10:39:02] vgutierrez: yeah because site_job_backend:trafficserver_backend_requests:avail5m is calculated only on the global instance, and since thanos effectively acts as a global instance we don't query 'global'
[10:39:40] ack
[10:39:41] we're migrating off the global instance though, in which case we'll need to port the recording rules to 'ops' instance in this case
[10:40:09] I've spotted it after migrating https://grafana-rw.wikimedia.org/d/000000479/cdn-frontend-traffic?orgId=1 to thanos
[10:40:17] task is T288196 FWIW
[10:40:18] T288196: Retire Prometheus 'global' instance - https://phabricator.wikimedia.org/T288196
[10:40:42] interesting, yeah I'm happy to assist in moving those rules off 'global'
[10:41:59] vgutierrez: if you have a list of metrics that broke I'll send a patch for those
[10:43:28] basically we need to migrate the traffic metrics defined on rules_global.yml
[10:45:01] and I'd say that nginx ones can be safely deprecated
[10:45:40] *nod* ok
[10:46:39] I'm happy to trade that work for reviews on https://gerrit.wikimedia.org/r/q/topic:bug%252FT323913 :D
[10:48:04] * vgutierrez has been tricked
[10:48:34] * vgutierrez looking
[10:48:40] heheh it is a negotiation, you can say no and that's fine too !
[10:48:51] thank you though
[11:14:56] ok I think I got it https://gerrit.wikimedia.org/r/c/operations/puppet/+/861825
[11:15:45] after merge it'll take some time to accumulate data, but after that dashboards can be switched
[11:17:33] can we do that in two stages?
[11:17:51] add the new ones today and remove the old ones by the end of the week?
[11:18:47] ah yeah totally
[11:18:50] will amend
[11:20:48] cheers
[11:27:41] ok my dns patch wasn't correct, in the sense that I ran into the same problem as T263518
[11:27:41] T263518: dns repository left in a broken state - https://phabricator.wikimedia.org/T263518
[11:27:44] will amend
[12:00:55] Hey vgutierrez thanks for the review. Regarding https://gerrit.wikimedia.org/r/c/operations/puppet/+/861821 I don't have +2 permissions for puppet, can you also merge the change ?
[12:01:32] sure
[12:01:41] thanks!
[12:01:55] (done)
[16:05:18] netops, Infrastructure-Foundations, SRE: ICMPv6 'TTL Exceeded' messages are not generated by row E/F switches due to loopback filter - https://phabricator.wikimedia.org/T324033 (cmooney) p:Triage→Medium
[16:11:48] netops, Infrastructure-Foundations, SRE: ICMPv6 'TTL Exceeded' messages are not generated by row E/F switches due to loopback filter - https://phabricator.wikimedia.org/T324033 (cmooney)
[16:28:43] netops, Infrastructure-Foundations, SRE, Patch-For-Review: ICMPv6 'TTL Exceeded' messages are not generated by row E/F switches due to loopback filter - https://phabricator.wikimedia.org/T324033 (cmooney) For the record after comparing the loopback fitlters on lo0.0 (common-loopback.pol) and lo0....
[17:56:38] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (RobH)
[18:29:06] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp5021.eqsin.wmnet with OS buster
[18:42:28] Traffic, DC-Ops, SRE, ops-eqsin, Patch-For-Review: Q2:rack/setup/install eqsin refresh - https://phabricator.wikimedia.org/T322048 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp5021.eqsin.wmnet with OS buster executed with errors: - cp5021 (**...