[07:01:17] serviceops, Data-Platform-SRE (2025.07.05 - 2025.07.25), SRE Observability (FY2025/2026-Q1): Ensure DPE SRE can receive alerts for applications hosted in wikikube - https://phabricator.wikimedia.org/T398073#10982077 (fgiunchedi) Thank you for the additional context @BTullis; with that in mind I'd say...
[10:25:16] serviceops, MW-on-K8s, Data-Platform-SRE (2025.07.05 - 2025.07.25), Discovery-Search (2025.06.13 - 2025.07.04), MW-1.45-notes (1.45.0-wmf.8; 2025-07-01): Investigate EQIAD daily completion suggester rebuild failure - https://phabricator.wikimedia.org/T395465#10982877 (Clement_Goubert) >>! In...
[11:33:17] serviceops, DBA: Expose tables catalog in noc - https://phabricator.wikimedia.org/T398943 (Ladsgroup) NEW
[11:50:08] serviceops, Data-Platform-SRE, Machine-Learning-Team, Prod-Kubernetes, Kubernetes: Update kserve to v0.13.0 on ML clusters - https://phabricator.wikimedia.org/T380722#10983317 (isarantopoulos) @klausman Shall we rename this task and switch to a newer version? A candidate could be the latest v...
[11:57:43] serviceops, Data-Platform-SRE, Machine-Learning-Team, Prod-Kubernetes, Kubernetes: Update kserve to v0.13.0 on ML clusters - https://phabricator.wikimedia.org/T380722#10983360 (klausman) >>! In T380722#10983317, @isarantopoulos wrote: > @klausman Shall we rename this task and switch to a newe...
[11:59:13] serviceops, DBA: Expose tables catalog in noc - https://phabricator.wikimedia.org/T398943#10983372 (taavi)
[12:00:10] serviceops, Data-Platform-SRE, Machine-Learning-Team, Prod-Kubernetes, Kubernetes: Update kserve to v0.15.2* on ML clusters - https://phabricator.wikimedia.org/T380722#10983373 (klausman)
[12:02:23] serviceops, Data-Platform-SRE, Machine-Learning-Team, Prod-Kubernetes, and 2 others: Update knative-serving+net-istio to v1.12.x on ML clusters - https://phabricator.wikimedia.org/T380723#10983382 (isarantopoulos)
[12:22:35] hi hnowlan, we're experiencing lag while consuming data from kafka-logging (https://grafana.wikimedia.org/goto/Y_hrssyHR?orgId=1). The lag seems to be related to this patch: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1167160, as the timing matches. Would it be possible to revert the patch?
[12:22:37] serviceops, Patch-For-Review: Update api-gateway ratelimit version - https://phabricator.wikimedia.org/T388804#10983491 (hnowlan) We're now using the latest HEAD of the ratelimit service on bullseye and have removed the prometheus-statsd-exporter from the api-gateway deployment.
[12:23:40] Or maybe we could consider sampling these logs: https://logstash.wikimedia.org/goto/2ebf02662482cb35300a0a8ef3b9d6c1
[12:24:26] tappof: yeah we can revert for sure
[12:24:35] thx hnowlan
[12:27:07] tappof: I think the log output generated by testing is the issue rather than the change deployed - to stop the immediate impact I've turned the log level down
[12:27:12] sorry for the disruption
[12:27:26] serviceops, DBA: Expose tables catalog in noc - https://phabricator.wikimedia.org/T398943#10983531 (Ladsgroup) p:Triage→Medium
[12:35:11] ok hnowlan, the metrics are now going down. thanks..
[12:37:28] serviceops, DBA, noc.wikimedia.org: Expose tables catalog in noc - https://phabricator.wikimedia.org/T398943#10983571 (A_smart_kitten)
[12:54:20] serviceops: Package Wikimedia's PHP 8.1 component for bookworm - https://phabricator.wikimedia.org/T397075#10983608 (Jdforrester-WMF) @Scott_French, do you know if this work is intended?
[13:54:36] serviceops, Data-Platform-SRE, Machine-Learning-Team, Prod-Kubernetes, and 2 others: Update kserve to v0.15.2* on ML clusters - https://phabricator.wikimedia.org/T380722#10984012 (isarantopoulos)
[13:54:55] serviceops, Data-Platform-SRE, Machine-Learning-Team, Prod-Kubernetes, and 3 others: Update knative-serving+net-istio to v1.12.x on ML clusters - https://phabricator.wikimedia.org/T380723#10984018 (isarantopoulos)
[14:22:04] serviceops, Deployments, Release-Engineering-Team (Radar), Wikimedia-production-error: httpb sometimes fails upon deployment with a HTTP 503 - https://phabricator.wikimedia.org/T380958#10984246 (akosiaris) a:akosiaris No new reports, I'll resolve, feel free to reopen.
[14:22:22] serviceops, Deployments, Release-Engineering-Team (Radar), Wikimedia-production-error: httpb sometimes fails upon deployment with a HTTP 503 - https://phabricator.wikimedia.org/T380958#10984248 (akosiaris) Open→Resolved