[10:51:50] serviceops, Patch-For-Review: Improve detection of kafka-main broker TLS certificate rotations - https://phabricator.wikimedia.org/T410552#11415265 (Blake) Cool, we can now see this in https://thanos.wikimedia.org/alerts, searching for "KafkaRollingRestartRequired". I'll now start work on the Phab receiver.
[11:38:31] serviceops, Infrastructure-Foundations, OKR-Work: rest gateway: Record x-trusted-request and x-provenance headers in access logs - https://phabricator.wikimedia.org/T411250 (daniel) NEW
[11:43:47] serviceops: hcaptcha proxy: bump connection limits + stress test - https://phabricator.wikimedia.org/T411141#11415458 (Raine)
[11:43:47] serviceops: WE6.2.6: ☂️ hcaptcha-proxy Production Readiness Review - https://phabricator.wikimedia.org/T410626#11415459 (Raine)
[11:43:48] serviceops: ☂️ [FY2025-26][Hypothesis] WE6.2.1 Production Readiness Checklist - https://phabricator.wikimedia.org/T400263#11415462 (Raine)
[11:53:56] serviceops, Traffic, WE4.2 Bot detection: hcaptcha-proxy health checks should also depool sites if their upstream is unreachable - https://phabricator.wikimedia.org/T411191#11415499 (Raine) p:Triage→Low >>! In T411191#11413343, @ssingh wrote: > But there are more considerations than this. Giv...
[11:57:38] serviceops: Improve hcaptcha-proxy alerting - https://phabricator.wikimedia.org/T411251 (Raine) NEW
[12:01:02] serviceops, Infrastructure-Foundations, SRE-tools, Patch-For-Review: Add a --rack flag to sre.k8s.pool-depool-node - https://phabricator.wikimedia.org/T410537#11415521 (MLechvien-WMF) The new --rack argument will be mutually exclusive with the hosts query. The following changes have also been fa...
[12:01:39] serviceops, Growth-Team, PageViewInfo, Content-Transform-Team (Work In Progress), and 3 others: Determine the source of internal requests going through the API gateway. - https://phabricator.wikimedia.org/T410198#11415525 (daniel) >>! In T410198#11378640, @hnowlan wrote: > All of the 10.192 addre...
[12:25:40] serviceops, Growth-Team, PageViewInfo, Content-Transform-Team (Work In Progress), and 3 others: Determine the source of internal requests going through the API gateway. - https://phabricator.wikimedia.org/T410198#11415578 (akosiaris) >>! In T410198#11415525, @daniel wrote: >>>! In T410198#1137864...
[12:26:10] serviceops: Improve hcaptcha-proxy Grafana dashboard - https://phabricator.wikimedia.org/T411254 (Raine) NEW
[12:27:13] serviceops: Improve hcaptcha-proxy alerting - https://phabricator.wikimedia.org/T411251#11415591 (Raine)
[12:29:38] serviceops: Monitor hCaptcha status - https://phabricator.wikimedia.org/T411255 (Raine) NEW
[12:35:34] serviceops, Growth-Team, PageViewInfo, Content-Transform-Team (Work In Progress), and 3 others: Determine the source of internal requests going through the API gateway. - https://phabricator.wikimedia.org/T410198#11415638 (daniel) >>! In T410198#11415578, @akosiaris wrote: > None of these IP belo...
[12:37:27] serviceops: Draft hCaptcha SLOs, document SLIs - https://phabricator.wikimedia.org/T411256 (Raine) NEW
[12:39:08] serviceops, Infrastructure-Foundations, SRE-tools, Patch-For-Review: Add a --rack flag to sre.k8s.pool-depool-node - https://phabricator.wikimedia.org/T410537#11415666 (MLechvien-WMF) With test-cookbook in dryrun @Raine and I tested following 6 test cases: - In cluster wikikube-eqiad, action c...
[12:41:03] o/ [very much not urgent!] I would appreciate reviews for the patches on T404507
[12:41:36] (fake stashbot) T404507: Redirect legacy language codes for Toki Pona to tok.wikipedia.org - https://phabricator.wikimedia.org/T404507
[12:41:49] I always forget that bot is not here
[12:44:32] serviceops: Draft hCaptcha SLOs, document SLIs - https://phabricator.wikimedia.org/T411256#11415701 (Raine)
[13:01:22] taavi: TIL toki pona :D
[13:02:28] Raine: as you can see, I have been very thoroughly nerd-sniped about that :P
[13:03:23] apparently :D
[13:03:27] serviceops, Infrastructure-Foundations, SRE-tools, Patch-For-Review: Add a --rack flag to sre.k8s.pool-depool-node - https://phabricator.wikimedia.org/T410537#11415802 (MLechvien-WMF) One more test case: a rack that does not exist in the cluster: `test-cookbook -c 1212089 --dry-run sre.k8s.pool-d...
[13:04:17] * Raine promises this will not rekindle the interslavic nerdsnipe today
[13:07:07] anyway, thanks for the reviews! I'll ship those out probably early next week
[13:07:34] 👍
[14:22:13] serviceops, Growth-Team, PageViewInfo, Content-Transform-Team (Work In Progress), and 3 others: Determine the source of internal requests going through the API gateway. - https://phabricator.wikimedia.org/T410198#11416038 (akosiaris) >>! In T410198#11415638, @daniel wrote: >>>! In T410198#1141557...
[14:31:22] serviceops, envoy, SRE: Upgrade Envoy to v1.35.6 - https://phabricator.wikimedia.org/T410975#11416093 (hashar) I have updated the [[ https://integration.wikimedia.org/ci/job/helm-lint/ | helm-lint jenkins job ]].
[15:23:22] serviceops, observability: Create a visual representation of where each service is active from, any given time - https://phabricator.wikimedia.org/T327663#11416262 (MLechvien-WMF) Could we clarify more the requirements: - Scope: only `codfw` and `eqiad` DCs? all deployed services (available from Spicer...
[17:51:59] serviceops, observability: Create a visual representation of where each service is active from, any given time - https://phabricator.wikimedia.org/T327663#11416513 (Raine) Is anything still needed beyond the functionality in `sudo cookbook -d sre.discovery.datacenter status all`? That provides the follow...