[00:10:30] Thanks to help and doc pointers from rzl, the Toolhub cronjob is now cleaning up its pod after a successful run. There is one Error state Pod that I guess will hang out indefinitely as I configured failedJobsHistoryLimit: 1 in the job spec.
[00:12:52] I would honestly not mind at all having more autonomous control over the Toolhub namespace in the wikikube cluster. Not being able to delete stale Pods and ReplicaSets is a minor annoyance in that they are clutter I need to think about when I look at the state of things with kubectl.
[00:13:25] * bd808 is spoiled by Toolforge
[00:16:40] bd808: if you `kube_env toolhub-deploy eqiad` you can indeed delete those things
[00:19:06] rzl: w00t! I either didn't know about or had forgotten that there was a -deploy variant to switch to credentials with more rights in the namespace. Thanks
[00:19:24] you're welcome, use cautiously :)
[09:03:33] hello folks! If anybody has time for a quick cpu requests/limits review https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1121315
[12:25:34] serviceops, Page Content Service, RESTBase Sunsetting, Epic: Pregeneration performance optimizations for PCS - https://phabricator.wikimedia.org/T386919 (Jgiannelos) NEW
[13:05:15] serviceops, Page Content Service, RESTBase Sunsetting, Epic: Pregeneration performance optimizations for PCS - https://phabricator.wikimedia.org/T386919#10567505 (Joe) For the namespace translation problem, one approach could be to modify the events we get from MediaWiki to include the normalized...
[13:13:20] serviceops, Content-Transform-Team, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926 (elukey) NEW
[13:14:52] serviceops, Content-Transform-Team, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10567544 (elukey)
[13:22:06] Filed another request for Wikikube capacity: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1121363
[13:22:16] of course for our dear Kartotherian :)
[13:22:21] lemme know your thoughts
[13:22:46] I can anticipate that I'll probably come back for more
[13:33:10] serviceops, Page Content Service, RESTBase Sunsetting, Epic: Pregeneration performance optimizations for PCS - https://phabricator.wikimedia.org/T386919#10567617 (Jgiannelos)
[14:53:57] serviceops, Content-Transform-Team, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10567921 (elukey)
[15:34:41] serviceops, Page Content Service, RESTBase Sunsetting, Epic: Pregeneration performance optimizations for PCS - https://phabricator.wikimedia.org/T386919#10568136 (Jgiannelos) Using the following queries: ` SELECT COUNT(*) FROM (SELECT regexp_like(uri_path, '/api/rest_v1/page/mobile-html/(User_t...
[16:01:39] serviceops, Content-Transform-Team, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10568229 (elukey)
[16:04:57] serviceops, Content-Transform-Team, Maps (Kartotherian): Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10568234 (elukey) We are currently handling ~150 rps in both DCs, served by 10 pods (each DC). The latency is good overall, there are big...
[16:14:09] serviceops, Content-Transform-Team, Maps (Kartotherian), Patch-For-Review: Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10568259 (elukey) For safety I just reduced the k8s workers pooled from 3 to 2, until we get a bump in the overall c...
[16:14:22] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian), Patch-For-Review: Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10568260 (elukey)
[16:18:42] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian), Patch-For-Review: Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10568284 (elukey)
[16:19:19] serviceops, Content-Transform-Team, Infrastructure-Foundations, Maps (Kartotherian), Patch-For-Review: Scale up Kartotherian on Wikikube and move live traffic to it - https://phabricator.wikimedia.org/T386926#10568288 (elukey) p:Triage→Medium
[16:29:50] Hi. We're planning to deploy changeprop momentarily, any objections?
[16:52:49] serviceops, Patch-For-Review, PHP 8.1 support: Update PCRE in PHP 8.1 images to PCRE 10.39 or newer - https://phabricator.wikimedia.org/T386006#10568460 (MatthewVernon) I've pushed both `dgit/bookworm` and `bookworm-wikimedia-bullseyebp` branches to gitlab, and CI has now built `10.42-1~wmf11+1` for...
[16:56:15] arlolra: go ahead
[16:58:39] thanks
[17:06:49] hnowlan: done. It's expected there's no staging resources there, right?
[17:07:14] serviceops, Release-Engineering-Team, Scap: OSError "Message too long" from scap helmfile diffs - https://phabricator.wikimedia.org/T386759#10568494 (Scott_French) Open→Resolved a:Scott_French Ahmon has kindly deployed scap 4.137.0, which picks up https://gitlab.wikimedia.org/repos/re...
[17:15:56] arlolra: they got axed in T386107
[17:17:13] Ah, ok, thank you
[17:24:47] serviceops, DC-Ops, ops-eqiad, Prod-Kubernetes, and 2 others: Relabel eqiad kubernetes nodes - https://phabricator.wikimedia.org/T383213#10568572 (Jhancock.wm) a:VRiley-WMF
[17:25:36] serviceops, DC-Ops, decommission-hardware, ops-eqiad, SRE: decommission mw[1349-1413] - https://phabricator.wikimedia.org/T375842#10568578 (Jhancock.wm) a:VRiley-WMF
[17:53:16] serviceops, Patch-For-Review, PHP 8.1 support: Update PCRE in PHP 8.1 images to PCRE 10.39 or newer - https://phabricator.wikimedia.org/T386006#10568704 (Scott_French) Great, thank you @MatthewVernon. So, my understanding is that as long as (1) the CI job is tagged `trusted` (I see this requires som...
[17:55:39] serviceops, Patch-For-Review, PHP 8.1 support: Update PCRE in PHP 8.1 images to PCRE 10.39 or newer - https://phabricator.wikimedia.org/T386006#10568713 (MatthewVernon) Yeah, I thought the relevant work had been done, but it turns out the people concerned had cheated a bit and used their own `.gitlab...
[18:25:11] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, and 2 others: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T383341#10568841 (Jhancock.wm) a:Jhancock.wm