[00:45:35] serviceops, Patch-For-Review: Migrate production Shellbox variants to PHP 8.1 - https://phabricator.wikimedia.org/T377038#10608297 (Scott_French) Many thanks to @jijiki for moving the shellbox-media migration forward today. I've confirmed again that there are no further `Shellbox server returned incorrec...
[02:52:10] serviceops, Data-Engineering, Data-Engineering-Radar, Data-Platform-SRE, Event-Platform: Make eventstreams-internal available to WMF staff without an ssh tunnel - https://phabricator.wikimedia.org/T348763#10608493 (Ottomata)
[05:33:58] serviceops, function-orchestrator, Abstract Wikipedia team (25Q3 (Jan–Mar)), Wikifunctions Improve performance: increase CPU and Node heap limit? - https://phabricator.wikimedia.org/T385859#10608550 (ecarg) TY @akosiaris! > What generates this ? The reason I am asking is cause if this stanza is...
[05:40:24] serviceops, function-orchestrator, Abstract Wikipedia team (25Q3 (Jan–Mar)), Wikifunctions Improve performance: increase CPU and Node heap limit? - https://phabricator.wikimedia.org/T385859#10608557 (ecarg) noting here that Orch CPU limit was upped today: https://gerrit.wikimedia.org/r/c/operatio...
[12:40:37] serviceops, Citoid, Editing QA, Editing-team, and 4 others: Switchover plan from restbase to api gateway for Citoid - https://phabricator.wikimedia.org/T361576#10609666 (Mvolz)
[12:41:47] serviceops, Citoid, Editing QA, Editing-team, and 4 others: Switchover plan from restbase to api gateway for Citoid - https://phabricator.wikimedia.org/T361576#10609667 (Mvolz) >>! In T361576#10520404, @Mvolz wrote: >>>! In T361576#10519300, @Ryasmeen wrote: >>>>! In T361576#10515182, @Mvolz wrot...
[14:19:48] serviceops, collaboration-services, Data-Platform-SRE, Prod-Kubernetes, Kubernetes: Fix installed key in dependend helmfile releases - https://phabricator.wikimedia.org/T387837#10609997 (JMeybohm) After trying a bit to do the right thing (with `conditions:` and `installed:` or `installedTempl...
[14:19:49] the poor deploy2002 has a full /srv
[14:19:49] PROBLEM - Disk space on deploy2002 is CRITICAL: DISK CRITICAL - free space: /srv 15981 MB (5% inode=71%)
[14:19:50] :/
[14:20:02] while running the backport window so I am pausing
[14:23:26] claime: ^ wdyt would be the best option for dealing with this?
[14:28:14] I have solved it by removing some obsolete git-fat objects from /srv/deployment/analytics/refinery/.git/fat
[14:28:27] but it seems the machine has been around 90% usage at /srv for some time
[14:28:46] which boils down to Docker images piling up on that host. We talked about it yesterday during the releng meeting
[14:28:54] :)
[14:29:08] the alarm should clear soon. It is at 86% usage now
[14:29:52] also the alarm is confusing: 5% inode=71% looked like we had 71% inode usage :)
[14:29:57] but that is free space
[14:29:58] anyway :)
[14:32:25] oh, that's annoying... looking
[14:32:54] and I assume this'll solve itself because of the "this week was particularly image-rich" thing
[14:51:26] kamila_: sorry, was afk. garbage collecting older images, currently gc is 7 days but to clear space we could go down to 3/4
[15:05:01] serviceops, Kubernetes: Add pod ip address blocks to staging-eqiad - https://phabricator.wikimedia.org/T386232#10610240 (cmooney) >>! In T386232#10557269, @akosiaris wrote: >> Not sure if there is anything speaking against that, but I think it's more future proof than adding another /24 (cc @akosiaris)....
[15:28:03] serviceops, SRE Observability: chartmuseum prometheus metrics cardinality spam - https://phabricator.wikimedia.org/T386808#10610333 (fgiunchedi) Open→Resolved >>! In T386808#10606601, @kamila wrote: >> Maybe let's drop `url` label for `404` CM metrics for now, it seems like a good enough solu...
[15:35:21] serviceops, Deployments, Release-Engineering-Team: httpbb appserver test breaks deployment of the week due to a timeout parsing page - https://phabricator.wikimedia.org/T360867#10610414 (hashar) Note, it is still happening ` 15:22:29 Executing check 'check_testservers_k8s-2_of_2' 15:23:14 Check '...
[15:40:24] serviceops, Page Content Service, Code-Health-Objective, Content-Transform-Team (Work In Progress): Rollout more wikis: week 2 - https://phabricator.wikimedia.org/T388140 (Jgiannelos) NEW
[15:41:13] serviceops, Page Content Service, Code-Health-Objective, Content-Transform-Team (Work In Progress): Rollout more wikis: week 2 - https://phabricator.wikimedia.org/T388140#10610438 (Jgiannelos)
[16:04:34] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2242-2329 - https://phabricator.wikimedia.org/T384970#10610561 (Jhancock.wm) we got these in this week. working on getting everything racked. @Clement_Goubert what's the numerical range for wikikube-ctrl and wikikube-worker for this...
[16:19:23] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2242-2329 - https://phabricator.wikimedia.org/T384970#10610651 (Clement_Goubert) For wikikube-worker: wikikube-worker2244-2329 For wikikube-ctrl: wikikube-ctrl2004-2005
[16:19:39] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2242-2329 - https://phabricator.wikimedia.org/T384970#10610655 (Clement_Goubert)
[16:20:00] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2244-2329, wikikube-ctrl2004-2005 - https://phabricator.wikimedia.org/T384970#10610658 (Clement_Goubert)
[16:25:27] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2244-2329, wikikube-ctrl2004-2005 - https://phabricator.wikimedia.org/T384970#10610699 (Clement_Goubert)
[16:25:33] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2248-2333, wikikube-ctrl2004-2005 - https://phabricator.wikimedia.org/T384970#10610700 (Clement_Goubert)
[16:26:34] serviceops, DC-Ops, ops-codfw: Q3:rack/setup/install wikikube-worker2248-2333, wikikube-ctrl2004-2005 - https://phabricator.wikimedia.org/T384970#10610704 (Clement_Goubert) Sorry for all the in-place changes, I forgot we still had some servers to reinstall/rename in codfw. This range should be good.
[16:52:19] serviceops, function-orchestrator, Abstract Wikipedia team (25Q3 (Jan–Mar)), Wikifunctions Improve performance: increase CPU and Node heap limit? - https://phabricator.wikimedia.org/T385859#10610811 (akosiaris) I should note that per [Grafana](https://grafana.wikimedia.org/d/FEkiKFqVk/wikifunctions...
[17:02:48] serviceops, Patch-For-Review: Migrate production Shellbox variants to PHP 8.1 - https://phabricator.wikimedia.org/T377038#10610847 (Scott_French) I spent a bit of time looking at the PHP source code yesterday evening to try to understand what's happening with `post_max_size` - specifically, why we see th...
[17:20:34] serviceops, DC-Ops, ops-codfw, SRE: Q3:rack/setup/install wikikube-worker2248-2331, wikikube-ctrl2004-2005 - https://phabricator.wikimedia.org/T384970#10610899 (Jhancock.wm)
[18:52:42] Hi, I think we forgot to add a rule for changeprop to invalidate summaries on PCS: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1125207
[18:52:56] Subbu has already reviewed, can I go ahead, merge and deploy?
[19:38:41] nemo-yiannis: +1'd
[19:39:19] go ahead
[23:03:25] serviceops, Patch-For-Review: Migrate production Shellbox variants to PHP 8.1 - https://phabricator.wikimedia.org/T377038#10611943 (Scott_French) Thanks for flagging @Reedy! So, I'm confident this is unrelated to the migration of shellbox-constraints to 8.1 (for reference, that started on the 28th of Ja...