[08:58:19] serviceops, Observability-Logging, Patch-For-Review: Logs from containers sometimes not visible in logstash - https://phabricator.wikimedia.org/T357616#9716817 (JMeybohm) Updating (restarting) rsyslog in wikikube codfw again led to quite a bump in events followed by a `(LogstashKafkaConsumerLag) firi...
[09:31:46] serviceops, MW-on-K8s, SRE, Traffic: Move 100% of external traffic to Kubernetes (excluding Votewiki and Commons) - https://phabricator.wikimedia.org/T362323#9717001 (jijiki)
[09:35:43] serviceops, MoveComms-Support, MW-on-K8s, SRE, Traffic: Move 100% of external traffic to Kubernetes (excluding Votewiki and Commons) - https://phabricator.wikimedia.org/T362323#9717014 (jijiki)
[09:53:44] serviceops, MW-on-K8s, Release-Engineering-Team, Scap: Find a way to stage updated PHP packages on wikikube - https://phabricator.wikimedia.org/T362628 (Clement_Goubert) NEW
[11:00:32] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube control planes to hardware nodes - https://phabricator.wikimedia.org/T353464#9717371 (JMeybohm) p:Medium→High
[11:02:24] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Allow to address Kubernets API servers from NetworkPolicy - https://phabricator.wikimedia.org/T287491#9717362 (JMeybohm) p:Low→High We should prioritize {T353464} because of {T358936}. Would be nice to have this done to lower conf...
[11:03:36] serviceops, Prod-Kubernetes, SRE: Kubernetes apiserver probe failures on restart - https://phabricator.wikimedia.org/T358936#9717390 (JMeybohm) We had this happening again in eqiad today because of a (planned) apiserver safe restart. We'll prioritize {T353464} to give more resources to wikikube apise...
[11:04:40] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube control planes to hardware nodes - https://phabricator.wikimedia.org/T353464#9717393 (JMeybohm) a:JMeybohm
[11:04:53] serviceops, Prod-Kubernetes, SRE: Kubernetes apiserver probe failures on restart - https://phabricator.wikimedia.org/T358936#9717389 (jcrespo) Hi, today we had another occurrence of this. We didn't consider it a full-blown incident due to the no direct (or almost no) impact on users during the servic...
[11:17:57] serviceops, SRE, Epic, Patch-For-Review: Phase out cergen for ServiceOps services - https://phabricator.wikimedia.org/T360636#9717440 (hnowlan)
[11:18:07] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube control planes to hardware nodes - https://phabricator.wikimedia.org/T353464#9717437 (JMeybohm)
[12:12:39] serviceops, Shellbox, Wikibase-Quality-Constraints, Wikidata, and 4 others: [SW] [WBQC] shellbox-constraints returning 500 on preg_match error - https://phabricator.wikimedia.org/T362084#9717675 (Lucas_Werkmeister_WMDE) **Prio Notes:** | Impact Area | Affected | |----------...
[12:28:43] serviceops, Prod-Kubernetes, Patch-For-Review: PodSecurityPolicies will be deprecated with Kubernetes 1.21 - https://phabricator.wikimedia.org/T273507#9717753 (JMeybohm)
[12:29:48] serviceops, Prod-Kubernetes, Patch-For-Review: PodSecurityPolicies will be deprecated with Kubernetes 1.21 - https://phabricator.wikimedia.org/T273507#9717773 (JMeybohm)
[13:25:27] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube control planes to hardware nodes - https://phabricator.wikimedia.org/T353464#9718014 (JMeybohm)
[13:28:22] serviceops, Prod-Kubernetes, Kubernetes: Migrate wikikube control planes to hardware nodes - https://phabricator.wikimedia.org/T353464#9718024 (JMeybohm)
[13:54:22] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Rename X-Wikimedia-Debug k8s-experimental option - https://phabricator.wikimedia.org/T362662 (Clement_Goubert) NEW
[13:54:30] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Rename X-Wikimedia-Debug k8s-experimental option - https://phabricator.wikimedia.org/T362662#9718155 (Clement_Goubert) p:Triage→Low
[14:00:26] serviceops, Infrastructure-Foundations, Release-Engineering-Team, Patch-For-Review: Deprecate buster-backports - https://phabricator.wikimedia.org/T362518#9718199 (Clement_Goubert) The following images fail `docker-reporter` checks because they haven't been rebuilt on top of the new buster base i...
[14:32:25] swfrench-wmf: akosiaris: I've been making a few teeny patches to conftool to bring things a bit more up to date, most notably is https://gerrit.wikimedia.org/r/c/operations/software/conftool/+/1020224
[14:34:01] if you are interested in learning more about conftool internals swfrench-wmf I very much support your involvement 😅
[14:42:32] cdanis: thanks! I've spent a fair bit of time digging around, but it would be good to put the accumulated opinions to use :)
[14:42:39] serviceops, Observability-Logging, Prod-Kubernetes, Kubernetes, Patch-For-Review: Enable audit logging for kube-apiserver - https://phabricator.wikimedia.org/T290020#9718463 (Jelto)
[14:55:37] serviceops, VPS-project-Codesearch, Patch-For-Review: Add docker production images repo to codesearch - https://phabricator.wikimedia.org/T362567#9718568 (Scott_French) operations/docker-images/production-images is now available in codesearch.
[14:55:40] serviceops, VPS-project-Codesearch, Patch-For-Review: Add docker production images repo to codesearch - https://phabricator.wikimedia.org/T362567#9718569 (Scott_French) Open→Resolved
[14:58:00] swfrench-wmf: totally. I don't have strong opinions fwiw aside from that porting the conftool regression tests to be v3 compatible would be a good start
[15:00:02] thanks for fixing this - I ran into similar problems when putting together some mildly ad-hoc integration tests for etcd-mirror
[15:28:04] yeah I have a simple feature request for dbctl and so far there's just been three mildly-aggravating patches to fix up basic stuff
[15:32:14] serviceops: Provide nodejs20 base images for production - https://phabricator.wikimedia.org/T362681 (Jdforrester-WMF) NEW
[16:18:39] serviceops, Infrastructure-Foundations, Release-Engineering-Team, Patch-For-Review: Deprecate buster-backports - https://phabricator.wikimedia.org/T362518#9719270 (Jdforrester-WMF) This has also broken building CI images. Will have to migrate them to bullseye immediately, I suppose.
[18:26:17] serviceops, Prod-Kubernetes, Patch-For-Review: PodSecurityPolicies will be deprecated with Kubernetes 1.21 - https://phabricator.wikimedia.org/T273507#9719999 (JMeybohm)
[20:20:57] serviceops, Release-Engineering-Team, Scap: scap should optionally display helmfile diffs for review - https://phabricator.wikimedia.org/T362717 (Scott_French) NEW
[20:55:43] serviceops, Release-Engineering-Team, Scap: scap should optionally display helmfile diffs for review - https://phabricator.wikimedia.org/T362717#9720501 (Scott_French) p:Triage→Low I have a proof-of-concept patch for this, the result of which is shown in P60688 (using train-dev). This front-...
[21:56:13] serviceops, Release-Engineering-Team, Scap: scap should optionally display helmfile diffs for review - https://phabricator.wikimedia.org/T362717#9720685 (RLazarus) If we're really worried about that race condition, is it plausible to do this? - Run `helmfile diff` early, right after getting the scap...