[06:18:49] serviceops, Infrastructure-Foundations, Release-Engineering-Team, Patch-For-Review: Deprecate buster-backports - https://phabricator.wikimedia.org/T362518#9721307 (hashar) >>! In T362518#9719270, @Jdforrester-WMF wrote: > This has also broken building CI images. Will have to migrate them to bulls...
[07:01:16] serviceops, Infrastructure-Foundations, Release-Engineering-Team, Patch-For-Review: Deprecate buster-backports - https://phabricator.wikimedia.org/T362518#9721419 (MoritzMuehlenhoff) >>! In T362518#9719270, @Jdforrester-WMF wrote: > This has also broken building CI images. Will have to migrate th...
[08:46:40] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9721680 (MoritzMuehlenhoff)
[10:04:37] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9721866 (MoritzMuehlenhoff)
[11:22:59] serviceops, Sustainability (Incident Followup): 2024-04-17 mw-* went down in eqiad - https://phabricator.wikimedia.org/T362766 (jcrespo) NEW
[11:33:10] serviceops, Sustainability (Incident Followup): 2024-04-17 mw-* went down in eqiad - https://phabricator.wikimedia.org/T362766#9722127 (JMeybohm) coredns related changes https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1020778 https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+...
[11:34:01] serviceops, Sustainability (Incident Followup): 2024-04-17 mw-* went down in eqiad - https://phabricator.wikimedia.org/T362766#9722139 (Clement_Goubert) The change was rolled back in eqiad, and eqiad was repooled around 10:45. A terminating dot was added to the DNS name in codfw to avoid a recursive requ...
[11:52:27] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9722192 (MoritzMuehlenhoff)
[13:14:12] serviceops, Infrastructure-Foundations, Release-Engineering-Team, Patch-For-Review: Deprecate buster-backports - https://phabricator.wikimedia.org/T362518#9722373 (Jdforrester-WMF) >>! In T362518#9721307, @hashar wrote: >>>! In T362518#9719270, @Jdforrester-WMF wrote: >> This has also broken buil...
[13:44:16] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9722479 (MoritzMuehlenhoff)
[13:53:11] serviceops, Sustainability (Incident Followup): 2024-04-17 mw-* went down in eqiad - https://phabricator.wikimedia.org/T362766#9722521 (Clement_Goubert) As an aside, and contributing to the time to recovery, we observed the apache container getting oomkilled, we strongly suppose because of the backpressu...
[14:02:59] serviceops, MW-on-K8s, Release-Engineering-Team, Scap: Find a way to stage updated PHP packages on wikikube - https://phabricator.wikimedia.org/T362628#9722547 (MoritzMuehlenhoff) This isn't just limited to updating PHP, but also extends to the full OS stack underneath (libs used by PHP etc). Whe...
[14:13:54] serviceops: Package latest version of prometheus-memcached-exporter (v0.14.2) - https://phabricator.wikimedia.org/T350807#9722594 (Andrew) Resolved→Open
[14:13:58] serviceops: Package latest version of prometheus-memcached-exporter (v0.14.2) - https://phabricator.wikimedia.org/T350807#9722593 (Andrew) Howdy! Coincidentally, I just did a dist-upgrade that pulled in this new package. The 0.14 package installs its binary here: ` /usr/bin/memcached_exporter ` Whereas t...
[14:44:20] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9722701 (MoritzMuehlenhoff)
[15:07:06] serviceops, MW-on-K8s, Release-Engineering-Team, Scap: Find a way to stage updated PHP packages on wikikube - https://phabricator.wikimedia.org/T362628#9722794 (akosiaris) > * We have staging base images and staging service images updated daily based on what is in Debian and apt.wikimedia.org In...
[15:17:22] serviceops, SRE, Data Products (Data Products Sprint 12), Service-deployment-requests: Commons Impact Metrics AQS 2.0 Deployment to Staging and Production - https://phabricator.wikimedia.org/T361835#9722869 (WDoranWMF)
[15:21:07] serviceops, Data-Engineering, MediaWiki-Engineering, WMF-JobQueue, and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745#9722942 (Ottomata) > see the CAP theorem C != eventual-C. Eventual Consistency + AP is fea...
[15:47:19] serviceops, Machine-Learning-Team, MW-on-K8s, SRE, Patch-For-Review: Migrate ml-services to mw-api-int - https://phabricator.wikimedia.org/T362316#9723080 (elukey) Added some thoughts to T353622#9723070, I found out a big can of worms while testing staging :) The upgrade is more complex than...
[15:54:15] serviceops, Data Products (Data Products Sprint 12): Service Ops Review of Metrics Platform Configuration Management UI - https://phabricator.wikimedia.org/T358577#9723273 (VirginiaPoundstone)
[15:55:10] claime: the move to mw-api-int-ro unveiled a lot of istio horrors, but I have a path forward, hopefully with few days of work we'll migrate
[15:58:42] elukey: I'm sorry <3
[16:01:32] claime: nono I learned new things that always bugged me, and now things make more sense.
The only thing that always destroys me is that there are pitfalls and bugs (like the one mentioned) completely messing up the cards on the table
[16:02:00] anyway, service mesh are great!
[16:04:57] elukey: so basically the existing working behaviour is a bug?
[16:05:02] iiuc
[16:05:30] to the best of my understanding, yes
[16:05:53] I always had an explanation for that corner case but I opened the other task to investigate why
[16:06:16] now it makes more sense, but it was sneaky to find
[16:06:26] no kidding
[16:06:53] if you want to laugh the bug before it that took time to figure out was https://phabricator.wikimedia.org/T353622#9415171
[16:07:59] $deity...
[16:08:34] * elukey nods
[16:11:01] have a nice rest of the day folks! logging off
[16:11:25] o/
[16:39:16] serviceops, Data-Engineering, MediaWiki-Engineering, WMF-JobQueue, and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745#9723515 (akosiaris) >>! In T249745#9722942, @Ottomata wrote: >> see the CAP theorem > C !=...
[16:46:50] serviceops, Patch-For-Review: etcdmirror does not recover from a cleared waitIndex - https://phabricator.wikimedia.org/T358636#9723569 (Scott_French) After asking around a bit, it seems that temporarily directing codfw-associated clients (confd, navtiming, pybal) to eqiad should be relatively straightfor...
[17:31:28] serviceops, Data-Engineering, MediaWiki-Engineering, WMF-JobQueue, and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745#9723704 (Ottomata) >> For replicating state changes (T120242) [...] > Why though? Why is 99...
[20:06:22] serviceops, Data-Engineering, MediaWiki-Engineering, WMF-JobQueue, and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745#9724249 (Ladsgroup) >>!
In T249745#9723704, @Ottomata wrote: >>> For replicating state chan...
[21:40:05] serviceops, Growth-Team, Growth-Team-Filtering, MW-on-K8s, Notifications: Broken (empty) cross-wiki notification when using $wgLocalHTTPProxy (e.g. on Kubernetes) - https://phabricator.wikimedia.org/T223413#9724514 (Novem_Linguae) Resolved→Open Got this again today. I recently turned...
[21:41:47] serviceops, Release-Engineering-Team, Scap: scap should optionally display helmfile diffs for review - https://phabricator.wikimedia.org/T362717#9724523 (Scott_French) Thanks, Reuven! I think it should be feasible to do something like that, yes. From a quick glance, it looks like the helm-diff plugi...
[22:37:57] serviceops, Release-Engineering-Team, Scap: scap should optionally display helmfile diffs for review - https://phabricator.wikimedia.org/T362717#9724675 (Scott_French) Hmmm ... I just realized that "ignoring" the image tag change (in the image-build case) when comparing the initial and pre-apply roun...
[22:42:42] serviceops, Release-Engineering-Team, Scap: scap should optionally display helmfile diffs for review - https://phabricator.wikimedia.org/T362717#9724679 (RLazarus) That sounds reasonable! Note for the future that `helm diff` has a `--suppress-output-line-regex` which does exactly what you'd like it t...