[00:05:37] serviceops, Release-Engineering-Team, Scap: scap should optionally display helmfile diffs for review - https://phabricator.wikimedia.org/T362717#9724858 (Scott_French) Thanks, Reuven. Once that's available, it would be good to figure out whether it's straightforward to make the regex precise enough t...
[01:05:07] serviceops: Migrate etcd::tlsproxy Nginx certs and etcd itself to PKI - https://phabricator.wikimedia.org/T352245#9724964 (Scott_French) > Is there any value in creating a new intermediate, separate from "etcd" used by clusters supporting k8s? The main benefit here would be decoupling between rather differe...
[09:40:33] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9725610 (MoritzMuehlenhoff)
[09:45:57] hi folks, I'm trying to build a new version of images/jaeger/builder with bookworm, and getting this error
[09:46:00] 2024-04-18 09:04:26,531 [docker-pkg-build] INFO - W: Failed to fetch http://security.debian.org/debian-security/dists/bookworm-security/InRelease Could not co
[09:46:03] nnect to security.debian.org:80 (151.101.130.132), connection timed out Could not connect to security.debian.org:80 (151.101.66.132), connection timed out Coul
[09:46:06] d not connect to security.debian.org:80 (151.101.2.132), connection timed out Could not connect to security.debian.org:80 (151.101.194.132), connection timed o
[09:46:09] ut
[09:46:11] gah, sorry for the spam, anyways the proxy is set via set_proxy
[09:46:22] as per https://wikitech.wikimedia.org/wiki/Kubernetes/Images#Base_images
[09:46:46] and I'm building on build2001 with docker-pkg build images/jaeger/builder/ (running with my user) did you run into this before ?
[10:10:55] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9725707 (MoritzMuehlenhoff)
[10:10:58] godog: sudo -i and then set_proxy perhaps?
[10:12:32] akosiaris: thank you, trying now
[10:13:25] nope :(
[10:13:26] 2024-04-18 10:12:40,481 [docker-pkg-build] INFO - W: Failed to fetch http://security.debian.org/debian-security/dists/bookworm-security/InRelease Could
[10:13:41] I'm sure I'm holding it wrong, otherwise I can't explain how the weekly rebuild works
[10:14:54] it's calling /srv/deployment/docker-pkg/venv/bin/docker-pkg -c /etc/production-images/config.yaml build images, right ?
[10:14:59] I tend to use /usr/local/bin/build-production-images for this
[10:17:36] mmhh what I'm doing is 'docker-pkg build images/jaeger/builder' from my home's production-images checkout, I'm testing a change before sending a code review
[10:18:27] now I'm doubting that's what I'm supposed to do heh
[10:20:31] off to lunch, will try again later
[10:37:59] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725777 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host mw2302.codfw.wmnet with OS bullseye
[10:38:22] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725778 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host mw2303.codfw.wmnet with OS bullseye
[10:38:54] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725779 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host mw2304.codfw.wmnet with OS bullseye
[10:39:10] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725780 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host mw2332.codfw.wmnet with OS bullseye
[10:39:36] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725783 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host mw2333.codfw.wmnet with OS bullseye
[10:40:07] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725786 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host mw2334.codfw.wmnet with OS bullseye
[11:16:03] serviceops, MW-on-K8s, Release-Engineering-Team, Scap: Find a way to stage updated PHP packages on wikikube - https://phabricator.wikimedia.org/T362628#9725897 (MoritzMuehlenhoff) >>! In T362628#9722794, @akosiaris wrote: >> * If a new image found to be okay, have some script/option/tool to promo...
[11:16:24] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725898 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host mw2303.codfw.wmnet with OS bullseye completed: - mw23...
[11:20:21] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725904 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host mw2332.codfw.wmnet with OS bullseye completed: - mw23...
[11:23:29] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725922 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host mw2334.codfw.wmnet with OS bullseye completed: - mw23...
[11:25:16] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725930 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host mw2304.codfw.wmnet with OS bullseye completed: - mw23...
[11:30:30] serviceops, Data-Engineering, MediaWiki-Engineering, WMF-JobQueue, and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745#9725953 (Ottomata) > search index not getting updated in 0.001% of edits Search is probabl...
[11:30:44] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725971 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host mw2302.codfw.wmnet with OS bullseye completed: - mw23...
[11:35:47] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9725988 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host mw2333.codfw.wmnet with OS bullseye completed: - mw23...
[12:17:11] serviceops, Sustainability (Incident Followup): 2024-04-17 mw-* went down in eqiad - https://phabricator.wikimedia.org/T362766#9726077 (JMeybohm)
[12:18:25] akosiaris: ok I got I think, I can confirm that proxy env variables are not honored, what is honored is http_proxy in config.yaml, which is in /etc/production-images/config.yaml and not in config.yaml in production-images.git for good reason. then what tripped me up is that docker-pkg v4 supports ~/.config/docker-pkg.yaml though that's not deployed on build2001
[12:18:49] so what I did is temporarily change the config.yaml from production-images I'm using to add http_proxy
[12:23:14] serviceops, Data-Engineering, MediaWiki-Engineering, WMF-JobQueue, and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745#9726100 (Ladsgroup) >>! In T249745#9725953, @Ottomata wrote: >> search index not getting up...
[12:29:08] godog: ah, glad you sorted it out.
[12:31:36] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9726125 (MoritzMuehlenhoff)
[12:31:59] akosiaris: indeed, I'll yank 'set_proxy' from https://wikitech.wikimedia.org/wiki/Kubernetes/Images#Base_images does that make sense ?
[12:35:49] godog: that's Base images, not production images
[12:36:11] you were building a production image, base images don't use docker-pkg but debuerreotype
[12:36:13] serviceops, Data-Engineering, MediaWiki-Engineering, WMF-JobQueue, and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745#9726141 (Ottomata) Replied at T120242#9726131
[12:36:38] akosiaris: doh of course, nevermind
[13:50:20] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9726525 (MoritzMuehlenhoff)
[14:21:21] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9726682 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host mw1355.eqiad.wmnet with OS bullseye
[14:21:54] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9726687 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host mw1480.eqiad.wmnet with OS bullseye
[14:22:28] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9726688 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host mw1481.eqiad.wmnet with OS bullseye
[14:23:08] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9726690 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host mw1487.eqiad.wmnet with OS bullseye
[14:46:35] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9726780 (Clement_Goubert) I abandoned the CR to move more eqiad api_appservers because it would leave only 15, 5 of them canaries. We still have a bit more ma...
[14:56:06] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9726816 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host mw1481.eqiad.wmnet with OS bullseye completed: - mw14...
[14:58:27] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9726818 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host mw1487.eqiad.wmnet with OS bullseye completed: - mw14...
[15:02:23] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9726825 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host mw1355.eqiad.wmnet with OS bullseye completed: - mw13...
[15:03:58] serviceops, MW-on-K8s, Patch-For-Review: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9726829 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host mw1480.eqiad.wmnet with OS bullseye completed: - mw14...
[15:37:19] serviceops, Data-Engineering, MediaWiki-Engineering, WMF-JobQueue, and 2 others: Could not enqueue jobs: "Unable to deliver all events: 503: Service Unavailable" - https://phabricator.wikimedia.org/T249745#9726999 (akosiaris) >>! In T249745#9723704, @Ottomata wrote: >>> For replicating state chan...
[15:43:41] serviceops: Provide nodejs20 base images for production - https://phabricator.wikimedia.org/T362681#9727028 (akosiaris) nodejs20 isn't even on trixie/sid right now https://packages.debian.org/trixie/nodejs, https://packages.debian.org/sid/nodejs but only in experimental. I am adding @MoritzMuehlenhoff to ad...
[18:27:47] serviceops, ChangeProp, collaboration-services, Infrastructure-Foundations, and 10 others: Figure out a plan to move forward with regarding Redis License changes - https://phabricator.wikimedia.org/T360596#9727771 (brennen)
[19:29:31] serviceops, Infrastructure-Foundations, Release-Engineering-Team, Patch-For-Review: Deprecate buster-backports - https://phabricator.wikimedia.org/T362518#9728020 (dancy)
[19:35:43] serviceops, Infrastructure-Foundations, Release-Engineering-Team, Patch-For-Review: Deprecate buster-backports - https://phabricator.wikimedia.org/T362518#9728031 (dancy) Nothing that I'm blocked on scap development/testing in train-dev due to `docker-registry.wikimedia.org/mediawiki-httpd:latest...
[23:04:50] serviceops, MW-on-K8s, MediaWiki-Platform-Team (Radar): Allow php-fpm to read environment variables from the system, not just from the fcgi request - https://phabricator.wikimedia.org/T326705#9728394 (Krinkle)
[23:33:56] serviceops, ops-codfw, SRE: Degraded RAID on mw2382 - https://phabricator.wikimedia.org/T362938#9728413 (Dzahn)
[23:36:01] serviceops, ops-codfw, SRE: Degraded RAID on mw2382 - https://phabricator.wikimedia.org/T362938#9728418 (Dzahn) ` mw2382 is kubernetes::worker mw2382 is a Kubernetes worker node (kubernetes::worker) Bare Metal host on site codfw and rack A3 `