[01:37:32] serviceops, Datacenter-Switchover, Patch-For-Review: Southward Datacenter Switchover (September 2024) - https://phabricator.wikimedia.org/T370962#10178173 (Scott_French) Cross-posting / paraphrasing from IRC: This afternoon, we received some outbound port utilization alerts for cr2-codfw, due to mod...
[02:13:39] serviceops, MW-on-K8s: MW image version for maintenance scripts - https://phabricator.wikimedia.org/T359127#10178182 (RLazarus)
[02:14:41] serviceops, MW-on-K8s: MW image version for maintenance scripts - https://phabricator.wikimedia.org/T359127#10178179 (RLazarus) Open→Resolved The image version is now copied from the mw-web deployment.
[02:37:17] serviceops, MW-on-K8s: Allow cleaning up specific mwscript-k8s runs - https://phabricator.wikimedia.org/T369143#10178184 (RLazarus) Open→Declined Thanks! In general you shouldn't need to do this, even if the job was a mistake. Kubernetes cleans up the job automatically a week after it termina...
[03:10:53] serviceops, MW-on-K8s: Show more useful information when mwscript-k8s fails to launch - https://phabricator.wikimedia.org/T369142#10178194 (RLazarus) a:RLazarus
[07:21:07] serviceops, Patch-For-Review: Migrate docker registry hosts to bookworm - https://phabricator.wikimedia.org/T332016#10178314 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by elukey@cumin1002 for host registry1004.eqiad.wmnet with OS bookworm
[07:32:46] serviceops, Prod-Kubernetes, Kubernetes: Rename wikikube worker nodes during OS reimage - https://phabricator.wikimedia.org/T365571#10178340 (JMeybohm)
[08:10:12] serviceops, Patch-For-Review: Migrate docker registry hosts to bookworm - https://phabricator.wikimedia.org/T332016#10178428 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin1002 for host registry1004.eqiad.wmnet with OS bookworm completed: - registry1004 (**WARN**) - Do...
[08:49:13] serviceops, Patch-For-Review: Migrate docker registry hosts to bookworm - https://phabricator.wikimedia.org/T332016#10178543 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by elukey@cumin1002 for host registry2004.codfw.wmnet with OS bookworm
[09:28:00] serviceops, Patch-For-Review: Migrate docker registry hosts to bookworm - https://phabricator.wikimedia.org/T332016#10178636 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by elukey@cumin1002 for host registry2004.codfw.wmnet with OS bookworm completed: - registry2004 (**WARN**) - Do...
[09:38:50] hey folks, I noticed alerts for the TLS certs parsoid.svc.codfw.wmnet and parsoid.svc.eqiad.wmnet
[09:39:09] pretty sure those are old ones that just need to be purged/destroyed in the puppet ca, just wanted to double check
[09:41:20] serviceops: Migrate docker registry hosts to bookworm - https://phabricator.wikimedia.org/T332016#10178715 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by elukey@cumin1002 for hosts: `registry1003.eqiad.wmnet` - registry1003.eqiad.wmnet (**PASS**) - Downtimed host on Icinga/Alertmanager...
[09:51:57] serviceops: Migrate docker registry hosts to bookworm - https://phabricator.wikimedia.org/T332016#10178741 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by elukey@cumin1002 for hosts: `registry2003.codfw.wmnet` - registry2003.codfw.wmnet (**PASS**) - Downtimed host on Icinga/Alertmanager...
[10:27:14] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 4 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#10178834 (MoritzMuehlenhoff)
[10:29:07] serviceops, LPL Essential, MinT, Community Wishlist (Translations), Community-Tech (Jackal (not a fox) Fox (Sept 23 - Oct 4)): Caching service request for MinT - https://phabricator.wikimedia.org/T370755#10178842 (jijiki) @santhosh given the low traffic and the low storage needs, we could sta...
[11:50:44] serviceops, Datacenter-Switchover, Patch-For-Review: Southward Datacenter Switchover (September 2024) - https://phabricator.wikimedia.org/T370962#10178985 (jcrespo)
[12:06:23] serviceops, LPL Essential, MinT, Community Wishlist (Translations), Community-Tech (Jackal (not a fox) Fox (Sept 23 - Oct 4)): Caching service request for MinT - https://phabricator.wikimedia.org/T370755#10179050 (Pginer-WMF) >>! In T370755#10178842, @jijiki wrote: > @santhosh given the low t...
[12:07:10] serviceops: Migrate docker registry hosts to bookworm - https://phabricator.wikimedia.org/T332016#10179054 (elukey) Open→Resolved a:elukey
[12:09:27] serviceops, LPL Essential, MinT, Community Wishlist (Translations), Community-Tech (Jackal (not a fox) Fox (Sept 23 - Oct 4)): Caching service request for MinT - https://phabricator.wikimedia.org/T370755#10179058 (akosiaris) >>! In T370755#10179050, @Pginer-WMF wrote: >>>! In T370755#10178842...
[12:14:46] serviceops, [DEPRECATED] wdwb-tech, Citoid, Content-Transform-Team, and 9 others: Migrate node-based services in production to node18 - https://phabricator.wikimedia.org/T349118#10179095 (Mvolz)
[13:25:19] serviceops, Citoid: citoid having stability issues - https://phabricator.wikimedia.org/T330768#10179457 (akosiaris) >>! In T330768#10173616, @Jdforrester-WMF wrote: > Whilst poking around open alerts, I noticed that both [[https://alerts.wikimedia.org/?q=%40state%3Dactive&q=namespace%3Dcitoid|citoid]] an...
[14:02:20] serviceops, MW-on-K8s, wikitech.wikimedia.org: Communication for Wikitech/Wikimedia Developer Account migration - https://phabricator.wikimedia.org/T373615#10179685 (joanna_borun)
[14:11:19] serviceops, Data Products, Epic: SDS 2.1.1 Evaluations of 3rd part Experimentation Platform by SRE Service Ops - https://phabricator.wikimedia.org/T369174#10179732 (WDoranWMF) @Legoktm Really sorry, I only just saw this - so many phab notifications. Let me check, @VirginiaPoundstone and @odimitrije...
[14:49:54] serviceops, Data Products, Data-Platform-SRE, Dumps-Generation, and 2 others: Migrate current-generation dumps to run from our containerized images - https://phabricator.wikimedia.org/T352650#10179932 (dr0ptp4kt) @Joe checking - is Q2 FY 24-25 still looking good for Service Ops containerization w...
[18:03:10] serviceops, Datacenter-Switchover, Patch-For-Review: Southward Datacenter Switchover (September 2024) - https://phabricator.wikimedia.org/T370962#10180876 (Scott_French) Quick explanation of T370962#10180675 (stale scap dsh groups): When running a test `scap sync-world` after switching the active de...