[08:31:54] serviceops, conftool: Requestctl sync writes unchanged objects - https://phabricator.wikimedia.org/T375059#10159688 (Joe) Open→Resolved
[10:04:16] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 4 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#10159959 (MoritzMuehlenhoff)
[10:11:01] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 4 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#10159988 (fgiunchedi)
[10:17:53] hey folks
[10:18:12] going to deploy https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1073802 to remove the old poolcounters from the mw net policies
[10:59:05] Can I ask if conf2006 can survive a brief interruption to network?
[10:59:24] It's in codfw rack D8, which we're working in later (T373105)
[11:03:23] topranks: let me check, it most likely shouldn't be a problem
[11:04:51] ty
[12:27:16] topranks: I will help you out, we will need to do a few restarts
[12:27:45] effie: ok - thanks for helping out
[12:27:55] let me know if I can be of any assistance :)
[12:28:13] hehe cheers
[14:16:40] serviceops: wikikube-worker1001 failed to docker pull on two consecutive deployments - https://phabricator.wikimedia.org/T375201#10160989 (taavi) this probably has something to do with it? /cc @akosiaris ` The last Puppet run was at Mon Sep 16 09:50:38 UTC 2024 (4583 minutes ago). Puppet is disabled. alex fo...
[14:19:30] serviceops, Infrastructure-Foundations, Prod-Kubernetes, Kubernetes, Patch-For-Review: Race condition in iptables rules during puppet runs on k8s nodes - https://phabricator.wikimedia.org/T374366#10161004 (JMeybohm) Fixing ferm_status.py is still not enough. When puppet corrects an on disk fe...
[14:24:02] serviceops: wikikube-worker1001 failed to docker pull on two consecutive deployments - https://phabricator.wikimedia.org/T375201#10161048 (JMeybohm) Open→Resolved a: JMeybohm I think I know what this was about... re-enabled and ran puppet, which should have fixed firewall rules for SSH from depl...
[14:26:00] serviceops: wikikube-worker1001 failed to docker pull on two consecutive deployments - https://phabricator.wikimedia.org/T375201#10161075 (Lucas_Werkmeister_WMDE) Thanks for unfooing the server ;P
[14:32:35] Also relating to T373105: is anyone available to depool the k8s hosts for today?
[14:36:34] serviceops: wikikube-worker1001 failed to docker pull on two consecutive deployments - https://phabricator.wikimedia.org/T375201#10161156 (akosiaris) Thanks for fixing it @JMeybohm. For posterity's sake, the "fooing" part was related to T374366 and trying to figure out the race condition(s).
[14:43:11] topranks: yeah, I can do that
[14:44:01] swfrench-wmf: awesome
[14:44:06] not too many today, thankfully
[14:44:31] yeah, today looks like an easy one :)
[15:09:50] hnowlan: o/ do you think it is ok for me to deploy thumbor to update the network policies? (removing the old poolcounter ips)
[15:17:23] elukey: sounds good to me
[15:28:08] serviceops, Deployments, Release-Engineering-Team: sync-testservers-k8s takes 4 minutes when deploying a mediawiki-config change - https://phabricator.wikimedia.org/T374907#10161530 (akosiaris) >>! In T374907#10153807, @akosiaris wrote: >>>! In T374907#10153052, @hashar wrote: > >> For the OCI image...
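A minimal sketch of what the depool requested at 14:32 typically looks like with confctl, run from a cluster-management host; the selector tags (dc/cluster/service) are assumptions for illustration, not necessarily the exact values used for these nodes:

    # Depool the codfw kubernetes workers ahead of the rack D8 work (T373105).
    # The dc/cluster/service tags below are hypothetical placeholders.
    sudo confctl select 'dc=codfw,cluster=kubernetes,service=kubesvc' set/pooled=no

    # Verify the resulting state before the network maintenance starts.
    sudo confctl select 'dc=codfw,cluster=kubernetes,service=kubesvc' get

Repooling afterwards (the 16:38 message below) is the inverse: set/pooled=yes with the same selector.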
[15:43:16] topranks: k8s nodes are depooled
[15:43:58] serviceops, Deployments, Release-Engineering-Team: sync-testservers-k8s takes 4 minutes when deploying a mediawiki-config change - https://phabricator.wikimedia.org/T374907#10161586 (hashar) Awesome, thank you @akosiaris. Given the step is removed, I guess there is no need to investigate why it took...
[15:46:17] serviceops: Migrate poolcounter hosts to bookworm - https://phabricator.wikimedia.org/T332015#10161591 (elukey) Last step remaining is to decommission the old VMs!
[15:54:17] swfrench-wmf: thank you :)
[15:55:16] topranks: are you starting this before 16:00Z?
[15:55:43] effie: no, not before, and there is no massive rush after
[15:55:57] so take your time if you have more stuff to do - just let me know when it's ok to proceed
[16:05:59] topranks: we have actionables after you are done, let's move the convo to -sre, and ping the traffic folks too as we will be restarting a few things on their turf
[16:06:17] effie: ok, let's do that
[16:38:58] k8s nodes are repooled
[16:50:13] serviceops, MW-on-K8s, TimedMediaHandler, Patch-For-Review, Video: shellbox-video pods being restarted prematurely - https://phabricator.wikimedia.org/T373517#10161968 (hnowlan) Just to note, I've been testing by forcing a reencode of [[ https://test.wikipedia.org/wiki/File:CC_1916_10_02_TheP...
[17:50:29] serviceops, MW-on-K8s, Scap, Patch-For-Review: Evaluate the performance improvements brought in by prefetching MW images on WikiKube hosts - https://phabricator.wikimedia.org/T366778#10162246 (dancy) Deployed with scap 4.104.0
[19:35:02] serviceops, Datacenter-Switchover: Verify our current wikikube capacity (in both DCs) can handle all our traffic - https://phabricator.wikimedia.org/T371273#10162512 (Scott_French) Alright, well that was pleasantly uneventful. Cutting out the periods while traffic was shifting, the deployments in eqiad...
[22:00:26] serviceops: Turn up PHP 8.1 Shellbox deployments - https://phabricator.wikimedia.org/T375243 (Scott_French) NEW
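For reference, the "re-enabled and ran puppet" fix from the T375201 thread above (14:24) would look roughly like this on the affected worker, assuming the standard puppet wrapper scripts; the disable-reason string shown is hypothetical:

    # Re-enable the agent; the message must match the one used when disabling
    # (the reason shown here is hypothetical).
    sudo enable-puppet "debugging T374366"

    # Run the agent so ferm regenerates the firewall rules, restoring SSH
    # from the deployment hosts so docker pulls can succeed again.
    sudo run-puppet-agent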