[07:53:50] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10098761 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by jayme@cumin1002 from kubernetes200...
[07:54:46] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10098762 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by jayme@cumin1002 for host wiki...
[08:46:10] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10098906 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jayme@cumin1002 for host wikikube...
[08:53:24] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T373505 (JMeybohm) NEW
[09:24:19] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T373505#10099044 (Clement_Goubert) →Duplicate dup:T373457
[09:24:20] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T373491#10099045 (Clement_Goubert) →Duplicate dup:T373457
[09:24:37] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T373457#10099040 (Clement_Goubert)
[09:28:41] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T373457#10099072 (Clement_Goubert)
[09:33:23] serviceops, Shellbox, Wikibase-Quality-Constraints, Wikidata, and 4 others: [SW] [WBQC] shellbox-constraints returning 500 on preg_match error - https://phabricator.wikimedia.org/T362084#10099077 (ItamarWMDE) @Lucas_Werkmeister_WMDE We were looking at this in story writing, and we have a few ques...
[12:45:35] serviceops, Content-Transform-Team-WIP, RESTBase Sunsetting, Essential-Work, Patch-For-Review: Failed mobileapps deployment - https://phabricator.wikimedia.org/T373314#10099508 (Jgiannelos) Open→Resolved
[12:54:14] fyi: https://fosstodon.org/@krinkle/113037810411619348 - the video I uploaded yesterday had nearly all its transcodes fail with shellbox errors - they're now requeued but just wanted to give a heads up in case it was a larger issue
[12:54:17] serviceops, MW-on-K8s, TimedMediaHandler, Video: shellbox-video pods being restarted prematurely - https://phabricator.wikimedia.org/T373517 (hnowlan) NEW
[12:58:16] legoktm: thanks for the heads-up, perfect timing for the above :)
[12:59:01] ooh, yw!
[13:00:43] could you give me a link to the video itself for debugging purposes please?
[13:03:22] serviceops, MW-on-K8s, TimedMediaHandler, Video: shellbox-video pods being restarted prematurely - https://phabricator.wikimedia.org/T373517#10099571 (hnowlan) As a stopgap measure we've already [[ https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1067963 | introduced retries to the...
[13:03:51] hnowlan: https://commons.wikimedia.org/wiki/File:Grace_Hopper_-_Future_Possibilities_-_Data,_Hardware,_Software,_and_People.webm it's ~1h30m
[13:04:19] thanks
[13:08:30] Tài khoản Dfyn của mình bị EPIC khóa toàn cầu nhưng tài khoản đó không có sửa đổi gây hại hay bất kỳ sửa đổi giống rối nào? Giờ mình muốn mở khóa cho tài khoản đó thì giờ mình phải làm sao hả bạn??
[13:08:45] [Translation] My Dfyn account has been globally locked by EPIC, but that account has not made any harmful or disruptive edits. How can I go about unlocking this account?
[13:09:56] "Dfyn account" = ''user:Dfyn''
[13:10:16] help!
[13:11:55] serviceops, MW-on-K8s, TimedMediaHandler, Patch-For-Review, Video: shellbox-video pods being restarted prematurely - https://phabricator.wikimedia.org/T373517#10099610 (hnowlan)
[13:12:25] hnowlan: o/ thumbor deployed to staging! When you have a moment lemme know if it looks working, and if it is ok in your opinion to proceed with prod
[13:20:17] elukey: lgtm
[13:23:20] hnowlan: so green light for prod?
[13:27:09] yep!
[13:29:56] all right proceeding :)
[13:36:40] serviceops, Content-Transform-Team, WMDE-TechWish-Maintenance, Epic, and 2 others: Move Kartotherian to Kubernetes - https://phabricator.wikimedia.org/T216826#10099670 (elukey) Hi @MSantos! Ack thanks for the info, will keep it in mind when hopefully upgrading the maps nodes :) Any news about the...
[13:39:50] serviceops, MoveComms-Support, Datacenter-Switchover: MoveComms support for Southward Datacenter Switchover (September 2024) - https://phabricator.wikimedia.org/T371130#10099693 (Trizek-WMF)
[14:11:11] serviceops, Infrastructure-Foundations, Security: Migrate the ownership of Docker images in production-images repo to mailing lists - https://phabricator.wikimedia.org/T373526 (elukey) NEW
[14:17:44] serviceops, Data-Platform-SRE, Infrastructure-Foundations, Machine-Learning-Team, Security: Migrate the ownership of Docker images in production-images repo to mailing lists - https://phabricator.wikimedia.org/T373526#10099799 (elukey)
[14:25:32] serviceops, docker-pkg, Release-Engineering-Team: Attach opencontainers image metadata to docker images - https://phabricator.wikimedia.org/T345070#10099843 (elukey) Created T371549 to move the `Maintainer` field of the production-images repo's control files to team-specific (where possible). Ideally...
[14:27:00] serviceops, Content-Transform-Team-WIP, iOS-app-feature-Performance, RESTBase, and 6 others: PCS caching and pregeneration when restbase is decommissioned - https://phabricator.wikimedia.org/T319365#10099849 (Jgiannelos) Reverted after high error rate: https://gerrit.wikimedia.org/r/c/operations/...
[14:28:56] serviceops, Content-Transform-Team-WIP, iOS-app-feature-Performance, RESTBase, and 6 others: PCS caching and pregeneration when restbase is decommissioned - https://phabricator.wikimedia.org/T319365#10099852 (Jgiannelos) There must be something wrong in the cassandra config and authentication: `...
[14:32:04] I tried to enable cassandra connections on PCS prod but it caused high error rate because of authentication issues: https://phabricator.wikimedia.org/T319365#10099852
[14:33:55] serviceops, Content-Transform-Team-WIP, Data-Persistence, iOS-app-feature-Performance, and 7 others: PCS caching and pregeneration when restbase is decommissioned - https://phabricator.wikimedia.org/T319365#10099858 (Jgiannelos)
[14:40:41] nemo-yiannis: o/ I think that the password is wrong, I checked on a mobileapps' config.yaml pod and it looks not correct (sending the value in pvt)
[14:41:27] maybe we are missing a private change that configures the right password?
[14:55:18] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10099911 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host w...
[14:57:38] (followed up in pvt, the pass was missing from the private config)
[14:59:23] thanks elukey, redeploying now
[15:05:11] serviceops, MW-on-K8s, TimedMediaHandler, Patch-For-Review, Video: shellbox-video pods being restarted prematurely - https://phabricator.wikimedia.org/T373517#10099957 (hnowlan) p:Triage→High
[15:20:34] serviceops, Data-Platform-SRE, Infrastructure-Foundations, Machine-Learning-Team: Migrate the ownership of Docker images in production-images repo to mailing lists - https://phabricator.wikimedia.org/T373526#10100057 (elukey)
[15:31:48] serviceops, Data-Platform-SRE, Infrastructure-Foundations, Machine-Learning-Team, Security: Migrate the ownership of DPE-Owned Docker images in production-images repo to mailing lists - https://phabricator.wikimedia.org/T373534 (bking) NEW
[15:40:51] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100199 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikik...
[15:42:35] Is there any way I can verify that the traffic originating from RESTBase to mobileapps sets the RESTBase/WMF user agent?
[15:43:40] We (should) serve uncached responses to RESTBase only (by checking the user agent) but from a quick grep on the tls proxy logs I don't see any references, which sounds suspicious
[15:56:32] I just verified it doesn't :/ we need to find another way to filter out the restbase traffic on pcs
[16:04:39] serviceops, Content-Transform-Team-WIP, Data-Persistence, iOS-app-feature-Performance, and 7 others: PCS caching and pregeneration when restbase is decommissioned - https://phabricator.wikimedia.org/T319365#10100276 (Jgiannelos) Yet another revert after: https://gerrit.wikimedia.org/r/c/operation...
[16:05:01] serviceops, Patch-For-Review: Prepare PHP 8.1 production images - https://phabricator.wikimedia.org/T372602#10100278 (Scott_French) The 8.1-based production images are ready to go and seem to work per some basic local smoke tests. I'm going to hold off on merging and building for the moment, until I'm a...
[16:07:14] nemo-yiannis: what agent does it use?
[16:07:32] it doesn't set it
[16:08:05] or better, it's not very consistent and when it comes to PCS requests it doesn't
[16:09:13] we should either patch restbase to enforce the user agent in the PCS requests or find another way to detect restbase traffic
[16:14:17] setting the agent correctly seems like a good option, if that's feasible
[16:14:20] hnowlan: when we route traffic from rest gateway to pcs (when we switchover) can we inject a header so instead of filtering out restbase requests to enforce cache in the requests originating from rest-gateway?
[16:15:24] either way, setting the right user agent in restbase wouldn't hurt
[16:15:40] we could yeah. What do you mean instead of filtering out restbase request?
[16:19:12] serviceops, MoveComms-Support, Datacenter-Switchover: MoveComms support for Southward Datacenter Switchover (September 2024) - https://phabricator.wikimedia.org/T371130#10100324 (Trizek-WMF) p:Triage→Medium a:Trizek-WMF
[16:36:11] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100393 (ops-monitoring-bot) Cookbook cookbooks.sre.k8s.pool-depool-node started by cgoubert@cumin1002 pool fo...
[16:41:08] hnowlan: Instead of PCS filtering out RESTBase requests from cached content.
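The two detection options raised in the discussion above (checking the User-Agent that RESTBase sends, versus having the REST gateway inject a marker header) can be sketched roughly as follows. This is only an illustration under stated assumptions: the marker string `restbase` and the header name `x-from-rest-gateway` are hypothetical and do not come from PCS, RESTBase, or the gateway configuration, and the real service is not written in Python.

```python
from typing import Dict, Mapping

# Hypothetical values chosen for illustration only; they are not taken
# from the PCS, RESTBase, or rest-gateway configuration.
RESTBASE_UA_MARKER = "restbase"
GATEWAY_HEADER = "x-from-rest-gateway"


def _lower_keys(headers: Mapping[str, str]) -> Dict[str, str]:
    """HTTP header names are case-insensitive, so normalise them first."""
    return {name.lower(): value for name, value in headers.items()}


def is_restbase_by_user_agent(headers: Mapping[str, str]) -> bool:
    """Option 1: treat a request as RESTBase traffic if its User-Agent says so.

    A request with no User-Agent at all is classified as non-RESTBase,
    which is the failure mode described in the log.
    """
    user_agent = _lower_keys(headers).get("user-agent", "")
    return RESTBASE_UA_MARKER in user_agent.lower()


def is_from_gateway(headers: Mapping[str, str]) -> bool:
    """Option 2: detect gateway-routed requests via an injected marker header."""
    return GATEWAY_HEADER in _lower_keys(headers)
```

With option 1, a request that omits the User-Agent is indistinguishable from ordinary traffic, so the check silently stops matching; option 2 moves the signal to a header the routing layer controls instead of relying on the client.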
[16:41:28] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T373457#10100422 (Jhancock.wm) Open→Resolved a:Jhancock.wm
[16:41:32] From a quick look it's very easy to add the user agents either way. I will send a patch tomorrow.
[16:41:54] serviceops, Release-Engineering-Team: scap fails to create /srv/mediawiki-staging/php-1.43.0-wmf.20/cache/l10n - https://phabricator.wikimedia.org/T373425#10100427 (dancy) Resolved→Open p:Unbreak!→Medium
[17:18:56] serviceops, Release-Engineering-Team: scap fails to create /srv/mediawiki-staging/php-1.43.0-wmf.20/cache/l10n - https://phabricator.wikimedia.org/T373425#10100604 (thcipriani)
[17:19:08] serviceops, Release-Engineering-Team: Investigate scap fails to create /srv/mediawiki-staging/php-1.43.0-wmf.20/cache/l10n - https://phabricator.wikimedia.org/T373425#10100606 (thcipriani)
[17:21:05] serviceops, Release-Engineering-Team: Investigate scap fails to create /srv/mediawiki-staging/php-1.43.0-wmf.20/cache/l10n - https://phabricator.wikimedia.org/T373425#10100611 (thcipriani) Question for investigation: - What were the steps leading to this state? -- How did you check out the code for this...
[17:24:57] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100621 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host...
[17:29:59] serviceops, Data-Platform-SRE, Infrastructure-Foundations, Machine-Learning-Team, Patch-For-Review: Migrate the ownership of Docker images in production-images repo to mailing lists - https://phabricator.wikimedia.org/T373526#10100630 (akosiaris) I see one problem with this approach. Teams ch...
[18:04:39] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100785 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host wiki...
[18:10:55] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100821 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by akosiaris@cumin1002 from mw2294 to...
[18:15:14] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100846 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host...
[18:59:58] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10100972 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host wiki...
[21:21:44] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10101297 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by swfrench@cumin2002 from kubernetes...
[21:23:44] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10101298 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by swfrench@cumin2002 for host w...
[22:11:53] serviceops, Infrastructure-Foundations, netops, SRE, Patch-For-Review: Re-IP wikikube servers in codfw row A/B moving to per-rack subnets - https://phabricator.wikimedia.org/T372878#10101358 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by swfrench@cumin2002 for host wikik...