[07:45:52] serviceops, Patch-For-Review, Wikimedia-production-error: registry2004 sometimes reporting: too many open files problems - https://phabricator.wikimedia.org/T366481#9912644 (Clement_Goubert) Open→Resolved a:Clement_Goubert Since the bump to `10240` open files resulted in one last spik...
[08:07:23] serviceops, SRE, Data Products (Data Products Sprint 15), Patch-For-Review, Service-deployment-requests: Commons Impact Metrics AQS 2.0 Deployment to Staging and Production - https://phabricator.wikimedia.org/T361835#9912675 (SGupta-WMF) @Scott_French Here is the tag on main branch https://gi...
[08:13:20] serviceops, Content-Transform-Team-WIP, RESTBase, RESTBase Sunsetting, and 2 others: Allow connections from PCS to eventgate - https://phabricator.wikimedia.org/T368052#9912689 (akosiaris) Patch merged and deployed.
[08:42:00] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9912742 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-ctrl2002.co...
[10:29:06] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9912975 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-ctrl2002.codfw....
[10:34:02] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9912990 (kamila) Open→Resolved Done, thanks a lot for the help @Papaul !
[12:22:16] serviceops, Infrastructure-Foundations, Data-Platform-SRE (2024.06.17 - 2024.07.07), Patch-For-Review: Create a helm chart for the cloudnativepg postgresql operator - https://phabricator.wikimedia.org/T364797#9913200 (akosiaris)
[12:54:05] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9913305 (Papaul) @kamila anytime
[13:19:29] serviceops, Wikidata, wmde-wikidata-tech, Data-Platform-SRE (2024.06.17 - 2024.07.07), Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9913394 (Gehel) Closing as the decision is made....
[13:20:15] serviceops, Wikidata, wmde-wikidata-tech, Data-Platform-SRE (2024.06.17 - 2024.07.07), Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9913384 (Gehel) Open→Resolved a:Gehel
[13:26:32] serviceops, Wikidata, wmde-wikidata-tech, Data-Platform-SRE (2024.06.17 - 2024.07.07), Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9913417 (Gehel) Resolved→Open
[13:51:38] serviceops, Prod-Kubernetes, Patch-For-Review: Update all helm modules and charts to be compatible with the restricted PSS - https://phabricator.wikimedia.org/T362978#9913566 (JMeybohm)
[13:51:56] serviceops, Prod-Kubernetes, Patch-For-Review: PodSecurityPolicies will be deprecated with Kubernetes 1.21 - https://phabricator.wikimedia.org/T273507#9913567 (JMeybohm)
[14:01:30] hey folks! Qq - is it fine to restart the dragonfly supernodes one at a time?
[14:01:36] not now, next week is fine
[14:02:07] I was wondering about the procedure, so far the only thing that I can think of is to preferably not do it during mw deployments
[14:17:51] elukey: yeah, that
[14:18:18] mw-deployments will probably survive as well but will def. take longer
[14:31:56] super :)
[14:40:46] <_joe_> elukey: we can do it in the mw infra window on monday
[14:41:05] <_joe_> one of the mw infra windows
[14:41:22] <_joe_> which are basically "SRE wants to do stuff that might interfere with deploying mw on k8s"
[14:43:47] ack makes sense
[17:23:01] serviceops, DC-Ops, ops-codfw, Traffic: lvs2011 Memory failure on slot B1 - https://phabricator.wikimedia.org/T368165 (BCornwall) NEW
[17:23:40] serviceops, DC-Ops, ops-codfw, Traffic: lvs2011 Memory failure on slot B1 - https://phabricator.wikimedia.org/T368165#9914057 (BCornwall) p:Triage→High
[17:24:44] serviceops, DC-Ops, ops-codfw, Traffic: lvs2011 Memory failure on slot B1 - https://phabricator.wikimedia.org/T368165#9914063 (BCornwall)
[20:38:00] serviceops, Cassandra: mediawiki: migrate from image-suggestion to data-gateway - https://phabricator.wikimedia.org/T368096#9914419 (Eevans)
[20:52:48] serviceops, Cassandra: mediawiki: migrate from image-suggestion to data-gateway - https://phabricator.wikimedia.org/T368096#9914454 (Eevans) >>! In T368096#9911334, @Scott_French wrote: > @Eevans - Can you think of other blockers before mediawiki migrates? > > The one thing that comes to mind is "comple...