[08:16:26] serviceops, CirrusSearch, Discovery-Search, Data-Platform-SRE (2024.03.25 - 2024.04.14): Requesting permission to enable kafka log compaction for page_rerender on kafka-main - https://phabricator.wikimedia.org/T354794#9660514 (brouberol) a:brouberol
[08:32:52] serviceops, CirrusSearch, Discovery-Search, Data-Platform-SRE (2024.03.25 - 2024.04.14): Requesting permission to enable kafka log compaction for page_rerender on kafka-main - https://phabricator.wikimedia.org/T354794#9660558 (brouberol) ` brouberol@kafka-main2001:~$ kafka configs --entity-type t...
[08:32:56] serviceops, CirrusSearch, Discovery-Search, Data-Platform-SRE (2024.03.25 - 2024.04.14): Requesting permission to enable kafka log compaction for page_rerender on kafka-main - https://phabricator.wikimedia.org/T354794#9660559 (brouberol) Stalled→In progress
[08:40:38] serviceops, Deployments, Release-Engineering-Team: httpbb appserver test breaks deployment of the week due to a timeout parsing page - https://phabricator.wikimedia.org/T360867#9660563 (hashar)
[08:41:36] serviceops, CirrusSearch, Discovery-Search, Data-Platform-SRE (2024.03.25 - 2024.04.14): Requesting permission to enable kafka log compaction for page_rerender on kafka-main - https://phabricator.wikimedia.org/T354794#9660564 (brouberol) ` brouberol@kafka-main2001:~$ kafka configs --entity-type t...
[08:58:35] serviceops, CirrusSearch, Discovery-Search, Data-Platform-SRE (2024.03.25 - 2024.04.14): Requesting permission to enable kafka log compaction for page_rerender on kafka-main - https://phabricator.wikimedia.org/T354794#9660595 (brouberol) In progress→Resolved
[08:59:35] serviceops, CirrusSearch, Discovery-Search, Data-Platform-SRE (2024.03.25 - 2024.04.14): Requesting permission to enable kafka log compaction for page_rerender on kafka-main - https://phabricator.wikimedia.org/T354794#9660594 (brouberol) All done!
It had quite the impact on `eqiad.medi...
[09:01:05] brouberol: thanks! <3
[09:01:49] my pleasure!
[10:23:31] serviceops, SRE, Epic: Phase out cergen for ServiceOps services - https://phabricator.wikimedia.org/T360636#9660756 (jcrespo)
[11:13:41] serviceops, RESTBase Sunsetting, API Platform (RESTbase Deprecation Roadmap), Epic: Survey RESTBase services and find which ones accesses Parsoid via RESTBase - https://phabricator.wikimedia.org/T333536#9660977 (MSantos) Open→Resolved a:MSantos All listed services here are...
[11:18:44] serviceops, Dumps-Generation, MediaWiki-Platform-Team: Migrate WMF production from PHP 7.4 to PHP 8.1 - https://phabricator.wikimedia.org/T319432#9661011 (Jdforrester-WMF)
[11:20:51] serviceops, MediaWiki-Platform-Team, Epic: Migrate Wikimedia production from PHP 8.1 to PHP 8.3 - https://phabricator.wikimedia.org/T360995 (Jdforrester-WMF) NEW
[11:21:50] serviceops, MediaWiki-Platform-Team, Epic: Migrate Wikimedia production from PHP 8.1 to PHP 8.3 - https://phabricator.wikimedia.org/T360995#9661035 (Jdforrester-WMF)
[11:23:16] serviceops, MediaWiki-Platform-Team, Epic: Migrate Wikimedia production from PHP 8.1 to PHP 8.3 - https://phabricator.wikimedia.org/T360995#9661038 (Jdforrester-WMF) Open→Stalled Created for structural blocking; my understanding is that this won't start in earnest for several months.
[11:23:25] serviceops, Dumps-Generation, MediaWiki-Platform-Team: Migrate WMF production from PHP 7.4 to PHP 8.1 - https://phabricator.wikimedia.org/T319432#9661029 (Jdforrester-WMF)
[13:32:30] serviceops, Prod-Kubernetes, Data-Platform-SRE (2024.03.25 - 2024.04.14), Kubernetes, Patch-For-Review: Add redis (rdb) instances to external-services - https://phabricator.wikimedia.org/T360612#9661492 (JMeybohm) Open→Resolved a:JMeybohm
[14:27:25] serviceops, Prod-Kubernetes, Data-Platform-SRE (2024.03.25 - 2024.04.14), Kubernetes, Patch-For-Review: Improve how we address outside k8s infrastructure from within charts (e.g. network policies) - https://phabricator.wikimedia.org/T331894#9661728 (JMeybohm) Deployed v0.0.3 of the chart incl...
[14:29:01] Amir1: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1014101 is really cool
[14:30:37] my worry is that things might get exhausted too fast for circuit breaker to kick in. Fingers crossed. Worst case, we will measure and change
[14:31:42] cdanis: This is now possible thanks to a full rewrite of LoadMonitor https://phabricator.wikimedia.org/T314020 (one of this Q's OKRs) which was in part possible thanks to Tim's building of a simulation infrastructure where we can simulate all sorts of stuff
[14:31:54] It's sooo next level awesome
[14:32:13] (e.g. https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventSimulator/+/1013160)
[14:34:26] (the LM rewrite in itself helps a lot with all sorts of failure scenarios)
[14:45:00] Amir1: yeah I took a quick peek, it is very very cool
[14:45:28] if you have ideas on how to make this better, I'm all ears!
[14:55:17] serviceops, MW-on-K8s, RESTBase, SRE, Patch-For-Review: Migrate restbase from mwapi-async to mw-api-int - https://phabricator.wikimedia.org/T358213#9661855 (Clement_Goubert)
[14:59:25] serviceops, MW-on-K8s, RESTBase, SRE, Patch-For-Review: Migrate restbase from mwapi-async to mw-api-int - https://phabricator.wikimedia.org/T358213#9661885 (Clement_Goubert) 10% of `RESTbase`'s backend `mwapi` requests are now made to `mw-api-int` {F43443512}
[15:24:58] serviceops, iPoid-Service, Observability-Logging, Patch-For-Review: Logs from containers sometimes not visible in logstash - https://phabricator.wikimedia.org/T357616#9661969 (JMeybohm) [26.03.24 14:55] (KubernetesRsyslogDown) firing: rsyslog on mw1483:9105 is missing kubernetes logs...
[15:36:18] Hey folks, I drafted a plan for the registry nodes in https://phabricator.wikimedia.org/T360637#9662002
[15:36:23] serviceops, Machine-Learning-Team, Patch-For-Review: Bump memory for registry[12]00[34] VMs - https://phabricator.wikimedia.org/T360637#9662002 (elukey) High level plan for codfw: * Book a mw infrastructure maintenance in the deployments wikitech page. * When the time comes, disable puppet on regist...
[15:36:43] I am thinking of booking a slot for tomorrow's EU mw infra maintenance window (https://wikitech.wikimedia.org/wiki/Deployments)
[15:36:53] lemme know if you are ok or if anything looks weird
[15:40:44] elukey: sgtm
[15:40:52] thank you for taking care of it <3
[15:44:39] thank you folks for the patience!
[15:44:49] I know it is a big ask but it will unblock us
[16:04:06] serviceops, Data-Platform-SRE, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate charts to Calico Network Policies - https://phabricator.wikimedia.org/T359423#9662136 (JMeybohm)
[16:06:39] serviceops, Data-Platform-SRE, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate charts to Calico Network Policies - https://phabricator.wikimedia.org/T359423#9662147 (brouberol)
[16:18:34] serviceops, collaboration-services, Data-Persistence, DC-Ops, and 4 others: ☂️ Northward Datacentre Switchover (March 2024) - https://phabricator.wikimedia.org/T357547#9662223 (jijiki)
[16:28:21] TFW build-production-images wants to build a new java image... 🐌
[16:31:51] there's a reason why java's logo is a cup of coffee. time to get one. :P
[16:32:21] I want my new 🐙 images already, darn it ;p
[18:00:18] serviceops, Prod-Kubernetes: PodSecurityPolicies will be deprecated with Kubernetes 1.21 - https://phabricator.wikimedia.org/T273507#9662694 (elukey) During the SIG meeting we wondered what is the feedback that a deployer would get from PSS vs VAP+CEL, we knew the latter (namely the Deployment/Pod/etc.....
[18:12:05] serviceops, Prod-Kubernetes: PodSecurityPolicies will be deprecated with Kubernetes 1.21 - https://phabricator.wikimedia.org/T273507#9662741 (elukey) Do the PSS give the same early feedback even with Deployment objects? Tested this random example: ` apiVersion: apps/v1 kind: Deployment metadata: name...
[22:27:00] serviceops, Data-Platform-SRE, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate charts to Calico Network Policies - https://phabricator.wikimedia.org/T359423#9663975 (BTullis)
[22:31:10] serviceops, Data-Platform-SRE, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate charts to Calico Network Policies - https://phabricator.wikimedia.org/T359423#9663990 (BTullis) I needed to start work on the DataHub migration because SSO broke due to the switch of IDP servers: https://sal...
[22:58:19] serviceops, Data-Platform-SRE, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate charts to Calico Network Policies - https://phabricator.wikimedia.org/T359423#9664036 (BTullis)
[23:08:34] serviceops, Data-Platform-SRE, Prod-Kubernetes, Kubernetes, Patch-For-Review: Migrate charts to Calico Network Policies - https://phabricator.wikimedia.org/T359423#9664049 (BTullis)