[00:29:37] serviceops, DC-Ops, ops-eqiad, SRE: hw troubleshooting: firmware upgrade for mw1359.eqiad.wmnet, mw1364.eqiad.wmnet, mw1365.eqiad.wmnet, mw1412.eqiad.wmnet - https://phabricator.wikimedia.org/T367766#9901830 (Jclark-ctr) @clement_goubert did you need just idrac updated we can do that easily. B...
[05:32:12] serviceops, MW-on-K8s, TimedMediaHandler, Patch-For-Review, Video: Port videoscaling to kubernetes - https://phabricator.wikimedia.org/T355292#9902070 (Joe) As an additional complexity to the system, it would be great if we could batch requests. In practice, I think we'd want to only batch re...
[08:27:45] serviceops, Data-Platform-SRE, Wikidata, wmde-wikidata-tech, Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9902307 (akosiaris) Yeah, I confirm. The older hosts in the clusters, `kafka-m...
[08:44:34] serviceops, Data-Platform-SRE, Wikidata, wmde-wikidata-tech, Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9902362 (dcausse) >>! In T367510#9902307, @akosiaris wrote: > I do have one qu...
[08:46:22] serviceops, Data-Platform-SRE, Wikidata, wmde-wikidata-tech, Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9902381 (akosiaris) >>! In T367510#9902362, @dcausse wrote: >>>! In T367510#99...
[08:48:41] serviceops, Data-Platform-SRE, Wikidata, wmde-wikidata-tech, Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9902394 (dcausse)
[08:55:47] serviceops, Wikidata, wmde-wikidata-tech, Data-Platform-SRE (2024.06.17 - 2024.07.07), Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9902415 (Gehel)
[09:19:43] serviceops, DC-Ops, ops-eqiad, SRE: hw troubleshooting: firmware upgrade for mw1359.eqiad.wmnet, mw1364.eqiad.wmnet, mw1365.eqiad.wmnet, mw1412.eqiad.wmnet - https://phabricator.wikimedia.org/T367766#9902528 (Clement_Goubert) Yes, idrac should be enough, thank you.
[09:50:20] serviceops, DC-Ops, ops-codfw, SRE: hw troubleshooting: management and main interface down for mw2321.codfw.wmnet - https://phabricator.wikimedia.org/T367702#9902628 (Clement_Goubert) Yes, it is only for the `docker_pull_k8s` step, for which failures are not critical unless a lot of hosts fail it...
[09:57:09] serviceops, Release-Engineering-Team, Scap: Use conftool to build scap's kubernetes_workers host list - https://phabricator.wikimedia.org/T367862 (Clement_Goubert) NEW
[10:04:41] serviceops, Release-Engineering-Team, Scap: Use conftool to build scap's kubernetes_workers host list - https://phabricator.wikimedia.org/T367862#9902664 (Clement_Goubert) {T366778} is looking towards removing this step completely
[10:42:36] serviceops, SRE, Data Products (Data Products Sprint 15), Patch-For-Review, Service-deployment-requests: Commons Impact Metrics AQS 2.0 Deployment to Staging and Production - https://phabricator.wikimedia.org/T361835#9902761 (SGupta-WMF) @Scott_French I am waiting for final go ahead from QA ....
[11:06:24] serviceops, Traffic, Patch-For-Review, Release Pipeline (Blubber), Release-Engineering-Team (Priority Backlog 📥): Remove blubberoid LVS/k8s service - https://phabricator.wikimedia.org/T365742#9902842 (JMeybohm)
[11:13:16] I'm going to start moving sessionstore in staging to use envoy tls - afaict nothing is hitting it but just mentioning it (cc urandom)
[11:13:33] diffs for prod look like it'll be unaffected
[11:16:12] serviceops, MoveComms-Support, MW-on-K8s, SRE, and 2 others: Move 100% of external traffic to Kubernetes - https://phabricator.wikimedia.org/T362323#9902871 (Clement_Goubert)
[11:30:31] claime, akosiaris: being cautious here on adding those IPv6 dns records
[11:30:35] script in dry-run mode is going to add these:
[11:30:36] https://phabricator.wikimedia.org/P65153
[11:30:38] seems ok?
[11:30:43] looking
[11:31:06] thanks
[11:31:12] I did the wikikube-worker ones already
[11:31:18] I was about to ask :)
[11:31:22] Thanks <3
[11:31:46] looks ok to me
[11:32:48] cool yep, I log in operations when I run the dns cookbook to create them
[11:33:03] fwiw this is the script https://github.com/topranks/random_wmf/blob/main/netbox_scripts/add_v6_dns.py
[11:35:15] Nice
[11:37:22] topranks: Does it make sense to update wikitech? https://wikitech.wikimedia.org/wiki/DNS/Netbox#Multiple_hosts points to a phab task with a script for the old api
[11:37:25] oh good call, yeah I'll have a look and update
[11:38:00] usually any quick-scripts like that we try not to have in wikitech, should be cookbooks or something but I guess one or two is ok
[11:38:37] claime: what do you mean by old api?
[11:39:49] volans: it's a link to a phab comment where you ran some custom-code in nbshell
[11:40:00] which looks different to what I did with pynetbox
[11:40:56] ah different lib
[11:40:58] ok
[11:41:53] yes I run nbshell and that's what I've done every time I was pinged about mass-IP changes
[11:43:32] claime: one is the Django-native way directly on the netbox server, the other is going via the REST API it exposes, so just slightly different syntax
[11:49:47] Does anyone know how we get new fields added to the logstash mapping here so they can be indexed? https://logstash.wikimedia.org/app/management/opensearch-dashboards/indexPatterns/patterns/logstash-*#/?_a=h@9293420 ?
[11:51:10] mvolz: you'd have to ask observability I think
[11:53:03] tnx! err.. do you know where they live :P?
[11:53:56] #wikimedia-observability :P
[11:54:30] 👍️
[11:57:08] serviceops, MW-on-K8s, Observability-Metrics, Patch-For-Review, SRE Observability (FY2023/2024-Q4): Create a per-release deployment of statsd-exporter for mw-on-k8s - https://phabricator.wikimedia.org/T365265#9902980 (Clement_Goubert) `statsd-exporter` is now deployed on all #mw-on-k8s deploy...
[12:01:04] serviceops, Release-Engineering-Team, Scap, Patch-For-Review: Use conftool to build scap's kubernetes_workers host list - https://phabricator.wikimedia.org/T367862#9902987 (Clement_Goubert) Open→In progress p:Triage→Medium
[12:07:31] FYI just pushed those dns changes with the cookbook
[12:07:54] serviceops, MW-on-K8s, MediaWiki-Platform-Team (Radar): Allow php-fpm to read environment variables from the system, not just from the fcgi request - https://phabricator.wikimedia.org/T326705#9903029 (Clement_Goubert) Open→Resolved a:Clement_Goubert Now done using `php.envvars` in the...
[12:10:39] topranks: <3 thank you very much
[12:54:10] serviceops, MW-on-K8s, MediaWiki-Platform-Team (Radar), Patch-For-Review: mcrouter daemonset on mw-on-k8s - https://phabricator.wikimedia.org/T346690#9903173 (jijiki) We attempted to rollout on eqiad, where mediawiki would be using mcrouter's cluster IP directly, but we started seeing many errors...
[13:33:14] serviceops, MW-on-K8s, TimedMediaHandler, Patch-For-Review, Video: Port videoscaling to kubernetes - https://phabricator.wikimedia.org/T355292#9903308 (Joe) I'm still trying to think of alternatives that wouldn't use the k8s api. One possibility would be: # With every deployment, we also depl...
[14:01:28] serviceops, Prod-Kubernetes, Kubernetes: Set AppArmor profile via SecurityContext rather than annotations (k8s >=1.30) - https://phabricator.wikimedia.org/T367880 (JMeybohm) NEW
[14:01:30] serviceops, Prod-Kubernetes, Kubernetes: Set AppArmor profile via SecurityContext rather than annotations (k8s >=1.30) - https://phabricator.wikimedia.org/T367880#9903453 (JMeybohm) p:Triage→Low
[14:47:23] serviceops, MW-on-K8s, Scap, Patch-For-Review: Evaluate the performance improvements brought in by prefetching MW images on WikiKube hosts - https://phabricator.wikimedia.org/T366778#9903647 (akosiaris)
[15:01:22] serviceops, MoveComms-Support, MW-on-K8s, SRE, and 2 others: Move 100% of external traffic to Kubernetes - https://phabricator.wikimedia.org/T362323#9903731 (Clement_Goubert)
[15:03:44] serviceops, MoveComms-Support, MW-on-K8s, SRE, and 2 others: Move 100% of external traffic to Kubernetes - https://phabricator.wikimedia.org/T362323#9903729 (Clement_Goubert) {F55438321} 🚀🚀🚀
[15:08:30] serviceops, MoveComms-Support, MW-on-K8s, SRE, and 2 others: Move 100% of external traffic to Kubernetes - https://phabricator.wikimedia.org/T362323#9903765 (Ladsgroup) {meme, src=itshappening}
[15:38:04] serviceops, docker-pkg, Release-Engineering-Team: Attach opencontainers image metadata to docker images - https://phabricator.wikimedia.org/T345070#9903864 (elukey)