[07:48:23] serviceops: kafka-main replacement nodes don't fit kafka-main (storage wise) - https://phabricator.wikimedia.org/T368714#9989171 (dcausse) Perhaps something to consider as well is fine-tuning mirrormaker, I don't think that in the case of the wdqs updater we need the `*.rdf-streaming-updater.mutation*` topic...
[07:53:16] serviceops, Wikidata, wmde-wikidata-tech, Data-Platform-SRE (2024.07.08 - 2024.07.28), Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9989174 (dcausse) We are getting ready to deploy the...
[11:03:27] hey, what's the quick way to check the status of mw-api-ext on k8s?
[11:04:28] see -sre
[11:21:23] volans: wdym? Grafana ? https://grafana.wikimedia.org/d/35WSHOjVk/application-servers-red-k8s?orgId=1&refresh=1m
[11:21:45] claime: thx, nevermind, is just a single client probably, nothing systemic
[11:31:11] serviceops, SRE, Patch-For-Review: mw2420-mw2451 do have unnecessary raid controllers (configured) - https://phabricator.wikimedia.org/T358489#9989694 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=db2972bf-cd24-4ee8-ba43-a5d1d6710956) set by cgoubert@cumin1002 for 7 days, 0:00:00...
[12:45:09] serviceops, WMDE-TechWish-Maintenance, Epic, Maps (Kartotherian), Patch-For-Review: Move Kartotherian to Kubernetes - https://phabricator.wikimedia.org/T216826#9989878 (Jgiannelos) Some context around the current status of maps: * The main problem we are facing is not that the current versio...
[12:46:04] serviceops, MW-on-K8s, Observability-Logging: benthos mw-accesslog-metrics interpolation errors - https://phabricator.wikimedia.org/T370264 (fgiunchedi) NEW
[12:49:59] hello folks
[12:50:22] do we have an overall plan/timing for the migration of the MW docker images to bullseye?
[12:50:38] asking to keep a record of it so I know more or less when to expect it etc.., no pressure :)
[13:54:03] serviceops, ChangeProp, MediaWiki-Core-HTTP-Cache, MediaWiki-Engineering, and 2 others: Reduce the number of resource_change and resource_purge events emitted due to template changes - https://phabricator.wikimedia.org/T369898#9990195 (Krinkle) See also: * (Declined) {T250205} * (Tracking task) {...
[14:09:32] serviceops, Data-Platform-SRE, Dumps-Generation, MW-on-K8s, Release-Engineering-Team: Migrate current-generation dumps to run from our containerized images - https://phabricator.wikimedia.org/T352650#9990428 (Gehel)
[14:11:55] serviceops, ChangeProp, MediaWiki-Core-HTTP-Cache, MediaWiki-Engineering, and 2 others: Reduce the number of resource_change and resource_purge events emitted due to template changes - https://phabricator.wikimedia.org/T369898#9990434 (dr0ptp4kt)
[14:16:08] serviceops, ChangeProp, MediaWiki-Core-HTTP-Cache, MediaWiki-Engineering, and 2 others: Reduce the number of resource_change and resource_purge events emitted due to template changes - https://phabricator.wikimedia.org/T369898#9990444 (dr0ptp4kt)
[14:41:02] elukey: this is a long discussion, I will get back to you, which task would you like to reply on?
[14:43:25] effie: thanks! I think that https://phabricator.wikimedia.org/T362981 would be fine
[14:43:43] sorry https://phabricator.wikimedia.org/T356293
[14:44:42] k
[14:44:57] serviceops, MW-on-K8s, Observability-Logging: benthos mw-accesslog-metrics interpolation errors - https://phabricator.wikimedia.org/T370264#9990550 (kamila) AFAICT, this is only caused by extremely long URLs that result in truncated (and thus invalid) JSON -- see also T368417. These URLs are not "rea...
[14:53:45] claime: 👋 Will you be around to try to deploy changeprop again? We reverted but i think we were overly cautious to something that happened in old deployments of changeprop too.
[14:54:30] serviceops, MW-on-K8s, Observability-Logging: benthos mw-accesslog-metrics interpolation errors - https://phabricator.wikimedia.org/T370264#9990592 (kamila) →Duplicate dup:T368417
[14:57:52] nemo-yiannis: I'm around, I have a meeting in a few minutes but I can keep an eye on things
[14:58:00] ok
[14:58:42] nemo-yiannis: was it a spike in backlog?
[14:58:51] yes
[15:00:28] nemo-yiannis: yeah it appears to have happened both when you deployed and when you reverted
[15:00:44] yeah also last time we deployed around the 11th
[15:03:51] claime do you know if changeprop is active/active or there is only one active cluster ?
[15:04:53] nemo-yiannis: it's active active, each cluster deals with its own subset of message iiuc
[15:06:40] ok
[15:15:06] serviceops, ChangeProp, MediaWiki-Core-HTTP-Cache, MediaWiki-Engineering, and 2 others: Reduce the number of resource_change and resource_purge events emitted due to template changes - https://phabricator.wikimedia.org/T369898#9990664 (Milimetric)
[15:22:37] serviceops, ChangeProp, MediaWiki-Core-HTTP-Cache, MediaWiki-Engineering, and 2 others: Reduce the number of resource_change and resource_purge events emitted due to template changes - https://phabricator.wikimedia.org/T369898#9990692 (daniel)
[15:29:10] serviceops, ChangeProp, MediaWiki-Core-HTTP-Cache, MediaWiki-Engineering, and 2 others: Reduce the number of resource_change and resource_purge events emitted due to template changes - https://phabricator.wikimedia.org/T369898#9990708 (cscott) Another option is to subdivide pages into two categor...
[15:32:34] serviceops, ChangeProp, MediaWiki-Core-HTTP-Cache, MediaWiki-Engineering, and 2 others: Reduce the number of resource_change and resource_purge events emitted due to template changes - https://phabricator.wikimedia.org/T369898#9990749 (Ottomata) > The number of resource_change and resource_purge...
[15:33:54] serviceops, ChangeProp, MediaWiki-Core-HTTP-Cache, MediaWiki-Engineering, and 2 others: Reduce the number of resource_change and resource_purge events emitted due to template changes - https://phabricator.wikimedia.org/T369898#9990753 (cscott) For completeness, another option is the varnish "x-ke...
[16:31:31] serviceops, WMDE-TechWish-Maintenance, Epic, Maps (Kartotherian), Patch-For-Review: Move Kartotherian to Kubernetes - https://phabricator.wikimedia.org/T216826#9991038 (elukey) Thanks a lot for all the detailed info, really useful (I am trying to get up to speed so apologies for all the borin...
[19:19:53] serviceops, Infrastructure-Foundations, Patch-For-Review, Security: Upgrade K8s docker images running in Wikimedia production on Buster to either Bullseye or Bookworm - https://phabricator.wikimedia.org/T368366#9992344 (Jdforrester-WMF)