[00:30:36] serviceops, MediaWiki-Platform-Team (Radar): Enable extstore to a subset of memcached servers (experiment) - https://phabricator.wikimedia.org/T352885#9955407 (Krinkle) We discussed this in the MwEng-SvcOps meeting (13 June 2024). Extstore was enabled on a few hosts in the DC. Some stats differed, but...
[08:41:33] serviceops, Wikifeeds: Wikifeeds' tls proxy cpu usage heavily increased in April - https://phabricator.wikimedia.org/T368238#9955846 (elukey) To keep archives happy - this is the result after some days: {F56234788} I'd like to lower down the concurrency again to see if we get more benefits.
[09:58:26] elukey: I can test that today
[09:59:11] hnowlan: <3
[09:59:41] I have also the new api/rest gateways in staging with the new envoy images
[09:59:52] if you have time, otherwise next week, no rush!
[10:20:54] serviceops, Wikifeeds, Patch-For-Review: Wikifeeds' tls proxy cpu usage heavily increased in April - https://phabricator.wikimedia.org/T368238#9956249 (akosiaris) >>! In T368238#9955846, @elukey wrote: > To keep archives happy - this is the result after some days: > > {F56234788} > > I'd like to lo...
[10:42:07] elukey: thumbor looks fine
[10:44:17] \o/
[10:44:26] let's deploy now!
[10:44:29] * elukey kidding
[10:44:31] sgtm!
[10:44:39] we can schedule sometime next week :)
[10:52:22] :D
[11:30:08] serviceops, DC-Ops, ops-eqiad, Prod-Kubernetes, SRE: kubernetes1051.eqiad.wmnet failed to pull mediawiki images - https://phabricator.wikimedia.org/T369011#9956418 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=53cf057a-4641-401a-ab84-392d5d8f2444) set by cgoubert@cumin1002...
[12:00:18] serviceops, MW-on-K8s, Scap, Patch-For-Review: Evaluate the performance improvements brought in by prefetching MW images on WikiKube hosts - https://phabricator.wikimedia.org/T366778#9956494 (akosiaris) While it's a bit early to gauge this: * there is a 10% in the peaks of docker node pulls the...
[12:17:40] elukey: gateways also look good so far
[12:55:17] hnowlan: niceee
[12:55:27] perfect, we can deploy that as well next week
[12:55:54] if rest/api gateway envoys + wikifeeds mesh envoy work fine, we can set the new version for the mesh's default as well
[12:56:01] but the rollout will take a bit of time
[13:30:56] nice
[18:28:46] serviceops, MediaWiki-Uploading, SRE-swift-storage: Upload errors due to swift failures, 503s - https://phabricator.wikimedia.org/T369388 (TheDJ) NEW
[18:31:03] serviceops, MediaWiki-Uploading, SRE-swift-storage: Upload errors due to swift failures, 503s - https://phabricator.wikimedia.org/T369388#9957443 (TheDJ) p:Triage→Unbreak!
[18:31:42] serviceops, MediaWiki-Uploading, SRE-swift-storage: Upload errors due to swift failures, 503s - https://phabricator.wikimedia.org/T369388#9957441 (TheDJ)
[18:39:14] serviceops, MediaWiki-Uploading, SRE-swift-storage: Upload errors due to swift failures, 503s - https://phabricator.wikimedia.org/T369388#9957445 (TheDJ)
[18:43:09] serviceops, MediaWiki-Uploading, SRE-swift-storage: Upload errors due to swift failures, 503s - https://phabricator.wikimedia.org/T369388#9957452 (andrea.denisse) Thanks for reporting the issue, I'm investigating it and I've shared it with fellow SREs for advice.
[19:00:38] serviceops, SRE, Data Products (Data Products Sprint 16), Patch-For-Review, Service-deployment-requests: Commons Impact Metrics AQS 2.0 Deployment to Staging and Production - https://phabricator.wikimedia.org/T361835#9957464 (mforns)