[09:04:54] serviceops, Platform Team Initiatives (API Gateway): Update API gateway to newer version of Envoy - https://phabricator.wikimedia.org/T324130 (JMeybohm) p:Triage→High
[09:06:09] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Use cert-manager for service-proxy certificate creation - https://phabricator.wikimedia.org/T300033 (JMeybohm) Open→Stalled
[09:06:36] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Use cert-manager for service-proxy certificate creation - https://phabricator.wikimedia.org/T300033 (JMeybohm)
[10:37:22] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074 (Joe)
[10:38:39] serviceops, MW-on-K8s, SRE, Traffic, and 2 others: Migrate mobileapps to k8s - https://phabricator.wikimedia.org/T350846 (Joe) p:Triage→High
[11:05:19] serviceops, iPoid-Service, Trust and Safety Product Sprint (Sprint Bodhrán): [M] Write CronJob configuration - https://phabricator.wikimedia.org/T346861 (jijiki)
[11:07:33] serviceops, iPoid-Service, Puppet: Rename FEED_API_KEY - https://phabricator.wikimedia.org/T350903 (jijiki)
[12:18:39] serviceops, MW-on-K8s, SRE, Traffic, and 2 others: Migrate mobileapps to k8s - https://phabricator.wikimedia.org/T350846 (Joe) I decided we should move about 10% of the mobileapps traffic at a time; that means about 300 rps, which I think we should be able to serve moving over about 2-3 api serve...
[15:02:16] serviceops, ChangeProp, EventStreams, Image-Suggestion-API, and 5 others: Migrate node-based services in production to node12 - https://phabricator.wikimedia.org/T290750 (Jdforrester-WMF)
[15:02:35] serviceops, SRE, API Platform (RESTbase Deprecation Roadmap), Patch-For-Review: Migrate node-based services in production to node14 - https://phabricator.wikimedia.org/T306995 (Jdforrester-WMF)
[15:02:46] serviceops, API Platform (RESTbase Deprecation Roadmap), Patch-For-Review: Migrate node-based services in production to node16 - https://phabricator.wikimedia.org/T308371 (Jdforrester-WMF)
[15:32:34] _joe_: https://phabricator.wikimedia.org/T326002#9322464
[15:32:34] > I wonder if we should add retries to the envoy proxy to mw-api-int-async? This would affect all services that use that listener proxy though. @Joe ?
[16:24:26] serviceops, MW-on-K8s, MediaWiki-Platform-Team, Patch-For-Review: mcrouter daemonset on mw-on-k8s - https://phabricator.wikimedia.org/T346690 (Krinkle) a:jijiki→DAlangi_WMF
[16:44:22] <_joe_> ottomata: I'd rather create a separate endpoint for read-only requests where to add retries
[16:44:48] <_joe_> where we enforce going to the -ro endpoint, amongst other things
[16:45:14] _joe_: that sounds good
[16:45:24] we used to have an api-ro right?
[16:45:26] _joe_: mw-api-int-async-ro exists already
[16:45:30] ah
[16:45:32] looking
[16:45:35] maybe we are using the wrong one
[16:45:51] - mw-api-int-async # used for EventStreamConfig API endpoint.
[16:46:43] claime: _joe_ should we add retries to that one?
[16:46:54] <_joe_> no
[16:47:09] so we should have a new one mw-api-int-async-ro-with-retries ?
[16:47:10] <_joe_> ah sorry you mean the one claime named?
[16:47:14] yes
[16:47:18] <_joe_> then yes
[16:47:26] okay making patch...
[16:47:43] <_joe_> but I'd like to also look into adding a filter for non-idempotent HTTP methods maybe, but not now :)
[16:47:49] :)
[16:49:45] https://gerrit.wikimedia.org/r/c/operations/puppet/+/973835
[16:52:01] _joe_: okay to merge? ^
[16:53:26] Those are the services currently having this endpoint set up: cirrus-streaming-updater rdf-streaming-updater recommendation-api-ng
[16:53:29] <_joe_> +1'd
[16:53:43] <_joe_> claime: it should be ok to allow retries
[16:53:47] yep
[16:53:55] okay cool, FYI dcausse pfischer ^^ just in case, i'll CC them on patch too
[16:53:56] ty
[16:55:39] serviceops, Prod-Kubernetes, Kubernetes, User-jijiki: Deploy kube-state-metrics - https://phabricator.wikimedia.org/T264625 (kamila) With @JMeybohm's suggestion (T264625#9324445) we are at around **60k timeseries**.
[16:55:46] qq, since all the mw-api listeners use the same port, the intention is that their use is mutually exclusive, right? I don't need config changes for my service since it already uses port 6500?
[16:55:59] just need to change the listener name ?
[17:03:02] claime: ^ ?
[17:07:41] ottomata: yeah (mwapi-async|mw-api-int-async|mw-api-int-async-ro) use port 6500, so you should just have to change the name
[17:08:21] cool ty.
[17:42:48] serviceops, MediaWiki-Platform-Team: Benchmark baremetal vs k8s mediawiki perf (2023) - https://phabricator.wikimedia.org/T333269 (Krinkle) Open→Stalled
[17:42:56] serviceops, MW-on-K8s, SRE, Traffic, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536 (Krinkle)
[19:27:10] serviceops, CirrusSearch, MediaWiki-Configuration, MediaWiki-Engineering, Discovery-Search (Current work): Provide a method for internal services to run api requests for private wikis - https://phabricator.wikimedia.org/T345185 (EBernhardson) Hmm, with the jobs providing a response directly i...
[20:11:34] serviceops, AQS2.0, Cassandra, SRE, Service-deployment-requests: AQS 2.0 differentially private pageviews deploy API - https://phabricator.wikimedia.org/T343855 (Htriedman) Any updates on this?