[01:12:16] serviceops, DC-Ops, ops-codfw, SRE: Q4:rack/setup/install kafka-main200[6789] & kafka-main2010 - https://phabricator.wikimedia.org/T363209#9802674 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin2002 for host kafka-main2009.codfw.wmnet with OS bullseye executed...
[06:35:38] Anyone with knowledge of Blubber that can do a review ?
[06:36:03] Usage of Blubber, not development on Blubber that is
[07:46:41] serviceops, CirrusSearch, Discovery-Search (Current work), Patch-For-Review: Implement global ratelimiting in our service mesh - https://phabricator.wikimedia.org/T362310#9803215 (JMeybohm) `Successfully published image docker-registry.discovery.wmnet/ratelimit:9.0.2-20240503.3fcc360`, supporting...
[08:32:15] dcausse: I will do a deployment on rdf-streaming updater, is that ok? it is to remove the old stuff
[08:32:25] effie: sure
[08:32:29] tx
[08:32:39] on staging there is a
[08:32:41] - restartNonce: 3
[08:32:41] + restartNonce: 1
[08:32:43] is that cool ?
[08:32:53] effie: yes you can ignore this
[08:32:59] excellent
[08:33:04] serviceops, Prod-Kubernetes, Data-Platform-SRE (2024.05.06 - 2024.05.26), Kubernetes, Patch-For-Review: Allow to address Kubernetes API servers from NetworkPolicy - https://phabricator.wikimedia.org/T287491#9803342 (BTullis)
[09:15:56] o/ I need to run a backfill for wikidata, might cause an additional ~400rps on mw-api-int@eqiad for about 6hours, looking at grafana I believe that the cluster can handle it but please let me know if you have concerns about this
[09:24:21] serviceops, Prod-Kubernetes, Data-Platform-SRE (2024.05.06 - 2024.05.26), Kubernetes, Patch-For-Review: Allow to address Kubernetes API servers from NetworkPolicy - https://phabricator.wikimedia.org/T287491#9803609 (jijiki) @klausman you can go head with `kserve` and `knative-serving` when y...
[09:26:02] jelto: hi, I have merged a change for the Gitlab trusted runners for brouberol but I realized I have no idea how to actually deploy it. Do you know how to do that ? The merged patch was https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/commit/c1d5e1637579bbcd865418c4f531c94620b3e8a5
[09:26:23] dcausse: should be all right
[09:26:54] claime: thanks, starting it
[09:29:08] hashar: you just have to run the manual job here: https://gitlab.wikimedia.org/repos/releng/gitlab-trusted-runner/-/jobs/262361
[09:29:08] Btw we have a public wikimedia-sre-collab irc channel now
[09:31:09] serviceops, Prod-Kubernetes, Data-Platform-SRE (2024.05.06 - 2024.05.26), Kubernetes, Patch-For-Review: Allow to address Kubernetes API servers from NetworkPolicy - https://phabricator.wikimedia.org/T287491#9803623 (klausman) >>! In T287491#9803609, @jijiki wrote: > @klausman you can go head...
[09:33:34] jelto: ah excellent thank you
[09:41:11] serviceops: docker-reporter-base-images.service failed on build2001 - https://phabricator.wikimedia.org/T364931#9803682 (Clement_Goubert)
[09:41:15] serviceops, Infrastructure-Foundations, Release-Engineering-Team: Deprecate buster-backports - https://phabricator.wikimedia.org/T362518#9803683 (Clement_Goubert)
[09:47:06] serviceops: docker-reporter-base-images.service failed on build2001 - https://phabricator.wikimedia.org/T364931#9803694 (Clement_Goubert) p:Triage→Low I will update the parent task with the leftover images. In the meantime I can silence it until I'm back from vacation and will take care of it once ba...
[09:48:23] serviceops, Infrastructure-Foundations, Release-Engineering-Team: Deprecate buster-backports - https://phabricator.wikimedia.org/T362518#9803705 (Clement_Goubert) Still left to either remove or rebuild: * base images ` docker-registry.wikimedia.org/docker-gc:1.0.0-20230402 [FAIL] docke...
[09:53:15] serviceops: docker-reporter-base-images.service failed on build2001 - https://phabricator.wikimedia.org/T364931#9803735 (Clement_Goubert) Created silence `d57b75c8-7b05-4651-8ec5-4bdb13d464f7` until Monday May 27th
[10:45:23] serviceops, Wikimedia-Apache-configuration, Wikimedia-Site-requests, Patch-For-Review: Temporarily redirect sgs.wikipedia.org to bat-smg.wikipedia.org until bat-smg->sgs move can be done - https://phabricator.wikimedia.org/T204830#9803912 (Clement_Goubert) The redirect from sgs.wikipedia.org to b...
[12:14:42] serviceops: Provide nodejs20 base images for production - https://phabricator.wikimedia.org/T362681#9804236 (KartikMistry) It seems the image has some issues with dependencies. See: https://integration.wikimedia.org/ci/job/cxserver-pipeline-test/582/console cc @MoritzMuehlenhoff
[12:19:32] serviceops, Prod-Kubernetes, Kubernetes, Patch-For-Review: Co-locate kube-apiserver and etcd on new staging control plane nodes - https://phabricator.wikimedia.org/T363307#9804242 (JMeybohm)
[12:44:57] serviceops: Provide nodejs20 base images for production - https://phabricator.wikimedia.org/T362681#9804306 (Jdforrester-WMF) >>! In T362681#9804236, @KartikMistry wrote: > It seems the image has some issues with dependencies. See: https://integration.wikimedia.org/ci/job/cxserver-pipeline-test/582/conso...
[13:06:57] serviceops, Wikimedia-Apache-configuration, Wikimedia-Site-requests, Patch-For-Review: Temporarily redirect sgs.wikipedia.org to bat-smg.wikipedia.org until bat-smg->sgs move can be done - https://phabricator.wikimedia.org/T204830#9804353 (Fomafix) Open→Resolved a:Fomafix
[13:11:08] serviceops, Machine-Learning-Team, Patch-For-Review: Rename the envoy's uses_ingress option to sets_sni - https://phabricator.wikimedia.org/T346638#9804401 (JMeybohm)
[14:32:19] hi serviceops. for GETs against sessionstore that begin with /sessions/v1/enwiki%3AMWSession%3A ... is the part that follows PII?
[14:36:17] good question - I suspect yes but urandom would know for sure
[14:55:45] serviceops, Infrastructure-Foundations, Release-Engineering-Team: Deprecate buster-backports - https://phabricator.wikimedia.org/T362518#9805179 (dancy) >>! In T362518#9803705, @Clement_Goubert wrote: > Still left to either remove or rebuild: > * base images > ` > docker-registry.wikimedia.org/docke...
[14:56:34] serviceops, MW-Interfaces-Team, Traffic: map the /api/ prefix to /w/rest.php - https://phabricator.wikimedia.org/T364400#9805184 (daniel) a:daniel
[15:01:38] cdanis: it's a mediawiki session, so I would say: almost certainly yes
[15:01:45] cool, thanks
[15:02:02] that's one reason for sessionstore being its own isolated cluster
[15:05:51] I might redact it from tracing information, not sure yet
[15:05:57] certainly I won't display it at a high level there
[15:08:15] auh, yeah
[15:08:42] it isn't hard at all to write and deploy such rules, btw
[15:08:48] I'll probably use this one as an example
[16:32:32] serviceops: [API Gateway] Get insight into proxy time for Envoy - https://phabricator.wikimedia.org/T297222#9805744 (Aklapper) a:hnowlan→None @hnowlan: Removing task assignee as this open task has been assigned for more than two years - see the email sent to all task assignees on 2024-04-15. Please...
[16:33:18] serviceops, MediaWiki-Core-JobQueue, WMF-JobQueue: Enable MW REST API on job runners and video scalers (for the new rest.php job executor) - https://phabricator.wikimedia.org/T246389#9805758 (Aklapper) a:hnowlan→None @hnowlan: Removing task assignee as this open task has been assigned for mo...
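Note on the sessionstore redaction discussed at 15:05-15:08: a minimal sketch of what such a rule might look like, assuming the goal is to hide whatever follows the `/sessions/v1/<wiki>%3AMWSession%3A` prefix before a path is recorded in a trace. The log does not say which tracing system or rule syntax is used, so the function name, the regex, and the `REDACTED` placeholder are all hypothetical.

```python
import re

# Hypothetical pattern (not taken from the log): keep the key prefix
# /sessions/v1/<wiki>%3AMWSession%3A and match everything after it,
# up to the next "/", "?" or "#". Accepts a literal ":" as well as "%3A".
_SESSION_KEY = re.compile(
    r"^(/sessions/v1/[^/%:]+(?:%3A|:)MWSession(?:%3A|:))[^/?#]+",
    re.IGNORECASE,
)


def redact_session_path(path: str, placeholder: str = "REDACTED") -> str:
    """Return path with the session identifier replaced by placeholder.

    Paths that do not match the sessionstore key prefix are returned unchanged.
    """
    # A function replacement avoids backslash handling in the placeholder.
    return _SESSION_KEY.sub(lambda m: m.group(1) + placeholder, path)


if __name__ == "__main__":
    # The identifier below is made up for illustration.
    print(redact_session_path("/sessions/v1/enwiki%3AMWSession%3Aabc123"))
```

Keeping the wiki name and the `MWSession` marker while dropping only the identifier preserves enough of the path to group traces per wiki, which is one possible design choice among several.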
[16:33:25] serviceops, API Platform, Patch-Needs-Improvement: API Gateway has missed its write latency SLO - https://phabricator.wikimedia.org/T294445#9805754 (Aklapper) a:hnowlan→None @hnowlan: Removing task assignee as this open task has been assigned for more than two years - see the email sent to a...
[16:33:54] serviceops, Beta-Cluster-Infrastructure, Patch-Needs-Improvement: Implement API Gateway solution for deployment-prep - https://phabricator.wikimedia.org/T254917#9805756 (Aklapper) a:hnowlan→None @hnowlan: Removing task assignee as this open task has been assigned for more than two years - se...
[16:37:09] serviceops, MediaWiki-Core-JobQueue, WMF-JobQueue: Enable MW REST API on job runners and video scalers (for the new rest.php job executor) - https://phabricator.wikimedia.org/T246389#9805825 (hnowlan) Open→Resolved a:hnowlan Resolved as part of k8s migration
[17:00:45] serviceops: Provide nodejs20 base images for production - https://phabricator.wikimedia.org/T362681#9806185 (KartikMistry) >>! In T362681#9804306, @Jdforrester-WMF wrote: >>>! In T362681#9804236, @KartikMistry wrote: >> It seems the image has some issues with dependencies. See: https://integration.wikime...
[17:08:24] serviceops, envoy: Using port in Host header for thanos-swift / thanos-query breaks vhost selection - https://phabricator.wikimedia.org/T300119#9806265 (Aklapper) a:Joe→None @Joe: Removing task assignee as this open task has been assigned for more than two years - see the email sent to all task a...
[17:12:49] serviceops: Provide nodejs20 base images for production - https://phabricator.wikimedia.org/T362681#9806364 (Jdforrester-WMF)
[22:44:20] serviceops, Cassandra, SRE, Data Products (Data Products Sprint 13), and 2 others: Commons Impact Metrics: Data Gateway endpoints - https://phabricator.wikimedia.org/T364921#9807339 (Scott_French) Open→In progress
[23:09:46] serviceops, Cassandra, SRE, Data Products (Data Products Sprint 13), and 2 others: Commons Impact Metrics: Data Gateway endpoints - https://phabricator.wikimedia.org/T364921#9807379 (Scott_French) Many thanks for getting the image builds running and setting up the data_gateway role, @Eevans. Wit...