[08:18:57] serviceops, DC-Ops, ops-codfw, SRE: Moving 1G servers out of rack D4 in prep of switch migration - https://phabricator.wikimedia.org/T361856#9897053 (Clement_Goubert) Sure, no problem. Rescheduled.
[08:56:35] serviceops, DC-Ops, ops-codfw: hw troubleshooting: management and main interface down for mw2321.codfw.wmnet - https://phabricator.wikimedia.org/T367702 (Clement_Goubert) NEW
[09:01:53] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9897373 (ayounsi) I don't understand why the need to be moved to get upgraded to 10G. If we take for example wikikube-ctrl2001...
[09:15:03] serviceops, MW-on-K8s, TimedMediaHandler, Patch-For-Review, Video: Port videoscaling to kubernetes - https://phabricator.wikimedia.org/T355292#9897489 (Joe) After thinking a bit more, we can't run all the jobs from the same script directly, because multiversion. So we'd need to `fork()` to ru...
[10:35:19] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897814 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw2323 to wikikube-worker2003 completed: - mw2323 (**PASS**) - ✔️ Downtimed ho...
[10:40:04] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897833 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw2324 to wikikube-worker2004 completed: - mw2324 (**PASS**) - ✔️ Downtimed ho...
[10:45:54] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897887 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw2326 to wikikube-worker2007 completed: - mw2326 (**PASS**) - ✔️ Downtimed ho...
[10:50:43] serviceops, DC-Ops, ops-codfw, SRE: hw troubleshooting: management and main interface down for mw2321.codfw.wmnet - https://phabricator.wikimedia.org/T367702#9897889 (ops-monitoring-bot) Icinga downtime and Alertmanager silence (ID=5a0a3114-e7df-43df-8946-f917148b1d30) set by cgoubert@cumin1002 f...
[10:50:59] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897890 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw2327 to wikikube-worker2008 completed: - mw2327 (**PASS**) - ✔️ Downtimed ho...
[10:52:24] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9897892 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-ctrl2003.co...
[10:57:16] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897894 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw2328 to wikikube-worker2009 completed: - mw2328 (**PASS**) - ✔️ Downtimed ho...
[11:01:21] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9897897 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by kamila@cumin1002 for host wikikube-ctrl2003.co...
[11:07:24] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897930 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw2329 to wikikube-worker2010 completed: - mw2329 (**PASS**) - ✔️ Downtimed ho...
[11:07:44] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897933 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker2003.codfw.wmnet with OS bullseye
[11:08:22] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897934 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker2004.codfw.wmnet with OS bullseye
[11:08:39] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897937 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker2007.codfw.wmnet with OS bullseye
[11:09:06] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897938 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker2008.codfw.wmnet with OS bullseye
[11:09:29] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897939 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker2009.codfw.wmnet with OS bullseye
[11:09:44] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9897941 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker2010.codfw.wmnet with OS bullseye
[11:47:59] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9898064 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by kamila@cumin1002 for host wikikube-ctrl2003.codfw....
[11:49:33] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9898066 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker2004.codfw.wmnet with OS bullseye completed: - wikikube-work...
[11:51:32] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9898070 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker2007.codfw.wmnet with OS bullseye completed: - wikikube-work...
[11:51:54] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9898071 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker2003.codfw.wmnet with OS bullseye completed: - wikikube-work...
[11:54:38] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9898077 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker2009.codfw.wmnet with OS bullseye completed: - wikikube-work...
[11:57:02] serviceops, SRE, Data Products (Data Products Sprint 15), Patch-For-Review, Service-deployment-requests: Commons Impact Metrics AQS 2.0 Deployment to Staging and Production - https://phabricator.wikimedia.org/T361835#9898086 (SGupta-WMF) @hnowlan The service does have any api spec , which the...
[11:58:46] serviceops, SRE, Data Products (Data Products Sprint 15), Patch-For-Review, Service-deployment-requests: Commons Impact Metrics AQS 2.0 Deployment to Staging and Production - https://phabricator.wikimedia.org/T361835#9898091 (SGupta-WMF) @Scott_French The CI pipeline is ready , can you have a...
[11:58:56] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9898092 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker2010.codfw.wmnet with OS bullseye completed: - wikikube-work...
[12:04:39] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9898117 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker2008.codfw.wmnet with OS bullseye completed: - wikikube-work...
[12:06:52] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9898123 (MoritzMuehlenhoff)
[12:15:40] serviceops, DC-Ops, ops-codfw, Prod-Kubernetes, Kubernetes: Relabel codfw kubernetes nodes - https://phabricator.wikimedia.org/T367736 (Clement_Goubert) NEW
[13:06:19] Folks we got some netbox inconsistency report errors for the 'wikikube-worker' nodes, calling out that they've no DNS records for IPv6
[13:06:38] I assume that is expected and we need to add that hostname prefix to the list of exceptions we have in the report??
[13:06:46] no probs to fix just want to make sure
[13:18:18] there are probably some of the older mw nodes that were added before we were adding AAAA RRs to everything
[13:18:32] * akosiaris searching task
[13:20:18] serviceops, Data-Platform-SRE, Wikidata, wmde-wikidata-tech, Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9898564 (Gehel) p:Triage→High
[13:21:35] serviceops, Data-Platform-SRE, Wikidata, wmde-wikidata-tech, Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9898558 (Gehel)
[13:37:14] topranks: yeah saw that this morning, I fixed a few manually but there are too many
[13:37:59] we could probably add AAAA records to all of them, even if they're still bare-metal appservers, so that we don't have issues when renaming them
[13:38:02] claime, akosiaris: ah ok thanks for the info
[13:38:11] topranks: Which are missing AAAA btw?
[13:38:32] yep leave that to me I can script it quickly with the netbox api
[13:38:38] one sec
[13:38:40] cool <3
[13:40:24] https://phabricator.wikimedia.org/T271142
[13:40:30] wikikube-worker1001, 1008-1010 were highlighted in the latest report
[13:43:11] thanks Alex, I’ll also compare our list of “exclusions” in the report see if any can be removed based on the latest work in that task
[13:43:14] Nice one
[13:43:16] topranks: ok can you fix these please, and give me the netbox python code to do it so I can integrate it to our rename/reimage script? that way we won't have to bother you, and I won't have to click around
[13:43:31] I bet the exclusion list is based on the names right?
[13:43:43] yeah, it's the renaming that triggered this
[13:43:47] so we don't have reports for the mw* nodes that were reimaged to k8s without AAAA records
[13:43:53] yeah it’s a simple string comp, name.startswith
[13:44:22] otherwise they were probably excluded when named mw*
[13:44:30] yeah so there's probably a bunch of mw* nodes that don't have AAAA
[13:44:36] now that they are all k8s nodes, we know we can just fix this without any risk
[13:44:52] well they're not all k8s nodes just yet :D
[13:44:57] true
[13:45:20] but maybe we can just run once the script for all mw* and parse* nodes and just finish this in 1 go?
[13:45:24] but idk if there really is a risk to add AAAA records to baremetal servers, is there?
[13:45:31] instead of adding it to the rename script that is
[13:45:32] Ah, memecached
[13:45:40] apt typo
[13:46:25] all mc* hosts are with AAAA records now. And, as far as memcached protocol goes, it's useless since mcrouter addresses them over IPv4
[13:46:36] so there isn't any risk on that front
[13:47:11] akosiaris: ok yep good to know
[13:47:12] so agreed we should add AAAA records to all mw* nodes, even those that are still baremetal?
[13:47:29] yeah, I think at the traffic levels we have on baremetal now, we are ok
[13:47:37] and yes, sounds like perhaps the rename cookbook should have an option to "add v6 dns", or a separate cookbook for it
[13:48:17] if it's safe to do we can do a one-off script to add the AAAAs to all existing nodes of a given type
[13:48:23] topranks: honestly right now, I think the simplest is if you can run your netbox-fu to add AAAA records to all mw* and parse* servers that don't have them, as well as wikikube-worker
[13:48:25] yep, that :D
[13:48:27] <3
[13:49:31] I'm going to reimage a few more appservers in eqiad this afternoon, and if I can squeeze my deployment of statsd today or tomorrow, we'll probably be at 100% external traffic before the end of the week
[13:49:34] 👍
[13:49:36] to k8s
[13:50:43] serviceops, Dumps-Generation, Infrastructure-Foundations, SRE-tools, IPv6: Some Service Operations clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271142#9898774 (akosiaris)
[13:50:45] claime: ok I'll go ahead and do that, easy to do
[13:50:51] thanks a bunch <3
[13:51:04] few meetings first I should get to it later today
[13:51:43] ok then I'll go ahead and do my renames/reimages and do the AAAA records for them manually
[13:53:26] serviceops, Dumps-Generation, Infrastructure-Foundations, SRE-tools, IPv6: Some Service Operations clusters apparently do not support IPv6 - https://phabricator.wikimedia.org/T271142#9898785 (akosiaris) Open→Resolved a:akosiaris I've removed * dumpsdata[1001-1003].eqiad.wmn...
[14:14:31] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9898890 (Papaul) @ayounsi wikikube-ctrl2001 is racked on u13 if we move it and plug it in port 44-47 it will mess up the hard w...
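Editorial sketch, not from the channel: the 13:43 to 13:48 exchange describes a report that skips hosts by name prefix ("a simple string comp, name.startswith"), which is why renaming mw* hosts to wikikube-worker* made them show up as missing AAAA records. The Python below illustrates that idea only. The `Host` record, the `EXCLUDED_PREFIXES` values and both function names are assumptions made for illustration; the real report code, its exclusion list and the Netbox API calls are not shown in the log.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Hypothetical exclusion prefixes; the real list in the report is not shown in the log.
EXCLUDED_PREFIXES: Tuple[str, ...] = ("mw", "parse")


@dataclass(frozen=True)
class Host:
    """Minimal stand-in for an inventory record (not a Netbox object)."""
    name: str
    ipv6_dns_name: Optional[str]  # None or "" means no AAAA record exists


def is_excluded(name: str, prefixes: Tuple[str, ...] = EXCLUDED_PREFIXES) -> bool:
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return name.startswith(prefixes)


def missing_aaaa(hosts: Iterable[Host],
                 prefixes: Tuple[str, ...] = EXCLUDED_PREFIXES) -> List[str]:
    """Return sorted names of non-excluded hosts that have no IPv6 DNS name."""
    return sorted(
        h.name for h in hosts
        if not h.ipv6_dns_name and not is_excluded(h.name, prefixes)
    )


if __name__ == "__main__":
    inventory = [
        Host("mw2323", None),               # excluded by prefix, so never reported
        Host("wikikube-worker2003", None),  # same kind of host after a rename: reported
        Host("wikikube-worker2004", "wikikube-worker2004.codfw.wmnet"),
    ]
    print(missing_aaaa(inventory))
```

Passing an empty tuple as `prefixes` lists every host without an IPv6 DNS name, which is roughly the set a one-off "add AAAA to all mw* and parse* servers" run would need to cover.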
[14:44:07] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9899054 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by kamila@cumin1002 for hosts: `wikikube-ctrl2001.codfw....
[14:51:32] serviceops, DC-Ops, ops-eqiad: hw troubleshooting: firmware upgrade for mw1359.eqiad.wmnet, mw1364.eqiad.wmnet, mw1365.eqiad.wmnet, mw1412.eqiad.wmnet - https://phabricator.wikimedia.org/T367766 (Clement_Goubert) NEW
[15:05:35] serviceops, collaboration-services, Infrastructure-Foundations, Puppet-Core, and 5 others: Migrate roles to puppet7 - https://phabricator.wikimedia.org/T349619#9899198 (Gehel)
[15:07:45] serviceops, Infrastructure-Foundations, Data-Platform-SRE (2024.06.17 - 2024.07.07), Patch-For-Review: Create a helm chart for the cloudnativepg postgresql operator - https://phabricator.wikimedia.org/T364797#9899202 (Gehel)
[15:16:08] serviceops, MW-on-K8s, TimedMediaHandler, Patch-For-Review, Video: Port videoscaling to kubernetes - https://phabricator.wikimedia.org/T355292#9899340 (Joe) And further thought made me realize this would re-present the issue of running old code and/or what to do with pods when there's a relea...
[15:29:05] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9899430 (ops-monitoring-bot) cookbooks.sre.hosts.decommission executed by kamila@cumin1002 for hosts: `wikikube-ctrl2002.codfw....
[15:33:10] serviceops, DC-Ops, ops-codfw, SRE-OnFire, Sustainability (Incident Followup): codfw:(3) wikikube-ctrl NIC upgrade to 10G - https://phabricator.wikimedia.org/T366205#9899455 (kamila) wikikube-ctrl1001 looks happy, thanks for the help! I have decommed wikikube-ctrl1001 and 1002, they're good...
[15:33:55] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9899458 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw1444 to wikikube-worker1019 completed: - mw1444 (**PASS**) - ✔️ Downtimed ho...
[15:40:23] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9899473 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw1447 to wikikube-worker1020 completed: - mw1447 (**PASS**) - ✔️ Downtimed ho...
[15:45:09] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9899496 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.rename started by cgoubert@cumin1002 from mw1489 to wikikube-worker1021 completed: - mw1489 (**PASS**) - ✔️ Downtimed ho...
[15:46:30] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9899501 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker1019.eqiad.wmnet with OS bullseye
[15:46:49] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9899503 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker1020.eqiad.wmnet with OS bullseye
[15:47:07] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9899506 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker1021.eqiad.wmnet with OS bullseye
[15:55:14] serviceops, Data-Platform-SRE, Wikidata, wmde-wikidata-tech, Discovery-Search (Current work): Request permission to create 4 kafka topics in kafka-main (WDQS graph split) - https://phabricator.wikimedia.org/T367510#9899556 (Clement_Goubert) Disk usage looks ok (all servers are around 30% usag...
[16:10:44] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9899649 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker1019.eqiad.wmnet with OS bullseye executed with errors: - wi...
[16:10:48] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9899665 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin1002 for host wikikube-worker1019.eqiad.wmnet with OS bullseye
[16:48:10] serviceops, DC-Ops, ops-eqiad, SRE: hw troubleshooting: firmware upgrade for mw1359.eqiad.wmnet, mw1364.eqiad.wmnet, mw1365.eqiad.wmnet, mw1412.eqiad.wmnet - https://phabricator.wikimedia.org/T367766#9899861 (Clement_Goubert)
[16:51:15] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9899867 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker1020.eqiad.wmnet with OS bullseye completed: - wikikube-work...
[16:55:47] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9899876 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker1019.eqiad.wmnet with OS bullseye completed: - wikikube-work...
[16:59:05] serviceops, MW-on-K8s: Move servers from the appserver/api cluster to kubernetes - https://phabricator.wikimedia.org/T351074#9899894 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin1002 for host wikikube-worker1021.eqiad.wmnet with OS bullseye completed: - wikikube-work...
[17:06:02] serviceops, DC-Ops, ops-eqiad, Prod-Kubernetes, Kubernetes: Relabel eqiad kubernetes nodes - https://phabricator.wikimedia.org/T367789 (Clement_Goubert) NEW
[20:50:57] serviceops, DC-Ops, ops-codfw, SRE: hw troubleshooting: management and main interface down for mw2321.codfw.wmnet - https://phabricator.wikimedia.org/T367702#9901172 (Jdforrester-WMF) Could mw2321 be de-pooled? It's still in the scap target list: ` 20:49:07 /usr/bin/sudo /usr/local/sbin/mediawik...
[21:06:31] serviceops, DC-Ops, ops-codfw, SRE: hw troubleshooting: management and main interface down for mw2321.codfw.wmnet - https://phabricator.wikimedia.org/T367702#9901260 (Dzahn) mw2321 is already depooled=inactive in confctl. I think the issue that this is isn't sufficient to make it disappear for de...
[22:48:12] serviceops, SRE, Data Products (Data Products Sprint 15), Patch-For-Review, Service-deployment-requests: Commons Impact Metrics AQS 2.0 Deployment to Staging and Production - https://phabricator.wikimedia.org/T361835#9901569 (Scott_French) @SGupta-WMF - Thanks for letting me know. Given your...
[23:28:54] serviceops, Cassandra, Data Products, SRE, and 2 others: Commons Impact Metrics: Data Gateway endpoints - https://phabricator.wikimedia.org/T364921#9901692 (Scott_French) From T366851: We now understand the slow-client-startup issue to be the result of connection timeouts when new(er) versions of...
[23:31:26] serviceops, Cassandra, Data Products, SRE, and 2 others: Commons Impact Metrics: Data Gateway endpoints - https://phabricator.wikimedia.org/T364921#9901698 (Scott_French)