[09:12:31] Acme-chief, SRE, Traffic-Icebox: Use acme-chief provided OCSP stapling responses - https://phabricator.wikimedia.org/T232988 (Aklapper) @Vgutierrez: Could you please answer the last comment? Thanks in advance!
[09:51:28] Acme-chief: Implement server-side OCSP stapling - https://phabricator.wikimedia.org/T219765 (Vgutierrez)
[09:51:49] Acme-chief, SRE, Traffic-Icebox: Use acme-chief provided OCSP stapling responses - https://phabricator.wikimedia.org/T232988 (Vgutierrez) Open→Resolved Oops.. yeah, let me close this. Cheers!
[10:03:44] Traffic, DC-Ops, SRE, ops-ulsfo, Patch-For-Review: Q1:rack/setup/install ulsfo misc class hosts - https://phabricator.wikimedia.org/T317247 (MoritzMuehlenhoff) ganeti4007 has been added to the ulsfo Ganeti cluster.
[11:46:16] netops, Infrastructure-Foundations, SRE, User-jbond: Sporadic RST drops in the ulogd logs - https://phabricator.wikimedia.org/T238823 (akosiaris)
[14:37:46] XioNoX, topranks: I'm failing to find a reason on phabricator for the latency spike in eqsin beginning on December 14th: https://grafana.wikimedia.org/d/m1LYjVjnz/network-icmp-probes?orgId=1&from=now-7d&to=now&var-site=codfw&var-site=eqiad&var-target_site=drmrs&var-target_site=ulsfo&var-target_site=esams&var-target_site=eqsin&var-role=cr&var-family=All&viewPanel=19
[14:39:14] vgutierrez: Ultimately the reason is we have a packet-switched (E-LINE) service from Arelion to eqsin
[14:39:36] As opposed to the wavelength services we use pretty much everywhere else (which should have consistent latency)
[14:39:45] so whatever magic they do on their side affects our latency
[14:39:50] ack
[14:39:56] The routing on their network can change
[14:40:23] I'm not 100% sure what SLAs / guarantees we have from them on it. There are probably some, let me have a look particularly at this one I'd not observed it myself
[14:40:43] that's a 30% increase right?
[14:41:06] at least for cr2-eqsin
[14:42:35] rtt min/avg/max/mdev = 228.510/228.594/228.664/0.620 ms --> ping from bast5002 to bast2002 over IPv4
[14:42:52] rtt min/avg/max/mdev = 257.752/257.881/258.031/0.373 ms --> ping from bast5002 to bast2002 over IPv6
[14:43:01] :_)
[14:43:16] IPv6 is obviously heavier
[14:45:13] I observed similar before with them, they have ECMP in some places clearly
[14:45:41] I found it wasn't just IPv6 vs IPv4, but different flows (src + dst IP combinations) getting switched out over different paths
[14:45:55] yep
[14:46:01] 30ms difference indicates it's taking a very different path though
[14:46:17] We looked into the option of sourcing a wavelength from them last renewal but the pricing ruled it out
[14:53:21] topranks: did we try finding another provider as well?
[14:53:58] we spoke to a few if I recall, bandwidth in Singapore is quite expensive.
[14:54:34] unfortunately this was all just before we started to focus in on the latency
[14:54:55] so I guess the business case for minimal latency perhaps wasn't made as forcefully as it could have been
[14:55:23] our other path goes via ulsfo, but it doesn't really make sense for us to push the traffic through there, when most of it comes from eqiad/codfw
[14:56:17] I've been digging through the service emails and their portal to see if there is any SLA mentioned in terms of latency
[14:57:09] doesn't seem to be, only SLAs are in terms of bandwidth that I see
[14:58:25] so not really sure if there is much we can do, I can log a ticket to ask about it but I don't believe they are outside of contract
[15:21:15] vgutierrez: task T325651 opened on this
[15:21:22] I've raised a ticket with the carrier too.
[15:21:54] We may not have a guarantee from them in terms of latency, but the discrepancy of some flows versus others is odd so I've focussed on that
[15:21:55] topranks: thx <3
[15:22:14] np. ECMP is common, but really they shouldn't do ECMP in a way which takes a different WAN path between sites
[15:22:30] Different flows going via different routers at a given site, and circuits between, sure
[15:22:35] But 70ms difference is kind of crazy!
[17:07:49] topranks: thanks!
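A rough sketch of the two things discussed above: the gap between the IPv4 and IPv6 ping averages, and how per-flow ECMP hashing can pin different src + dst pairs to different paths. The two rtt summary lines are copied from the log. Everything else is an assumption made for illustration: the function names, the SHA-256 hash, the two-path count, and the documentation-range addresses. Real carrier hardware uses its own hash inputs and algorithm, which we don't know.

```python
import hashlib
import re

# Summary lines copied from the pings in the log (bast5002 -> bast2002).
V4_SUMMARY = "rtt min/avg/max/mdev = 228.510/228.594/228.664/0.620 ms"
V6_SUMMARY = "rtt min/avg/max/mdev = 257.752/257.881/258.031/0.373 ms"


def avg_rtt_ms(summary: str) -> float:
    """Return the avg field from a ping 'rtt min/avg/max/mdev' summary line."""
    match = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms", summary)
    if match is None:
        raise ValueError(f"not a ping summary line: {summary!r}")
    return float(match.group(2))


def ecmp_path(src: str, dst: str, n_paths: int) -> int:
    """Toy per-flow hash (hypothetical): a given (src, dst) pair always
    maps to the same path index, but different pairs may map to different ones."""
    digest = hashlib.sha256(f"{src}|{dst}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


delta_ms = avg_rtt_ms(V6_SUMMARY) - avg_rtt_ms(V4_SUMMARY)
print(f"IPv6 avg is {delta_ms:.3f} ms higher than IPv4")

# Hypothetical flows using documentation address ranges, not real hosts.
flows = [("192.0.2.1", "198.51.100.1"), ("2001:db8::1", "2001:db8::2")]
for src, dst in flows:
    print(src, "->", dst, "uses path", ecmp_path(src, dst, n_paths=2))
```

The averages differ by about 29.3 ms, which matches the "30ms difference" mentioned in the channel. The hash part only shows why the path is stable per flow yet can differ between flows; it says nothing about which path any real flow takes.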