[11:43:57] I need 10' to get to my laptop
[11:45:08] I'm here
[11:47:42] me too
[11:48:04] so it is TransitPeeringTransportOutSaturation network sre (cr1-codfw:9804 Transport: cr3-eqsin:xe-0/1/0 (Arelion, IC-331929 200ms EVPN) {#11991_12273-3} xe-1/0/1:2 gnmi codfw)
[11:48:14] https://grafana.wikimedia.org/d/5p97dAASz/queue-and-error-stats-by-network-device?orgId=1&from=now-6h&to=now&timezone=utc&var-site=000000022&var-device=cr3-eqsin:9804&var-interface=$__all&refresh=30s
[11:48:17] visible here
[11:49:18] I have been experimenting a bit with this dashboard
[11:49:18] https://grafana-rw.wikimedia.org/d/abb02966-5ee7-48dc-8d81-2163492ad3d7/scraper-square-one
[11:50:04] ah nice, at least we can rule out a lot of stuff right away
[11:50:19] https://grafana-rw.wikimedia.org/d/7d07a703-4ea9-40b8-b9ae-ae218e53ff15/upload-square-one
[11:51:13] it looks like every other link has reduced traffic
[11:52:26] it is again eqsin and magru like 2 days ago
[11:52:54] it is just that this time
[11:53:31] saturation recovered
[11:53:34] ok let me see if I can work out a pattern from turnilo
[11:59:21] I have this that matches the timing: https://w.wiki/HPck
[12:01:04] yes
[12:01:11] (lets cont on _sec