[05:41:02] <jinxer-wm>	 (HAProxyEdgeTrafficDrop) firing: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[05:43:55] <wikibugs>	 Traffic: upstream connect error or disconnect/reset before headers. reset reason: overflow - https://phabricator.wikimedia.org/T307647 (AlexisJazz)
[05:45:56] <jinxer-wm>	 (HAProxyEdgeTrafficDrop) firing: (2) 68% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop  - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[05:47:51] <wikibugs>	 Traffic, Wikimedia-production-error: upstream connect error or disconnect/reset before headers. reset reason: overflow - https://phabricator.wikimedia.org/T307647 (AlexisJazz)
[05:56:01] <jinxer-wm>	 (HAProxyEdgeTrafficDrop) firing: (2) 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop  - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[06:00:56] <jinxer-wm>	 (HAProxyEdgeTrafficDrop) resolved: (2) 66% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop  - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[07:40:56] <jinxer-wm>	 (HAProxyEdgeTrafficDrop) firing: 65% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqiad&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[07:45:56] <jinxer-wm>	 (HAProxyEdgeTrafficDrop) resolved: 68% request drop in text@eqiad during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqiad&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[07:56:08] <godog>	 bblack vgutierrez re: ncredir probe, should the service be non-paging altogether ? and/or the probedown page was useful/informative ?
[08:47:02] <wikibugs>	 Traffic, SRE, Patch-For-Review, Upstream: HAProxy 2.4.16 shows internal errors on text cluster - https://phabricator.wikimedia.org/T307444 (Vgutierrez) Issue fixed by upstream on https://git.haproxy.org/?p=haproxy-2.4.git;a=commit;h=f9a0f51d3bfa37993935754508e7c88b2e69c9ed
[16:20:59] <wikibugs>	 Traffic, Analytics, Data-Engineering, Data-Engineering-Kanban, and 3 others: Maxmind: GeoIP Download Failed - https://phabricator.wikimedia.org/T302864 (Dzahn)
[17:07:15] <bblack>	 godog: yeah, I don't think it should be paging.  it's fine to be an IRC alert though!
[17:21:03] <cdanis>	 been keeping an eye on various dashboards after merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/789219
[17:21:47] <cdanis>	 things look good overall -- there's a noticeable increase in CPU in esams but it's from ~0.8sec/sec to ~1.3sec/sec so it's not exactly concerning :)
[17:21:59] <cdanis>	 interestingly it also ... seemed to decrease ttfb on that host?
[17:23:37] <bblack>	 cdanis: first I've seen of it, is there anything more than the ticket?
[17:23:59] <cdanis>	 bblack: a bunch of discussions with Valentin, but that's basically it
[17:24:04] <cdanis>	 I'll expand the ticket today
[17:24:37] <bblack>	 ok thanks! :)
[17:25:13] <cdanis>	 the tldr is that, when we do see cachebusting attacks, they tend to come from a small number of IP addresses relative to overall traffic (on the order of hundreds or single-digit thousands) -- so identifying abusive behavior seems doable
[17:25:44] <bblack>	 are there legit patterns that get caught in the crossfire?
[17:25:57] <cdanis>	 that patch has haproxy internally track, for each client IP, # of miss+pass/10 seconds, and # of new TCP connections/10 seconds
[17:26:01] <bblack>	 eh maybe I shouldn't ask, I'm really not prepared to deep-dive on this at the moment! :)
[17:26:35] <cdanis>	 it doesn't do anything with the data yet, aside from I am dumping them to local disk every minute or so, and later we can start thinking up thresholds (which we'd only act upon if we knew we were saturating on traffic)
[17:28:06] <cdanis>	 of course I really want any future state of this to not impact legit traffic, but I also think you can argue that it isn't *so* bad if the impact is only during windows where we would be suffering anyway, and overall this makes those windows rarer and shorter
[17:28:43] <cdanis>	 but as for now, assuming I didn't introduce a glaring performance regression into these four machines around the fleet (which it looks like I definitely did not), there's no impact on any traffic
[17:29:39] <bblack>	 sounds pretty useful anyways!
[17:31:27] <_joe_>	 cdanis: that's fantastic, I was worried esams traffic would expose any bottleneck in performance, apparently though there is none
[17:31:55] <cdanis>	 yeah I'm really happy about that, it's like a 75% increase in CPU usage from "very small" to "small"
[17:36:08] <wikibugs>	 Traffic, Privacy Engineering, Research, SRE, and 3 others: wikiworkshop.org has Facebook button, external statcounter, https to http redirect - https://phabricator.wikimedia.org/T251732 (BBlack) >>! In T251732#7892986, @bmansurov wrote: > 2. Resolve the https -> http redirect issue (who should lo...
[17:36:33] <cdanis>	 memory consumption doesn't look appreciable at all, either -- even in esams we're using < 50k entries in a table where each entry takes ~60 bytes
[17:36:38] <cdanis>	 so.. under 3MB?
[17:37:25] <cdanis>	 (that's the `newconnrate` table, which is exactly what it sounds like; there's also a smaller `misspassrate` table that is, as you might guess, approximately 20-25% of the size of the former one)