[01:08:33] Traffic, MediaWiki-Stakeholders-Group, SRE, Wikipedia-Android-App-Backlog, Performance-Team (Radar): RFC: API-driven web front-end - https://phabricator.wikimedia.org/T111588 (Renoirb) This has been closed? Has an equivalent idea started under a different name?
[05:49:56] (EdgeTrafficDrop) firing: 67% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DEdgeTrafficDrop
[06:04:56] (EdgeTrafficDrop) resolved: 68% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DEdgeTrafficDrop
[08:48:37] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp1078.eqiad.wmnet with OS buster
[09:30:31] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp1078.eqiad.wmnet with OS buster com...
[10:00:47] Traffic, DC-Ops: cp1090.mgmt ssh port not accessible - https://phabricator.wikimedia.org/T304589 (MMandere)
[10:19:13] Traffic, DC-Ops, ops-eqiad: cp1090.mgmt ssh port not accessible - https://phabricator.wikimedia.org/T304589 (MMandere) p:Triage→Medium
[10:27:24] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp1076.eqiad.wmnet with OS buster
[11:10:13] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp1076.eqiad.wmnet with OS buster com...
[15:39:47] Traffic, Data-Engineering: Lock-in Varnish and VarnishKafka versions - https://phabricator.wikimedia.org/T304617 (odimitrijevic)
[15:52:53] Traffic, Data-Engineering, SRE: Lock-in Varnish and VarnishKafka versions - https://phabricator.wikimedia.org/T304617 (elukey) Adding some context for the Traffic team. There were two varnishkafka versions, one in the `main` component and one in `component/varnish6` of `buster-wikimedia` at the time...
[15:59:32] Traffic, Data-Engineering, SRE: Lock-in Varnish and VarnishKafka versions - https://phabricator.wikimedia.org/T304617 (BBlack) Thanks for making this ticket and adding those insights! I agree, there have been multiple times in the past that we've had problems in this area, and we should probably pup...
[16:28:15] XioNoX: any blocker for moving on with ES now? I had a power outage yesterday that blocked doing it at the "best" (lowest-traffic) time, but honestly it's probably not a huge diff either way.
[16:28:26] sorry, context, drmrs :)
[16:29:06] bblack: no blocker, but I have to step away in 10min
[16:29:27] ok
[16:29:52] I can't imagine just ES will saturate anything, but will keep an eye out in case!
[16:31:42] cool!
[16:44:26] all looks pretty normal/expected in the first few minutes, seeing a notable bump across all the transits, not much transport (cache hit rate still decent), L7 graphs show expected uptick, etc.
[16:55:31] almost a full gigabit of outbound transit now, ~880Mbps roughly by librenms graphs
[16:56:17] transport did eventually about double, but not bad at all
[16:58:23] seems to have reached peak-ish around 935Mbps transit out after the ES traffic ramped in and levelled off for a few mins
[16:58:35] (split across 3x transits)
[17:05:22] Traffic, SRE, envoy, serviceops, Patch-For-Review: Refactor envoy HTTP protocol options to new version - https://phabricator.wikimedia.org/T303230 (RLazarus) In progress→Resolved
[17:05:30] Traffic, SRE, envoy, serviceops, Patch-For-Review: Upgrade Envoy to supported version - https://phabricator.wikimedia.org/T300324 (RLazarus)
[17:45:48] Traffic, Data-Engineering-Radar, SRE: Lock-in Varnish and VarnishKafka versions - https://phabricator.wikimedia.org/T304617 (EChetty)
[18:08:32] Traffic, SRE, envoy, serviceops, Patch-For-Review: Refactor envoy HTTP protocol options to new version - https://phabricator.wikimedia.org/T303230 (RLazarus) Resolved→In progress
[20:07:30] Traffic, Data-Engineering-Radar, SRE: Lock-in Varnish and VarnishKafka versions - https://phabricator.wikimedia.org/T304617 (odimitrijevic)
[20:14:10] Traffic, Data-Engineering, Data-Engineering-Kanban: Spike: Investigate creating robust alerts to notify that caching nodes are not sending traffic data - https://phabricator.wikimedia.org/T304651 (odimitrijevic)
[21:02:06] Traffic, netops, Infrastructure-Foundations, SRE, Patch-For-Review: drmrs: initial geodns configuration - https://phabricator.wikimedia.org/T304089 (BBlack)