[00:57:31] Traffic, SRE: Upgrading Wikidough and durum VMs to bullseye - https://phabricator.wikimedia.org/T305589 (ssingh) >>! In T305589#7836863, @Dzahn wrote: > My 2 cents: Thanks for the feedback! > cookbook not worth it in this case, likely more work to create and debug it than the actual time savings with i...
[05:52:56] (HAProxyEdgeTrafficDrop) firing: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[05:57:56] (HAProxyEdgeTrafficDrop) resolved: 69% request drop in text@codfw during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=codfw&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[07:48:57] (EdgeTrafficDrop) firing: 39% request drop in text@esams during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=esams&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DEdgeTrafficDrop
[07:56:04] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp3050.esams.wmnet with OS buster
[08:03:57] (EdgeTrafficDrop) resolved: 0% request drop in text@esams during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=esams&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DEdgeTrafficDrop
[08:09:57] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6014.drmrs.wmnet with OS buster
[08:56:51] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp3050.esams.wmnet with OS buster com...
[09:07:14] Traffic, SRE: Upgrading Wikidough and durum VMs to bullseye - https://phabricator.wikimedia.org/T305589 (fgiunchedi) Thanks @ssingh for kickstarting the discussion! My two cents as an owner (with o11y) of some VMs that will need upgrading (grafana, logstash, etc): I think our strategy when it comes to l...
[09:16:11] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6014.drmrs.wmnet with OS buster com...
[09:33:48] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp3053.esams.wmnet with OS buster
[09:39:41] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6006.drmrs.wmnet with OS buster
[10:37:32] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6006.drmrs.wmnet with OS buster com...
[10:55:15] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp3053.esams.wmnet with OS buster com...
[11:01:20] Traffic, SRE, WMF-Legal, Performance-Team (Radar), Privacy: Consider disabling Chrome Lite pages for Wikipedia on Chrome on mobile with Cache-Control: no-transform - https://phabricator.wikimedia.org/T218618 (Nicholas_Perry) Hi all, we received some info from Google which may help inform this...
[11:34:29] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp3051.esams.wmnet with OS buster
[11:45:49] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6013.drmrs.wmnet with OS buster
[12:13:21] Traffic, SRE: Upgrading Wikidough and durum VMs to bullseye - https://phabricator.wikimedia.org/T305589 (Volans) >>! In T305589#7837526, @fgiunchedi wrote: > AIUI the decom cookbook doesn't support VMs yet (?) That's not actually correct, the decommission cookbook does support VMs since the start. What...
[12:30:25] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp3051.esams.wmnet with OS buster com...
[12:37:07] Traffic, SRE: Upgrading Wikidough and durum VMs to bullseye - https://phabricator.wikimedia.org/T305589 (fgiunchedi) >>! In T305589#7837933, @Volans wrote: >>>! In T305589#7837526, @fgiunchedi wrote: >> AIUI the decom cookbook doesn't support VMs yet (?) > > That's not actually correct, the decommission...
[12:52:15] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6013.drmrs.wmnet with OS buster com...
[12:54:59] !log pool cp6013 with HAProxy as TLS termination layer - T290005
[12:55:01] Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
[12:55:02] T290005: Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005
[13:12:18] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6005.drmrs.wmnet with OS buster
[13:21:01] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6012.drmrs.wmnet with OS buster
[13:51:25] Traffic, SRE: Upgrading Wikidough and durum VMs to bullseye - https://phabricator.wikimedia.org/T305589 (ssingh) Thanks for the feedback @fgiunchedi and @Volans! >>! In T305589#7837933, @Volans wrote: >>>! In T305589#7837526, @fgiunchedi wrote: >> AIUI the decom cookbook doesn't support VMs yet (?) > >...
[14:06:52] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6005.drmrs.wmnet with OS buster com...
[14:11:01] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6012.drmrs.wmnet with OS buster com...
[14:24:25] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6004.drmrs.wmnet with OS buster
[15:12:43] Traffic, SRE, Patch-For-Review, Performance-Team (Radar): Test haproxy as a WMF's CDN TLS terminator with real traffic - https://phabricator.wikimedia.org/T290005 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6004.drmrs.wmnet with OS buster com...
[17:22:04] Hi folks - I have a question on how our traffic is routed between our 3 caching layers in datacenters - I hope I'm at the right place to ask
[17:56:48] joal: Should be as good as any :)
[18:25:56] (HAProxyEdgeTrafficDrop) firing: 60% request drop in text@eqsin during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqsin&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[18:30:56] (HAProxyEdgeTrafficDrop) resolved: 65% request drop in text@eqsin during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqsin&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[18:57:20] My understanding is that requests are sent to our TLS layer trying to keep user on the same hosts, and that each of the TLS-layer hosts sends its requests to a single varnish-front-end host, to finally end up at the ATS-disk layer depending on the page requested - Am I about right?
[19:37:56] (HAProxyEdgeTrafficDrop) firing: 61% request drop in text@eqsin during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqsin&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[19:42:56] (HAProxyEdgeTrafficDrop) resolved: 64% request drop in text@eqsin during the past 30 minutes - https://wikitech.wikimedia.org/wiki/Monitoring/EdgeTrafficDrop - https://grafana.wikimedia.org/d/000000479/frontend-traffic?viewPanel=12&orgId=1&from=now-24h&to=now&var-site=eqsin&var-cache_type=text - https://alerts.wikimedia.org/?q=alertname%3DHAProxyEdgeTrafficDrop
[20:25:55] joal: that at least matches https://wikitech.wikimedia.org/wiki/Caching_overview.
[21:43:04] joal: I'm not 100% sure but I believe: LVS hashes based on client IP to an ats-tls or haproxy, then that goes to Varnish on the same machine, and then from there there's hashing based on the URL to pick an ATS-BE host
[21:46:24] yes that's basically-correct
[21:46:45] the final step (varnish -> ats-be hashing on the URL), though, is somewhat conditional
[21:47:12] in common cacheable cases that's true, but we also have some logic that tries to explicitly avoid chashing on that step for some uncacheable traffic
[21:47:22] (because chashing uncacheables can overload specific ats-be instances)
[21:57:25] netops, Infrastructure-Foundations, SRE: Configure cloudsw1-e4-eqiad and cloudsw1-f4-eqiad - https://phabricator.wikimedia.org/T304936 (cmooney)
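
Editor's sketch of the routing described in the 21:43 to 21:47 messages above: the client IP is hashed to pick an edge host for TLS termination, the request goes to Varnish on that same machine, and the URL is hashed to pick an ATS backend unless the request is uncacheable. This is only an illustration under stated assumptions, not the production logic. The host names, the `route` and `pick_by_hash` functions, and the use of rendezvous hashing are all hypothetical stand-ins for whatever LVS and Varnish actually do, and the random pick for uncacheable requests is an assumption: the log only says the real logic avoids chashing for some of that traffic.

```python
import hashlib
import random


def _score(key: str, node: str) -> int:
    """Deterministic pseudo-random weight for a (key, node) pair."""
    digest = hashlib.sha256(f"{node}|{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_by_hash(key: str, nodes: list[str]) -> str:
    """Rendezvous hashing, used here as a simple stand-in for consistent
    hashing ("chashing"): the node with the highest score wins."""
    return max(nodes, key=lambda node: _score(key, node))


def route(client_ip: str, url: str, cacheable: bool,
          edge_hosts: list[str], backend_hosts: list[str],
          rng=random) -> dict[str, str]:
    """Return the hypothetical hop taken at each caching layer."""
    # Step 1: hash on client IP to choose the TLS terminator host.
    tls_host = pick_by_hash(client_ip, edge_hosts)
    # Step 2: the Varnish frontend runs on the same machine.
    varnish_host = tls_host
    # Step 3: hash on URL for cacheable requests. For uncacheable ones,
    # skip the hash (ASSUMPTION: a random pick) so that one hot URL
    # cannot pile onto a single backend.
    if cacheable:
        backend = pick_by_hash(url, backend_hosts)
    else:
        backend = rng.choice(backend_hosts)
    return {"tls": tls_host, "varnish_fe": varnish_host, "ats_be": backend}


# Hypothetical host names, for illustration only.
EDGE_HOSTS = ["edge-a", "edge-b", "edge-c"]
BACKEND_HOSTS = ["be-a", "be-b", "be-c", "be-d"]
```

With this sketch, two different clients asking for the same cacheable URL land on the same backend, which is what makes the backend cache effective, while their TLS and Varnish hops depend only on their own IP addresses.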