[14:18:33] I'm running into what feels like a firewall or security group issue, but I'm not seeing what's wrong in the config.
[14:18:46] `curl deployment-webperf21.deployment-prep.eqiad1.wikimedia.cloud` works from shell on deployment-cache-text06
[14:19:11] but fails from shell on deployment-mediawiki12
[14:19:30] curl -v shows:
[14:19:32] * Expire in 0 ms for 1 (transfer 0x5637938bb110)
[14:19:32] * Expire in 0 ms for 1 (transfer 0x5637938bb110)
[14:19:32] * Trying 172.16.6.76...
[14:19:32] * TCP_NODELAY set
[14:19:32] * Expire in 200 ms for 4 (transfer 0x5637938bb110)
[14:19:35] (for the failing one)
[14:19:57] it hangs indefinitely
[14:20:13] where it works:
[14:20:32] * Expire in 0 ms for 1 (transfer 0x55bb1f10f110)
[14:20:32] * Trying 172.16.6.76...
[14:20:32] * TCP_NODELAY set
[14:20:32] * Expire in 200 ms for 4 (transfer 0x55bb1f10f110)
[14:20:32] * Connected to deployment-webperf21.deployment-prep.eqiad1.wikimedia.cloud (172.16.6.76) port 80 (#0)
[14:20:33] > HEAD / HTTP/1.1
[14:20:33] …
[14:20:34] HTTP/1.1 200 OK
[14:21:32] webperf has security group "web" which `ALLOW IPv4 80/tcp from 0.0.0.0/0` - seems like that ought to do it
[14:24:12] It fails from deployment-puppetmaster04 as well
[15:27:49] !log tools.lexeme-forms deployed 934f5cffdb (Yoruba adjectives)
[15:27:53] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[15:39:49] Krinkle: in addition to network-level security groups, that instance has ferm/puppet-managed iptables rules restricting access to port 80
[16:01:56] taavi: ah, right. and the cert it uses is not for its fqdn but for performance.eqiad.wmnet
[16:02:19] which makes it non-trivial to use within beta since that doesn't proxy to or resolve within labs
[16:02:30] although in prod we can use that reach that HTTPS internally
[16:02:46] performance.discovery.wmnet*
[16:04:37] that seems to come from https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/refs/heads/master/deployment-prep/deployment-webperf21.deployment-prep.eqiad1.wikimedia.cloud.yaml#2, changing that should fix that issue
[16:44:11] ack, thanks!
[16:46:18] ah, I did see the ferm rule for port 80 in webperf/site.pp, I glossed right over the srange => '$CACHES', limitation
[16:46:22] that makes perfect sense
[16:47:12] I guess we never added that for 443, so this effectively is a beta-only thing now
[16:47:15] * Krinkle checks
[16:48:05] yeah, prod mwdebug1002 reaches httpS://performance.discovery.wmnet fine
[16:51:39] * Krinkle changes to '%{facts.fqdn}' and moves to Horizon's prefix puppet
[16:55:04] !log deployment-prep Fix profile::tlsproxy::envoy::global_cert_name in Horizon for webperf host to use '%{facts.fqdn}' instead of performance.discovery.wmnet as the latter doesn't resolve / would be an invalid cert for https://deployment-webperf21, ref T291015
[16:55:11] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Deployment-prep/SAL
[16:55:11] T291015: Add per-request flamegraph option to WikimediaDebug - https://phabricator.wikimedia.org/T291015
[18:00:33] !log tools.lexeme-forms deployed 75af96b851 (Punjabi adjectives)
[18:00:37] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL
[18:41:16] !log tools.lexeme-forms deployed 0f96d60736 (Punjabi adverbs)
[18:41:19] Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.lexeme-forms/SAL