[13:21:14] good morning all and welcome back _joe_
[13:38:13] !oncall-now
[13:38:14] Oncall now for team SRE, rotation business_hours:
[13:38:14] b.black, E.mperor
[13:38:31] bblack, emperor: I'm depooling text@ulsfo before enabling IPIP encapsulation there
[13:40:23] thanks for the heads-up
[14:22:18] Scap is failing with `Location is not a git repo: '/srv/mediawiki-staging'`
[14:38:24] Sorted it.
[16:09:08] jhathaway: can I get a quick +1 for https://gerrit.wikimedia.org/r/c/operations/puppet/+/1038381 ? (You inherited the puppet 7 migration, right?)
[16:11:13] andrewbogott: looking...
[16:11:47] thanks. That's a too-late followup to https://gerrit.wikimedia.org/r/c/operations/puppet/+/1037812
[16:11:49] andrewbogott: +1ed
[16:13:08] thank you!
[16:24:01] jhathaway: one more dumb followup https://gerrit.wikimedia.org/r/c/operations/puppet/+/1038387
[16:30:18] andrewbogott: looking
[16:36:53] jhathaway: comment added
[17:05:45] jhathaway: > spf=softfail (google.com: domain of transitioning no-reply@phabricator.wikimedia.org does not designate 2620:0:861:102:10:64:16:101 as permitted sender)
[17:06:10] which I guess is expected because of f4372332f95021d3859dcd67fa3d021821c4f77e
[17:06:36] but I just wanted to share because of that, Gmail is now complaining about the phab email not being verified
[17:06:42] hmm is this a recent message sukhe?
[17:08:20] sukhe: can you download the message and forward it to me?
[17:08:27] jhathaway: yeah in the recent one hour
[17:09:11] also 2620:0:861:102:10:64:16:101 is not in _cidrs and the v6 subnets
[17:10:06] yep, sure. let me do it
[17:11:17] sukhe: you are correct, which was a miss on my behalf
[17:11:23] actually I think it should have been 2620:0:861:100::/56
[17:11:34] or the larger /48 but I think that's not nice since it might include some subnets we don't want
[17:14:54] sukhe: I clearly didn't look into the public vs private vlan allocations enough for ipv6, I was just looking at the netbox data
[17:15:11] any suggestions on how to obtain a better understanding?
[17:18:58] jhathaway: I think it gets a bit tricky because of how we treat cloud hosts (in my understanding) but looking at your change
[17:19:02] we had ip6:2620:0:860::/46
[17:19:10] which we replaced with ip6:2620:0:860::/56 ip6:2620:0:861::/56
[17:19:19] was the idea to make it more restrictive?
[17:19:24] yes
[17:21:10] https://netbox.wikimedia.org/ipam/prefixes/281/prefixes/
[17:21:36] hmm thinking
[17:23:43] the question I guess is if like you say in commit message that we do want WMCS to send email or not (I guess a separate question if they are actually sending?)
[17:25:00] I don't believe they send mail from the wikimedia.org domain
[17:25:31] so I was trying to prevent using WMCS hosts for sending wikimedia.org email
[17:25:40] yeah, makes sense
[17:26:50] the question is how to divide it though in this. this definitely calls for netops I think; see https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1020202 on which I got some feedback from Cathal when trying to do this for mediawiki-config
[17:28:51] yup that is helpful
[17:29:05] though if we look at it, 2620:0:861:118::/64 which is cloud-hosts1-eqiad is a subnet of 2620:0:860::/46, which is what we had before
[17:29:09] but that's not desirable clearly
[17:30:39] which is why even in mediawiki-config, we had to lay them out separately
[17:31:39] what does private in the vlan mean in this context, why is phab's ip in private1-b-eqiad?
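The prefix discussion above can be checked mechanically. The following is a minimal sketch using Python's standard `ipaddress` module; the addresses and prefixes are copied from the log, while the names `SENDER`, `CLOUD`, `CANDIDATES` and `coverage` are made up for illustration. It reports, for each candidate SPF prefix, whether it covers phab1004's address and whether it also swallows the cloud-hosts1-eqiad subnet.

```python
import ipaddress

# phab1004's address, as reported in the spf=softfail line in the log.
SENDER = ipaddress.ip_address("2620:0:861:102:10:64:16:101")
# cloud-hosts1-eqiad, as named in the log.
CLOUD = ipaddress.ip_network("2620:0:861:118::/64")

# Prefixes mentioned in the discussion (old record, new record, proposed fix).
CANDIDATES = [
    "2620:0:860::/46",
    "2620:0:860::/56",
    "2620:0:861::/56",
    "2620:0:861:100::/56",
]


def coverage(prefixes, sender=SENDER, cloud=CLOUD):
    """Map each prefix to (covers sender?, contains cloud subnet?)."""
    result = {}
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix)
        result[prefix] = (sender in net, cloud.subnet_of(net))
    return result


if __name__ == "__main__":
    for prefix, (has_sender, has_cloud) in coverage(CANDIDATES).items():
        print(f"{prefix:22} sender={has_sender!s:5} cloud-hosts={has_cloud}")
```

Run as a script, this agrees with the log: neither of the two /56 prefixes in the new record covers phab1004, while the old /46 covers both phab1004 and the cloud subnet. Purely as arithmetic, 2620:0:861:100::/56 covers phab1004 but also contains 2620:0:861:118::/64, which may matter if the goal is to keep cloud hosts out of the record.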
[17:31:42] jhathaway: maybe how about we specify the /46, ip6:2620:0:860::/46 so that we don't have to keep track of the phab IPs but exclude the cloud VLANs with -ip4 and -ip6?
[17:32:21] jhathaway: basically because it's phab1004.eqiad.wmnet
[17:32:42] the actual host Received: from phab1004.eqiad.wmnet ([2620:0:861:102:10:64:16:101]:45715) by mx1001.wikimedia.org with esmtp (Exim 4.94.2)
[17:32:47] right
[17:33:21] I think then the first question is why google is looking at the transitioning server
[17:33:47] I wasn't aware they would do that
[17:34:33] I wonder if the result is the same if the email is sent to another domain, rather than wikimedia.org, it may have something to do with a setting on gmail's side
[17:38:15] jhathaway: not sure but spitballing here: maybe because we still set ~all and not -all and it still thinks we are under the transition?
[17:38:20] sukhe: I can confirm that when sending to my test external gmail address, spf passes
[17:39:23] jhathaway: interesting
[17:39:33] where did you send the email from?
[17:39:52] I sent a test message with swaks from phab1004
[17:39:58] ah ok
[17:40:05] swaks -s localhost -f no-reply@phabricator.wikimedia.org -t akathelollipopman@gmail.com -b 'ticket #4'
[17:44:40] jhathaway: not that I know anything about this but to me it seems the soft-fail for ~all and the IP not being in the subnet is the issue here. maybe if we can fix that and then revert to -all it should be fine?
[17:44:48] that still doesn't explain though why the email to your address works :)
[17:47:17] My understanding is that SPF should only check the last hop ip address, because that address is verified by the tcp/ip connection on the inbound email server, any other address could be trivially spoofed.
[17:47:41] However gmail is checking phab's address, even though it was not the last hop, which is strange
[17:51:58] jhathaway: I found this, not sure how relevant https://support.google.com/a/answer/60730?hl=en
[17:52:06] > How Gmail determines the source IP address: Gmail scans Received: from message headers to find the first public IP address that’s not in the Gateway IP list. Gmail treats this IP address as the source IP for the message.
[17:52:10] > When this option is off, Gmail checks only one hop backwards for the sending IP address.
[17:53:26] sukhe: thanks, I think that is the issue
[17:54:06] is it possible the difference between you sending a test mail from phab1004 and actual phabricator sending mail is that there is this intermediary transport script
[17:54:08] which I guess explains why it sees phab1004
[17:54:21] /usr/local/bin/phab_epipe.py
[17:56:08] mutante: I don't think so, if I send an email to my wikimedia.org address, using swaks, I get the softfail as well
[17:56:23] ack, nevermind then
[17:56:46] thanks though
[17:56:58] does it mean it's now a question for ITS?
[17:57:37] well once we cut back over to the new outbound postfix servers, I think this issue goes away, since the flow will be:
[17:58:02] phab1004 -> mx-out1001 -> mx1001 -> gmail, and mx-out1001 will pass spf
[17:58:15] so that is another route to fixing the issue
[17:58:17] ah, got it!
[18:00:02] thanks for the help, stepping away for lunch, then I'll prioritize getting phab going through mx-out again
[18:00:33] jhathaway: <3
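The behaviour described in the quoted Google help text can be sketched roughly as below. This is only an illustration of the two quoted sentences, not Gmail's implementation. The function name `spf_source_ip` is invented, treating an empty gateway list as "option off" is an assumption, and `MX1001` is a HYPOTHETICAL placeholder address, since the log does not give mx1001's IP. Only phab1004's address comes from the log.

```python
import ipaddress

# HYPOTHETICAL placeholder for mx1001's address (not given in the log).
MX1001 = "2620:0:861:1::25"
# phab1004's address, from the Received: header quoted in the log.
PHAB1004 = "2620:0:861:102:10:64:16:101"


def spf_source_ip(hops, gateways=()):
    """Pick the IP that SPF would be evaluated against.

    hops: sending IPs taken from Received: headers, most recent hop first.
    gateways: the receiver's configured gateway IPs (assumption: an empty
    list stands for the option being off).
    """
    gateway_addrs = {ipaddress.ip_address(g) for g in gateways}
    if not gateway_addrs:
        # Option off: only the directly connecting hop is considered.
        return hops[0]
    for hop in hops:
        addr = ipaddress.ip_address(hop)
        # First public address that is not a listed gateway.
        if addr.is_global and addr not in gateway_addrs:
            return hop
    return None


if __name__ == "__main__":
    chain = [MX1001, PHAB1004]
    print(spf_source_ip(chain))                     # last hop only
    print(spf_source_ip(chain, gateways=[MX1001]))  # skips the gateway
```

If the wikimedia.org mail domain lists mx1001 as a gateway, this would pick phab1004 as the source, which is consistent with the softfail seen there and with the pass on the external Gmail address, where no gateway list applies.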