[07:52:11] topranks, XioNoX question for you guys... 2002::/16 can be considered the same as 2002:0000::/16?
[07:52:43] we got a linter rejecting 2002::/16 with '2002::/16' covers illegal IPv4-like space
[07:53:16] but it looks to me that the error is due to 2002:0000::/16 being used for 6to4
[07:55:04] hmm yeah sounds like the linter is wrong alright
[07:55:07] yes they are the exact same
[07:55:12] https://www.irccloud.com/pastebin/vOe6U41r/
[07:55:17] thx
[07:55:22] fabfur, _joe_ ^^
[07:55:58] 👍 thanks
[07:56:19] which linter?
[08:00:18] volans: the one used by conftool
[08:00:57] probably because it's using some IPv6 class that's configured to reject 6to4 IP space
[08:15:57] Traffic: confd linter complains about invalid addresses - https://phabricator.wikimedia.org/T394474 (Fabfur) NEW
[08:18:08] <_joe_> vgutierrez: is it though?
[08:18:22] <_joe_> I thought it was haproxy complaining, not confd
[08:18:34] <_joe_> err varnish
[08:18:41] <_joe_> so the netmapper vmod
[08:18:46] <_joe_> which we wrote IIRC
[08:19:33] what’s the background here? we want to discard all traffic from that range?
[08:19:58] I’ll need to double check but it’s effectively a bogon range (protocol is dead too).
[08:20:11] so there is an argument it should be blocked at the core router edge
[08:20:20] topranks: the task I just opened ^^ (not so much context but we can elaborate on this)
[08:21:24] ok thanks
[08:22:19] it started as a linter error but we're now asking if it's ok to entirely exclude the whole prefix from the script that imports all cloud prefixes
[08:26:11] I think it is, let me look into it though
[08:26:32] I’ve a doctors appt I’ll follow up on the task later on
[08:26:46] That said probably the linter issue is worth fixing either way
[08:27:29] _joe_: nope.. is the linter configured to check the generated json file before deploying it
[08:54:34] Traffic, Data-Platform, Experimentation Lab, Data-Platform-SRE (2025.05.02 - 2025.05.23): Include all CDN SANs on eventgate-analytics-external.discovery.wmnet:4692 TLS certificate - https://phabricator.wikimedia.org/T394437#10828980 (Gehel)
[08:54:39] Traffic, Data-Platform, Experimentation Lab, Data-Platform-SRE (2025.05.02 - 2025.05.23): Include all CDN SANs on eventgate-analytics-external.discovery.wmnet:4692 TLS certificate - https://phabricator.wikimedia.org/T394437#10828981 (Gehel) p:Triage→Medium
[08:56:12] Traffic, Patch-For-Review: confd linter complains about invalid addresses - https://phabricator.wikimedia.org/T394474#10828992 (Fabfur) Open→In progress A fix that should prevent this specific case has been deployed with https://gerrit.wikimedia.org/r/c/operations/puppet/+/1146934
[09:14:05] Traffic, Patch-For-Review: confd linter complains about invalid addresses - https://phabricator.wikimedia.org/T394474#10829062 (Vgutierrez) In progress→Resolved
[09:21:57] Traffic, Patch-For-Review: confd linter complains about invalid addresses - https://phabricator.wikimedia.org/T394474#10829095 (cmooney) So this is basically a bogon range for us and we should be filtering it on the edge of our network. There is configuration in place at all our sites to do this, ho...
[09:22:12] Traffic, Patch-For-Review: confd linter complains about invalid addresses - https://phabricator.wikimedia.org/T394474#10829096 (cmooney) Resolved→Open
[09:26:34] thanks topranks for working on this! Do we want to keep our exclude list on the script that fetches the cloud prefixes as extra safe layer?
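For reference, the equivalence asked about at 07:52 can be checked with Python's standard `ipaddress` module. This is only a sketch: it is not the conftool linter or the netmapper vmod code, and `same_prefix` is a hypothetical helper name used for illustration. The `sixtofour` check at the end shows one way a library can recognise 6to4 space (RFC 3056); whether the linter does anything similar is the guess made at 08:00, not something confirmed in this log.

```python
import ipaddress


def same_prefix(a: str, b: str) -> bool:
    """Hypothetical helper: True if two textual prefixes denote the same network."""
    return ipaddress.ip_network(a) == ipaddress.ip_network(b)


# "2002::" is just the compressed spelling of "2002:0000::",
# so both strings parse to the same /16 network object.
short_form = ipaddress.ip_network("2002::/16")
long_form = ipaddress.ip_network("2002:0000::/16")
prefixes_match = short_form == long_form

# IPv6Address.sixtofour returns the IPv4 address embedded in a
# 2002::/16 (6to4) address, or None for any other address.
embedded_v4 = ipaddress.ip_address("2002:c000:0204::1").sixtofour
not_6to4 = ipaddress.ip_address("2001:db8::1").sixtofour
```

Here `prefixes_match` is `True`, which matches the "yes they are the exact same" answer at 07:55.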
[09:27:10] fabfur: np, good it came up seems we made a slight mistake blocking this when it was first done
[09:27:35] I don't think there is any need for us to have this filtered anywhere else, it will always be blocked at the core routers
[09:27:41] https://gerrit.wikimedia.org/r/1146941 definitely is out of the scope of that task BTW
[09:27:54] I would say do whatever is easiest
[09:28:22] vgutierrez: yeah I'm just being lazy hijacking it
[09:28:35] it's not a matter of filtering or not but if we don't exclude 2002::/16 it basically breaks our netmapper pipeline
[09:29:14] and I don't see any good reason to treat a 6to4 prefix as bad by default
[09:35:01] Traffic: Consider using a dedicated TLS certificate for upload.w.o - https://phabricator.wikimedia.org/T394484 (Vgutierrez) NEW
[09:37:43] vgutierrez: there seems to be some difference of opinion on whether it should properly be considered a bogon or not
[09:38:15] the anycast version of the protocol - as per RFC7526 - is officially deprecated
[09:38:28] but not the original range in RFC3056
[09:38:53] nevertheless from an operational perspective nobody uses RFC3056 and I think any traffic from it is more likely to be malicious/junk than anything else
[09:39:12] so I'm not inclined to change or config (which predates me being here) of blocking this range
[09:39:17] *our config
[09:41:19] Traffic: confd linter complains about invalid addresses - https://phabricator.wikimedia.org/T394474#10829176 (cmooney) Open→Resolved Ok I've rolled out the fix, the full range is now also blocked: ` cmooney@re0.cr1-esams> show route table inet6.0 2002::/16 terse active-path inet6.0: 218434 dest...
[09:55:14] Traffic: Consider using a dedicated TLS certificate for upload.w.o - https://phabricator.wikimedia.org/T394484#10829221 (Vgutierrez)
[10:49:08] vgutierrez b.black it's okay to start unsetting X-Experiment-Enrollments at the edge. eventgate-wikimedia has been deployed for the external analytics eventgate and shown to be working.
[10:49:43] dr0ptp4kt: could you give your +1 on https://gerrit.wikimedia.org/r/c/operations/puppet/+/1143608 :?
[10:50:16] thx <3
[10:51:25] done vgutierrez . heads up, i'll be at a data conference today, but will be opening laptop to look about on occasion at things for my ops week rotation and see if there's any motion on the tls cert SAN stuff. don't know that i'll get much time to do anything other than ops week though during those breaks
[10:51:28] thx!
[10:52:38] no problem
[10:53:35] Traffic, Experimentation Lab, Patch-For-Review: SDS 2.4.4 Edge Uniques Production Cookie Deployment - https://phabricator.wikimedia.org/T391411#10829410 (Vgutierrez)
[11:00:42] Traffic, Data-Platform, Experimentation Lab, Data-Platform-SRE (2025.05.02 - 2025.05.23): Include all CDN SANs on eventgate-analytics-external.discovery.wmnet:4692 TLS certificate - https://phabricator.wikimedia.org/T394437#10829429 (BTullis) p:Medium→High a:BTullis
[11:19:04] netops, Infrastructure-Foundations, Observability-Alerting, Patch-For-Review: Migrate network icinga alerts to gNMI/prometheus - https://phabricator.wikimedia.org/T388641#10829481 (cmooney) >>! In T388641#10628908, @gerritbot wrote: > Change #1127041 had a related patch set uploaded (by Ayounsi;...
[12:56:48] Traffic: Deb package for github.com/fabled/lua-maxminddb - https://phabricator.wikimedia.org/T394504 (Fabfur) NEW
[13:13:18] Traffic: Consider using a dedicated TLS certificate for upload.w.o - https://phabricator.wikimedia.org/T394484#10829851 (Vgutierrez) p:Triage→Medium
[15:10:05] Traffic, Data-Engineering, Infrastructure-Foundations: WMF-Last-Access-Global cookie set on wrong domain when accessing static assets - https://phabricator.wikimedia.org/T367346#10830355 (Milimetric) p:Low→High
[15:27:25] Traffic, Data-Engineering, Infrastructure-Foundations: WMF-Last-Access-Global cookie set on wrong domain when accessing static assets - https://phabricator.wikimedia.org/T367346#10830415 (mforns) Looking into this.
[15:36:19] Traffic, Data-Platform, Data-Platform-SRE (2025.05.02 - 2025.05.23), Experimentation Lab (Experiment Platform Sprint 6): Include all CDN SANs on eventgate-analytics-external.discovery.wmnet:4692 TLS certificate - https://phabricator.wikimedia.org/T394437#10830447 (dr0ptp4kt)
[15:41:45] Traffic, Data-Engineering, Infrastructure-Foundations: WMF-Last-Access-Global cookie set on wrong domain when accessing static assets - https://phabricator.wikimedia.org/T367346#10830467 (Vgutierrez) it looks like not only WMF-Last-Access-Global is impacted by this: ` vgutierrez@carrot:~$ curl -v "ht...
[15:44:19] Traffic, Data-Engineering, Infrastructure-Foundations: WMF-Last-Access-Global cookie set on wrong domain when accessing static assets - https://phabricator.wikimedia.org/T367346#10830494 (Vgutierrez) Varnish seems to be rewriting the host header from commons.wikimedia.org to en.wikipedia.org: ` - R...
[15:49:52] Traffic, Data-Engineering, Infrastructure-Foundations: WMF-Last-Access-Global cookie set on wrong domain when accessing static assets - https://phabricator.wikimedia.org/T367346#10830521 (Vgutierrez) this is triggered by the following VCL logic: ` # normalize all /static to the same hostname...
[15:49:54] Traffic, Data-Engineering, Infrastructure-Foundations: WMF-Last-Access-Global cookie set on wrong domain when accessing static assets - https://phabricator.wikimedia.org/T367346#10830522 (mforns) Thank you @Vgutierrez! It makes sense that the issue is not in the uniques code, since the Cookie request...
[15:51:03] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw: setup MPC10E-10C and SCBE3 - https://phabricator.wikimedia.org/T393552#10830526 (cmooney)
[15:56:40] Traffic, [Archived]Wikidata Dev Team, Prod-Kubernetes, SRE, and 3 others: Frequent 500 Errors and Timeouts When Adding Statements to New Item or Lexeme-typed Properties - https://phabricator.wikimedia.org/T374230#10830532 (Silvan_WMDE) \o/
[15:57:45] Traffic, Data-Engineering, Infrastructure-Foundations: WMF-Last-Access-Global cookie set on wrong domain when accessing static assets - https://phabricator.wikimedia.org/T367346#10830533 (mforns) > this is triggered by the following VCL logic: > ` > # normalize all /static to the same hostname for ca...
[15:59:04] Traffic, Data-Engineering, Infrastructure-Foundations: WMF-Last-Access-Global cookie set on wrong domain when accessing static assets - https://phabricator.wikimedia.org/T367346#10830535 (Vgutierrez) >>! In T367346#10830533, @mforns wrote: >> this is triggered by the following VCL logic: >> ` >> # no...
[16:09:30] netops, Infrastructure-Foundations, SRE: Homer: redefine IBGP clusters to support Unicast & EVPN - https://phabricator.wikimedia.org/T394530 (cmooney) NEW
[16:09:53] netops, Infrastructure-Foundations, SRE: Homer: redefine IBGP definitions to support both Unicast & EVPN clusters - https://phabricator.wikimedia.org/T394530#10830559 (cmooney)
[16:11:41] netops, Infrastructure-Foundations, SRE: Homer: redefine IBGP definitions to support both Unicast & EVPN clusters - https://phabricator.wikimedia.org/T394530#10830566 (cmooney) p:Triage→Medium
[16:13:46] Traffic, Data-Engineering, Infrastructure-Foundations: WMF-Last-Access-Global cookie set on wrong domain when accessing static assets - https://phabricator.wikimedia.org/T367346#10830573 (mforns) Oh, thanks a lot for finding this @Vgutierrez! So, we know the root cause. But it seems that, if we fixe...
[16:19:27] Traffic, Data-Engineering, Infrastructure-Foundations: WMF-Last-Access-Global cookie set on wrong domain when accessing static assets - https://phabricator.wikimedia.org/T367346#10830580 (Vgutierrez) >>! In T367346#10830573, @mforns wrote: > Oh, thanks a lot for finding this @Vgutierrez! > > So, we...
[17:02:41] netops, DC-Ops, Infrastructure-Foundations, ops-codfw, SRE: codfw: setup MPC10E-10C and SCBE3 - https://phabricator.wikimedia.org/T393552#10830688 (Papaul) All the steps looks good to me thanks.
[18:34:54] Traffic, Data-Engineering, Data-Engineering-Icebox, SRE, and 3 others: Requests for /static get an invalid WMF-Last-Access cookie for wikipedia.org on non-Wikipedia requests - https://phabricator.wikimedia.org/T261803#10831043 (Krinkle)
[18:44:55] Traffic, Developer Productivity: Skip pageview cookies on w.wiki domain to address "bounce tracker" warnings - https://phabricator.wikimedia.org/T394540 (Krinkle) NEW