[13:25:35] Traffic, SRE: haproxy: work on systemd unit hardening (cp hosts) - https://phabricator.wikimedia.org/T323944 (Vgutierrez) In progress→Resolved a:ssingh
[13:34:02] Traffic, SRE: haproxy: work on systemd unit hardening (cp hosts) - https://phabricator.wikimedia.org/T323944 (ssingh) Thanks to @Vgutierrez for taking care of the rollout of this. For posterity, the final result for now before we do more enhancements: ` ===== NODE GROUP =====...
[13:56:50] Traffic, API Platform, SRE: Block non-browser requests that use generic user agent (UA) headers - https://phabricator.wikimedia.org/T319423 (daniel) >>! In T319423#8385567, @Joe wrote: > FWIW we're banning more generic UAs via dynamic requestctl rules; our rule of thumb is to start rate-limiting requ...
[13:58:30] Traffic, SRE: Deploy Wikidough: Experimental DNS-over-HTTPS (DoH) and DNS-over-TLS (DoT) public resolver - https://phabricator.wikimedia.org/T252132 (Elitre)
[15:08:09] Traffic, SRE: Deprecating the dns::auth role and moving authdns[12]001 to dns[12]001. - https://phabricator.wikimedia.org/T330670 (ssingh) @ayounsi, @cmooney: Quick question about Junos OS: so we are planning to spread `ns0` over `dns100[123]` and `ns1` over `dns200[123]`, similar to how we are doing wit...
[15:16:11] Traffic, Data-Engineering, Data-Persistence, Discovery-Search, and 7 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (ayounsi)
[15:16:28] netops, Infrastructure-Foundations, SRE: eqiad/codfw virtual-chassis upgrades - https://phabricator.wikimedia.org/T327248 (ayounsi)
[15:17:15] Traffic, DBA, Data-Engineering, Discovery-Search, and 7 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (Marostegui)
[15:20:31] Traffic, DBA, Data-Engineering, Discovery-Search, and 7 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (Marostegui)
[15:21:54] Traffic, DBA, Data-Engineering, Discovery-Search, and 7 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (Marostegui) Moved dbproxy1018 as it belongs to #cloud-services-team
[15:22:14] Traffic, DBA, Data-Engineering, Discovery-Search, and 7 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (Marostegui)
[15:22:32] Traffic, DBA, Data-Engineering, Discovery-Search, and 7 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (Marostegui)
[15:22:57] Traffic, DBA, Data-Engineering, Discovery-Search, and 7 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (MoritzMuehlenhoff)
[15:24:29] Traffic, DBA, Data-Engineering, Discovery-Search, and 7 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (Marostegui)
[15:44:24] Traffic, netops, DC-Ops, Infrastructure-Foundations, and 2 others: Q4/Q1:knams racking elevations & planning - https://phabricator.wikimedia.org/T331886 (RobH)
[15:51:09] Traffic, SRE: Deprecating the dns::auth role and moving authdns[12]001 to dns[12]001. - https://phabricator.wikimedia.org/T330670 (ayounsi) Yep that should be enough as both hosts are directly reachable by the router (they're in row A/B/D). We will need to look closely at them the day they're behind L3 s...
[15:55:24] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw1-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (ayounsi) I noticed that it's running `19.1R3-S2.3` we should upgrade it to latest Junos recommended bef...
[15:58:11] netops, Infrastructure-Foundations, SRE, cloud-services-team (FY2022/2023-Q3): Configure cloudsw1-b1-codfw and migrate cloud hosts in codfw B1 to it - https://phabricator.wikimedia.org/T327919 (cmooney) >>! In T327919#8688051, @ayounsi wrote: > I noticed that it's running `19.1R3-S2.3` we should...
[16:26:12] Traffic, DBA, Data-Engineering, Infrastructure-Foundations, and 8 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (MPhamWMF)
[16:36:48] Traffic, netops, DC-Ops, Infrastructure-Foundations, and 2 others: Q4/Q1:knams racking elevations & planning - https://phabricator.wikimedia.org/T331886 (RobH)
[16:52:57] hello, I've a weird question for you :)
[16:54:54] investigating the Sat. night page I noticed that in some cases we have the uri_path field in webrequest_sampled_128 that doesn't start with slash '/' and has weird values, that honestly look like they come from other fields (like uri_host, http version, etc.). Moreover if I try those uri_path they of course give me 404, but they are instead logged as 301, 200, etc... like normal requests
[16:55:24] is it possible that for some weird reason the logged fields are "mismatched"? Like reading in the wrong place in a memory map or something like that
[16:55:49] See some examples here: https://w.wiki/6SJz
[16:58:28] I found them also on centrallog: grep 'uri_path":"[^/]' /srv/weblog/webrequest/archive/sampled-1000.json-20230313 | less
[16:58:48] Traffic, netops, DC-Ops, Infrastructure-Foundations, and 2 others: Q4/Q1:knams racking elevations & planning - https://phabricator.wikimedia.org/T331886 (ayounsi) Note that we will only have one mx480 (current cr3-esams), the other router will be the current cr3-knams (mx204). Future refresh wil...
[16:58:59] * grep 'uri_path":"[^/]' /srv/weblog/webrequest/sampled-1000.json | less
[16:59:06] for the generic one, I was looking at a specific date and time
[16:59:18] in the varnishkafka config we use %{@uri_path}U, that corresponds to
[16:59:19] https://github.com/wikimedia/operations-software-varnish-varnishkafka/blob/master/varnishkafka.c#L888-L891
[16:59:38] so we read from Varnish's SLT_ReqURL field
[17:00:17] and we parse it with https://github.com/wikimedia/operations-software-varnish-varnishkafka/blob/master/varnishkafka.c#L524
[17:01:31] <3 elukey
[17:02:42] Traffic, SRE: Deprecating the dns::auth role and moving authdns[12]001 to dns[12]001. - https://phabricator.wikimedia.org/T330670 (BBlack) Resilient hashing indeed sounds much better (it seems like that's their codeword for some internal "consistent hashing" implementation), but it doesn't look like our...
[17:04:30] it seems to me that the normal rate is very low, and then we have very high spikes: https://w.wiki/6SKA
[17:04:38] like it happens only under certain circumstances
[17:04:55] Traffic, netops, DC-Ops, Infrastructure-Foundations, and 2 others: Q4/Q1:knams racking elevations & planning - https://phabricator.wikimedia.org/T331886 (RobH)
[17:06:37] should I open a task? is that known?
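Editor's note: the grep above can be approximated with a small script when the matching records need to be inspected as parsed JSON rather than raw text. This is only a minimal sketch, assuming the sampled log is one JSON object per line with a string `uri_path` field, as the grep pattern suggests. The function name `suspicious_records` and the sample lines are hypothetical and made up for illustration; they are not real log data.

```python
import json
from typing import Iterable, Iterator


def suspicious_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed records whose uri_path is non-empty and does not start with '/'.

    Like the grep pattern 'uri_path":"[^/]', an empty uri_path is not flagged.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue  # skip JSON values that are not objects
        path = record.get("uri_path")
        if isinstance(path, str) and path and not path.startswith("/"):
            yield record


# Made-up sample lines for illustration only (assumption, not real log data).
SAMPLE = [
    '{"uri_host": "en.wikipedia.org", "uri_path": "/wiki/Main_Page"}',
    '{"uri_host": "en.wikipedia.org", "uri_path": "HTTP/1.1"}',
    "not json",
]

flagged = list(suspicious_records(SAMPLE))
for record in flagged:
    print(record["uri_path"])
```

To run it against a real file, pass an open file handle instead of `SAMPLE`, since iterating a text file yields one line at a time.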
[17:07:51] Traffic, netops, DC-Ops, Infrastructure-Foundations, and 2 others: Q4/Q1:knams racking elevations & planning - https://phabricator.wikimedia.org/T331886 (RobH) >>! In T331886#8688402, @ayounsi wrote: > Note that we will only have one mx480 (current cr3-esams), the other router will be the current...
[17:11:11] Traffic, SRE: Deprecating the dns::auth role and moving authdns[12]001 to dns[12]001. - https://phabricator.wikimedia.org/T330670 (ayounsi) Good point! Looks like it's only for switches, not common! Compatible with the routers, there is `load-balance consistent-hash` but only for BGP peers: > (BGP only)...
[17:16:53] Traffic, netops, DC-Ops, Infrastructure-Foundations, and 2 others: Q4/Q1:knams racking elevations & planning - https://phabricator.wikimedia.org/T331886 (ayounsi) > In reviewing the contract, its "Precabling/Patch Panels : Fiber – 6 Ports" so it's a bundle and likely has to terminate in the same...
[19:24:17] Traffic, DBA, Data-Engineering, Infrastructure-Foundations, and 8 others: eqiad row C switches upgrade - https://phabricator.wikimedia.org/T331882 (colewhite)