[09:00:54] vgutierrez: Do you have some time to advise on https://gerrit.wikimedia.org/r/c/operations/puppet/+/965056 ? We want to send wikifunctions to its own mw-on-k8s deployment and have a few questions wrt adding path normalization, catching api calls on top of the general match, etc. (cc jayme)
[09:04:13] * vgutierrez looking
[09:05:32] <3
[09:08:57] claime: ok.. so where are the questions? :)
[09:09:14] CR itself looks good.. DNS records aren't there yet I guess
[09:09:23] and it should be type: map rather than regex_map
[09:09:43] mentioned in the CR itself just to not forget about it
[09:09:43] vgutierrez: Do we need specific http://wikifunctions.org/w/api.php and http://wikifunctions.org/w/rest.php
[09:10:07] you wanna send those to a specific deployment?
[09:10:13] or the generic is ok?
[09:10:14] To that same specific deployment
[09:10:21] so yes.. you need to catch that
[09:10:46] ok, so what was done for wikidata for instance, would have just caught regular traffic
[09:10:49] Not api calls
[09:11:00] that's based on the order
[09:11:03] Aaaah
[09:11:18] wikidata.org is defined after api.php and rest.php
[09:11:46] if you move your definition before api.php and rest.php it should catch those as well
[09:11:47] Last match wins?
[09:11:53] first match wins
[09:11:53] ok
[09:12:45] so if you move your wikifunctions.org definition on top of rest.php and api.php it should get those requests too
[09:12:46] so it's actually ok for wikifunctions, it'll catch api.php and rest.php calls where it is
[09:12:53] Yeah it is before
[09:12:57] uh?
[09:12:58] wikidata isn't
[09:13:06] no it isn't
[09:13:11] it is after wikidata.org
[09:13:17] Uh
[09:13:17] at least on the CR that you sent me
[09:13:28] Ok I was looking at my file, must have moved it
[09:13:35] :)
[09:13:38] sorry
[09:14:28] What about the path normalization stuff? Should that be added as well?
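The "first match wins" point from the exchange above can be illustrated with a small sketch. This is not the ATS remap configuration itself: the rule table, the backend names, and the use of regular expressions are all hypothetical and only serve to show why a host-wide rule has to sit above the generic api.php and rest.php rules to catch those requests.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical rule table: (regex over "host/path", backend name).
# Patterns and backend names are illustrative assumptions, not the real config.
RULES: List[Tuple[str, str]] = [
    (r"^wikifunctions\.org/", "mw-wikifunctions"),  # placed above the API rules on purpose
    (r"^[^/]+/w/api\.php", "mw-api"),
    (r"^[^/]+/w/rest\.php", "mw-api"),
    (r"^wikidata\.org/", "mw-wikidata"),            # placed below the API rules
]


def route(host: str, path: str, rules: List[Tuple[str, str]] = RULES) -> Optional[str]:
    """Return the backend of the first rule that matches, or None if none does."""
    target = f"{host}{path}"
    for pattern, backend in rules:
        if re.search(pattern, target):
            return backend  # first match wins: later rules are never consulted
    return None
```

With this ordering, `route("wikifunctions.org", "/w/api.php")` is handled by the host-wide rule, while `route("wikidata.org", "/w/api.php")` falls through to the generic API rule because the wikidata.org rule comes later.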
[09:14:58] please do add a specific comment mentioning that's on top of (api|rest).php rules on purpose
[09:15:06] ack
[09:15:19] @pparam=/etc/trafficserver/lua/rb-mw-mangling.lua :? yes
[09:15:43] '@pparam=/etc/trafficserver/lua/normalize-path.lua'
[09:15:49] or it will lose X-Subdomain support
[09:16:42] yep
[09:16:45] that's needed as well
[09:17:10] ack
[09:33:40] claime: not like we are getting tons for PURGE requests against api.php but technically it could be an issue
[09:33:46] s/for/of/
[09:34:04] yeah, adding another map block, np
[09:55:26] vgutierrez: AFAICT katran-test is assigned only in eqiad in netbox, I think you should also create the codfw one and reserve it, to avoid people picking it and breaking the symmetry
[09:55:45] :?
[09:56:22] https://netbox.wikimedia.org/ipam/ip-addresses/13774/
[09:56:37] yeah.. I'm aware of that
[09:56:39] there is no 10.2.1.89/32
[09:56:43] and what's the problem with that?
[09:56:58] we always assign them in pairs, to ensure that all services have the same byte in both DCs
[09:57:01] it's a testing environment that shouldn't be replicated in codfw
[09:57:12] and it's gonna be destroyed at some point
[09:57:22] like https://netbox.wikimedia.org/ipam/ip-addresses/10435/
[09:57:43] you don't have to use it, just mark as reserved
[09:58:02] this doesn't make a lot of sense
[09:58:19] if we have such strong rules about this I shouldn't be allowed to allocate it only in eqiad
[10:00:09] that will probably be possible when T270071 is solved
[10:00:10] T270071: SVC DNS zonefiles and source of truth - https://phabricator.wikimedia.org/T270071
[10:02:23] we could have a create SVC VIP cookbook that does the netbox change picking the first free one and running the dns cookbook
[10:25:14] fwiw it is part of the docs ;) "If this is a VIP, make sure you get the same last octect in both eqiad and codfw datacentres" https://wikitech.wikimedia.org/wiki/DNS/Netbox#How_to_manually_allocate_a_special_purpose_IP_address_in_Netbox
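The "same last octet in both DCs" rule quoted from the docs could be checked mechanically along these lines. This is only a sketch: the helper names are hypothetical, it assumes IPv4 VIPs, and the eqiad address in the usage note is an illustrative value rather than one taken from the log or from Netbox.

```python
import ipaddress


def last_octet(addr: str) -> int:
    """Return the last octet of an IPv4 address, given with or without a prefix length."""
    ip = ipaddress.ip_interface(addr).ip
    if ip.version != 4:
        raise ValueError(f"expected an IPv4 address, got {addr!r}")
    return ip.packed[-1]


def same_last_octet(eqiad_vip: str, codfw_vip: str) -> bool:
    """True when both VIPs end in the same byte, as the allocation docs ask for."""
    return last_octet(eqiad_vip) == last_octet(codfw_vip)
```

For example, `same_last_octet("10.2.2.89/32", "10.2.1.89/32")` holds, while pairing the first address with `10.2.1.90/32` does not. A "create SVC VIP" cookbook like the one floated above could run a check of this kind before reserving both addresses.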
[11:14:09] netops, Infrastructure-Foundations, SRE: Change EPVN RR setup to use single BGP group and different cluster ID on every RR - https://phabricator.wikimedia.org/T348583 (cmooney) Open→Resolved Changes pushed to production, closing task.
[11:15:04] vgutierrez: FYI I've created the codfw one in netbox and will send a patch to the dns repo to fix missing commented lines ( https://netbox.wikimedia.org/ipam/ip-addresses/15096/ )
[11:21:15] cheers
[11:39:30] claime: no need for a specific wikifunctions.org rest.php rule
[11:39:44] claime: unless you're adding it for clarity's sake
[11:40:11] vgutierrez: Why? It doesn't decode/encode the same characters as the "normal" wikifunctions rule
[11:41:22] hmm true
[11:41:28] 0x27 (single quote) is also needed for rest.php
[11:41:30] nevermind
[11:41:50] I thought the same as you, but j.ayme pointed out the difference
[11:42:25] he deserves some extra beers today
[11:42:35] Totally does.
[13:43:48] Traffic, Abstract Wikipedia team, SRE, Wikifunctions, and 2 others: Separate deployment for wikifunctions.org - https://phabricator.wikimedia.org/T347544 (Jdforrester-WMF)
[14:23:07] Traffic, Abstract Wikipedia team, SRE, Wikifunctions, and 2 others: Separate deployment for wikifunctions.org - https://phabricator.wikimedia.org/T347544 (JMeybohm)
[14:38:09] vgutierrez: I would like to do the LVS dance for the wikifunctions stuff from earlier (https://gerrit.wikimedia.org/r/c/operations/puppet/+/965175/) is now a good time?
[14:38:20] * vgutierrez checking
[14:39:41] jayme: hit it :)
[14:40:38] Once that dance is done I have yet another aqs2 service (third last one!) I'd like to switch over if it suits.
https://gerrit.wikimedia.org/r/c/operations/puppet/+/964946
[14:40:54] The internal URLs look like https://rest-gateway.discovery.wmnet:4113/wikimedia.org/v1/metrics/pageviews/per-article/en.wikipedia/all-access/all-agents/Albert_Einstein/daily/2015100100/2015103100 and https://rest-gateway.discovery.wmnet:4113/wikimedia.org/v1/metrics/legacy/pagecounts/aggregate/en.wikipedia.org/all-sites/monthly/2014010100/2014020100
[14:41:05] both require host to be wikimedia.org
[14:41:05] that's ATS stuff :)
[14:41:18] yeah but figured it'd be best to avoid intersecting
[14:41:18] fabfur: are you around?
[14:41:33] yep
[14:41:49] fabfur: do you want to take care of the ATS change?
[14:42:02] sure
[14:42:08] cheers 🍻
[14:42:52] * fabfur checking
[14:42:53] Traffic, MW-on-K8s, SRE, serviceops, Release-Engineering-Team (Seen): Serve production traffic via Kubernetes - https://phabricator.wikimedia.org/T290536 (Clement_Goubert)
[14:43:03] Traffic, MW-on-K8s, SRE, serviceops, and 2 others: Move 25% of mediawiki external requests to mw on k8s - https://phabricator.wikimedia.org/T348122 (Clement_Goubert) Open→In progress
[15:01:50] we're good to go, I'm going to do the usual cp2037 dance
[15:05:59] ack
[15:18:18] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host ncredir5001.eqsin.wmnet with OS bookworm
[15:25:00] cp2037 looks okay, I'm going to gradually enable puppet and roll out the change
[15:28:32] ack
[15:33:16] 👍
[15:38:42] fabfur: we're a proud ASCII team
[15:38:43] Traffic, Abstract Wikipedia team, SRE, Wikifunctions, and 2 others: Separate deployment for wikifunctions.org - https://phabricator.wikimedia.org/T347544 (JMeybohm)
[15:47:52] still slowly reenabling puppet, everything looking okay
[15:48:23] hnowlan: lovely
[15:48:57] thumb up!
[15:48:59] :D
[15:49:19] I have issues resolving a newly created dnsdisc record (mw-wikifunctions.discovery.wmnet) consistently ... I think I missed something but I'm not sure what
[15:50:32] ref: https://gerrit.wikimedia.org/r/c/operations/dns/+/965065 the geoip one works well afaict
[15:51:31] but for mw-wikifunctions.discovery.wmnet I get NXDOMAIN
[15:54:07] ignore me, that was already cached
[16:06:50] A:cp puppet runs done, thanks for the help!
[16:29:23] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host ncredir5001.eqsin.wmnet with OS bookworm completed: - ncredir5001 (**PASS**) - Removed from Pup...
[16:52:58] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (BCornwall)
[16:59:47] hnowlan: hi!
[16:59:55] cp2030 is depooled for cdn. can we repool it?
[17:00:59] or were you working on just 2037?
[17:03:51] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host ncredir3004.esams.wmnet with OS bookworm
[17:26:09] sukhe: I only touched 2037
[17:26:35] hnowlan: thanks and sorry then :)
[17:26:40] no action required from your side
[17:35:36] Traffic, DNS, SRE: Update DNS records for Greenhouse - https://phabricator.wikimedia.org/T348335 (Lhiraide) Hi @NMariano-WMF that would be great! Thank you all so much for your help!
[17:41:59] Traffic, DNS, SRE: Update DNS records for Greenhouse - https://phabricator.wikimedia.org/T348335 (NMariano-WMF) Hi @Lhiraide and @ssingh, I sent out an invite tomorrow. I don't think we'll need the full time for the meeting, but wanted to be safe just in case we did. Let me know if that time doesn't...
[17:55:59] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host ncredir3004.esams.wmnet with OS bookworm completed: - ncredir3004 (**WARN**) - Downtimed on Ici...
[17:56:13] Traffic, DNS, SRE: Update DNS records for Greenhouse - https://phabricator.wikimedia.org/T348335 (ssingh) @NMariano-WMF: Thanks, accepted!
[18:17:49] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (BCornwall)
[18:18:58] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host ncredir3003.esams.wmnet with OS bookworm
[18:47:42] netops, Infrastructure-Foundations, SRE: CRs ECMP traffic to LVS VIPs despite higher MED on backup route - https://phabricator.wikimedia.org/T348446 (cmooney) I lab tested this and the "always-compare-med" command works as expected (see P52912). >>! In T348446#9238640, @ayounsi wrote: > Some of our...
[19:10:23] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host ncredir3003.esams.wmnet with OS bookworm completed: - ncredir3003 (**PASS**) - Downtimed on Ici...
[19:12:03] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (BCornwall)
[19:12:30] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host ncredir2002.codfw.wmnet with OS bookworm
[19:27:26] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (VRiley-WMF)
[19:32:27] Traffic, DC-Ops, SRE, ops-eqiad: Q1:rack/setup/install cp11[00-15] - https://phabricator.wikimedia.org/T342159 (VRiley-WMF)
[19:54:49] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host ncredir2002.codfw.wmnet with OS bookworm completed: - ncredir2002 (**WARN**) - Downtimed on Ici...
[20:04:59] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host ncredir2001.codfw.wmnet with OS bookworm
[20:05:14] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (BCornwall)
[20:40:16] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host ncredir2001.codfw.wmnet with OS bookworm completed: - ncredir2001 (**WARN**) - Downtimed on Ici...
[20:53:47] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (BCornwall)
[20:54:19] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host ncredir1002.eqiad.wmnet with OS bookworm
[21:26:13] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host ncredir1002.eqiad.wmnet with OS bookworm completed: - ncredir1002 (**WARN**) - Downtimed on Ici...
[21:30:57] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host ncredir1001.eqiad.wmnet with OS bookworm
[22:05:32] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (ops-monitoring-bot) Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host ncredir1001.eqiad.wmnet with OS bookworm completed: - ncredir1001 (**WARN**) - Downtimed on Ici...
[22:06:22] Traffic, SRE, Patch-For-Review: Upgrade Traffic hosts to bookworm - https://phabricator.wikimedia.org/T342154 (BCornwall)