[09:45:51] <_joe_> sorry I had missed this conversation. https://gerrit.wikimedia.org/g/operations/mediawiki-config/+/3ff486473879abe7eafcf61e60f09784ee5071eb/wmf-config/ProductionServices.php#78 needs to be fixed ASAP.
[09:46:04] <_joe_> https calls cross-dc are super expensive
[09:46:19] <_joe_> without proper session reuse, that is
[09:46:27] <_joe_> which the service mesh provides
[09:49:50] I think that's testwiki only
[09:50:57] oh, no longer true apparently
[09:54:23] inflatador, dcausse, kostajh: Please see the above - is there a phab task for this (since there is none attached to the mw-config changes related to this)
[09:55:28] yes, it's linked in the git blame
[09:55:31] T410615
[09:57:46] ah, the initial change - I see. I was looking at the changes that enabled this on all wikis
[10:03:43] jayme: can you propose what the configuration should be changed to?
[10:03:54] kostajh: tell us if you need help standing up the service mesh listener etc. https://wikitech.wikimedia.org/wiki/Envoy#Add_a_new_service_(listener)
[10:06:23] maybe this https://phabricator.wikimedia.org/T406876 would be the better task to reopen
[10:11:17] did that as well. The %data% SREs should be able to pick that up
[11:00:52] wikifunctions.org is served by a dedicated mw-in-k8s deployment, is it intentional that the new abstract.wikipedia.org wiki is not and instead uses the normal set of appservers?
[11:07:28] Hmm that's a good question, I don't think that was particularly discussed, rzl may know more (SRE ambassador for AW), and we may need to discuss that in our team meeting (cc _joe_ )
[11:11:24] <_joe_> taavi: uhm, no :)
[11:13:15] I can do the ATS patches to correct that
[11:15:43] Aaaah mw-wikifunctions is an ingress deployment of mw-on-k8s
[11:15:46] * claime mild panic
[11:21:48] that *is* on purpose
[11:21:50] :)
[11:22:00] Oh
[11:22:05] it was deemed a good test candidate for mw behind ingress and the single version strategy
[11:22:13] the banker did that some time ago
[11:22:14] Ah, you mean the ingress?
[11:22:16] yeah
[11:22:49] I know it's on purpose :D I'm just trying to figure out what I need to change to the service definition to make it work with abstract.w.o
[11:28:58] Ok, I think I got that figured out.
[13:27:56] _joe_ jayme, thanks for the ping. We are scheduled to turn up active/active for opensearch-ipoid later today (ref T417698), that should stop the cross-DC calls. We've also discussed enabling the service mesh in https://phabricator.wikimedia.org/T420638#11730247 , although that was for intra-DC calls. I'll spin off a separate ticket for that
[13:30:26] inflatador: dc local calls bypassing the service mesh will obviously also have a latency/cost penalty
[13:30:47] and a lack of visibility ofc
[13:32:57] Agreed, it's on our radar already. We just didn't want to make major changes to the chart since the service is already faster than the old iPoid and we're in the middle of moving to a new upstream major version. But if y'all need us to do it ASAP we can
[13:47:18] I don't feel like I can guesstimate what all that means in actual time. From our PoV this has to move to use the service mesh ASAP since it's a snowflake in the production realm that can be easily avoided
[13:50:54] I dunno either, but I'll commit to at least having a CR up this week. I've already started to look at what it might take and I don't think it will be all that hard
[15:06:06] _joe_ jayme just a heads-up that the active/active turnup for opensearch-ipoid was a success, so the cross-DC calls should be gone. I'll still prioritize adding the mesh as discussed above
[15:06:26] 👍
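
Editor's note on the 09:46 remark that cross-DC https calls are expensive without session reuse: the following is a rough back-of-the-envelope sketch, not a measurement of this service. It counts network round trips only (1 for the TCP handshake, 1 for a full TLS 1.3 handshake or 2 for TLS 1.2, 1 for the request itself) and ignores crypto CPU cost, server time, and TCP slow start. The function name and both RTT values are hypothetical, chosen purely for illustration.

```python
def request_latency_ms(rtt_ms, reuse_connection, tls_handshake_rtts=1):
    """Estimate the network latency of one small HTTPS request.

    A fresh connection pays 1 RTT for the TCP handshake plus
    tls_handshake_rtts for a full TLS handshake (1 for TLS 1.3,
    2 for TLS 1.2) before the request/response round trip.
    A reused connection pays only the request/response round trip.
    """
    setup_rtts = 0 if reuse_connection else 1 + tls_handshake_rtts
    return (setup_rtts + 1) * rtt_ms


# Hypothetical round-trip times, for illustration only.
CROSS_DC_RTT_MS = 30.0
LOCAL_DC_RTT_MS = 0.5

for label, rtt in (("cross-DC", CROSS_DC_RTT_MS), ("DC-local", LOCAL_DC_RTT_MS)):
    fresh = request_latency_ms(rtt, reuse_connection=False)
    reused = request_latency_ms(rtt, reuse_connection=True)
    print(f"{label}: fresh connection {fresh} ms, reused connection {reused} ms")
```

Under these assumptions a fresh connection costs three times the network latency of a reused one, and the absolute penalty grows with the RTT, which is why it matters most across data centers. The same multiplier applies to DC-local calls, only with a much smaller absolute cost, consistent with the 13:30 remark.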