[06:26:24] Hello folks, checked again Benthos metrics and turnilo, so far all good with the new kafka client
[06:37:30] <_joe_> elukey: yep looks that way
[10:19:11] effie, Amir1, _joe_: what should be the next step for enabling parsoid cache warming? We have it enabled for small and medium wikis now... can we go all in next week? Or do we want to try just adding enwiki by itself?
[10:19:48] <_joe_> duesen: I would normally go with everything but enwiki and wikidata, then those two by themselves
[10:20:23] <_joe_> duesen: hnowlan and I don't remember, so you might help us on this
[10:20:47] <_joe_> how do we handle transclusions in mediawiki? we just spawn refreshlinks jobs that spawn htmlcacheupdate jobs?
[10:21:17] <_joe_> and, do we emit any event from refreshlinks that can be used to invalidate cache in services?
[10:24:50] duesen: I am off today, we can pick this up on Monday, at first glance I agree with joe
[13:35:07] on-call team: going to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/919067. will be rolling it out in batches but in case something breaks, it's on me :)
[13:42:00] sukhe: ack
[13:42:02] checking
[13:42:31] can I ask why those are hardcoded?
[13:42:42] most certainly
[13:43:01] they should not be, but we need to come up with a better structure for defining them in hiera
[13:43:10] core sites should resolve to core sites but not to the host itself
[13:43:18] edge sites to the nearest core but not to the host itself
[13:44:19] because we currently don't have those definitions and after we did T330670, I didn't update the new dns boxes
[13:44:20] T330670: Deprecating the dns::auth role and moving authdns[12]001 to dns[12]00[123]. - https://phabricator.wikimedia.org/T330670
[13:44:33] and that's currently preventing the provisioning of the new hosts that are just sitting in codfw and eqiad
[13:44:52] and hence this is kinda a blocker for all the other work, so the quickest thing is to do this manually and then we will automate it later
[13:45:35] to be clear I don't want to be a blocker, just that's a lot of hardcoding of all the same data
[13:47:15] yes and to generate this data, I have a local Python script but to put it in hiera in such a way that we can use it and define that structure is something we are discussing but it will take time
[13:47:53] * sukhe back to deployment for now, will read later here
[13:48:08] ok
[13:56:17] <_joe_> duesen, effie before we enable more jobs, I want us to take a hard look at the jobrunners cpus
[13:56:28] <_joe_> it seems we're at 75% utilization, which is way too much
[18:07:41] re: maint-announce mails: something like "peering partner FB says the 'BGP session may flap several times'" isn't worth forwarding to peering@ or adding anywhere.. right? it seems not actionable anyways
[18:09:31] but then there was one that I definitely did forward to peering@ .. that was from Cloudflare and it is actually "peering session removal".. as opposed to those
[18:12:08] it's in Singapore
[18:54:13] I mean, technically if we have pre-notice that a session will flap a bunch in some window and we care about it, we could pull it temporarily to avoid a bunch of random interference from the flapping.
[18:54:38] but on the other hand, I imagine our peering partner FB mostly brings us high-volume indirect traffic that hurts us more than helps us, so I kinda don't care. Might be different if it were Comcast or something :)
[19:35:57] thanks for the input, Brandon, ack! :)
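
Editor's sketch of the resolver rule stated at 13:43:10 and 13:43:18 (core hosts use core-site hosts but never themselves; edge hosts use the nearest core site). This is not the local script mentioned at 13:47:15, which is not shown in the log. The core hostnames follow the dns[12]00[123] pattern from T330670, but which numbering belongs to eqiad or codfw is assumed here; the edge site `edge-a`, its host `dns9001`, the `NEAREST_CORE` mapping, and the function name `resolvers_for` are all hypothetical.

```python
# Hypothetical inventory: site name -> DNS hostnames at that site.
# Core hostnames follow the dns[12]00[123] pattern from the log; the
# site-to-number assignment is an assumption.
CORE_SITES = {
    "eqiad": ["dns1001", "dns1002", "dns1003"],
    "codfw": ["dns2001", "dns2002", "dns2003"],
}
# Made-up edge site and host, plus a made-up "nearest core" mapping.
EDGE_SITES = {"edge-a": ["dns9001"]}
NEAREST_CORE = {"edge-a": "codfw"}


def resolvers_for(host: str) -> list[str]:
    """Return the resolver list for `host`, never including `host` itself."""
    all_core = [h for hosts in CORE_SITES.values() for h in hosts]
    if host in all_core:
        # Core hosts: every core-site host except the host itself.
        return [h for h in all_core if h != host]
    for site, hosts in EDGE_SITES.items():
        if host in hosts:
            # Edge hosts: the hosts of the nearest core site only.
            return [h for h in CORE_SITES[NEAREST_CORE[site]] if h != host]
    raise KeyError(f"unknown host: {host}")


if __name__ == "__main__":
    for name in ("dns1001", "dns2002", "dns9001"):
        print(name, "->", resolvers_for(name))
```

Generating the lists from one inventory like this, rather than writing them out per host, is one way to avoid the repeated hardcoding raised at 13:45:35; how that inventory would be expressed in hiera is left open, as in the discussion.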