[00:03:57] :)
[00:04:26] would anything super bad happen if we raised the Varnish/ATS timeout to like 205s?
[03:49:13] legoktm[m]: I asked about this a while ago as well, specifically, when I wrote that all down on the appserver page (later partially moved to a dedicated page)
[03:49:44] I vaguely remember _joe_ explaining why and recall that it sounded like a good reason.
[03:49:49] however, I do not remember what it was
[03:50:33] note that the PHP execution timeout (210s) is also higher than the php-fpm and Apache timeouts (201s, 202s)
[03:51:03] although I suppose that is somewhat more reasonable given cpu time vs wallclock time
[03:51:20] so I guess the idea is that it's unlikely to be hit in less than 201s of wallclock time
[03:53:50] but yeah, apart from the seemingly good reason I can't remember, it seems reasonable to change them such that they all "just" line up.
[03:54:27] including shortening the php-fpm time to 201s and raising the traffic layer timeout by a few seconds
[03:54:52] also would be great to figure out what's going on with appserver-level TLS (Envoy now instead of Nginx) since the docs on that seem wrong/incomplete.
[06:12:10] <_joe_> max_execution_time is meaningless and evil, we decided we don't want to hit it
[06:15:18] <_joe_> legoktm[m]: the point is nothing at the ats/varnish layer should take 200s
[06:15:52] <_joe_> if anything, I'd be in favour of reducing it progressively and making any process that takes that long asynchronous
[06:19:04] ah the "unplug the cable, see who screams" method ;)
[06:33:51] <_joe_> tn: not really; we know which requests take longer than X seconds
[06:34:17] <_joe_> we usually try to prevent people screaming :)
[06:34:55] no fun /j
[08:24:24] XioNoX, topranks o/ - didn't find anything about the link between mr1-codfw and cr2-codfw, but it seems down on the cr2 side (can't find logs about what happened)
[08:24:41] (didn't find anything in phab etc.)
[08:28:44] elukey: https://phabricator.wikimedia.org/T294789 probably
[08:28:53] I was out yesterday, still catching up
[08:32:57] perfect thanks :)
[09:18:07] Yeah that'll be it. mr1 should still be reachable via cr1.
[09:21:39] yep yep
[10:37:29] heads up, I'm about to reboot cloud network components, some network flapping is to be expected, especially on IRC bots
[11:31:10] another basic question about php package upgrades on appservers - do I need to schedule windows/roll the upgrades into deployments etc?
[11:36:00] <_joe_> hnowlan: I don't usually do that, just !log referencing the bug
[11:36:37] cool
[11:37:16] yeah, don't bother. just make sure to do the updates outside the existing deployment windows
[12:17:35] FYI all, if you didn't notice, WMCS is having some NFS issues which are affecting, among other things, IRC bots and CI (https://phabricator.wikimedia.org/T294828)
[14:24:54] heads up, swift and maps no longer have http2 enabled (at all). I'll proceed tomorrow one by one with the various elasticsearch clusters. I got +1s already for the changes and hopefully there aren't any users that will be impacted by downgrading to http1.0/1.1
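The 03:50-03:54 discussion is about making the timeouts "line up" so that each outer layer allows slightly more time than the layer it wraps, letting the innermost limit fire first and its error propagate outward. Below is a minimal sketch of that check, assuming that ordering; the values are the ones mentioned in the log, but the directive names and the choice of which layer wraps which are illustrative and not taken from the actual Puppet configuration.

```python
# Sketch only: models the timeout chain discussed in the log and checks that each
# layer leaves headroom over the one it wraps. Layer names are the usual directive
# names (assumed, not confirmed from production config); values are from the log,
# with the traffic-layer 205s being the proposed figure.

# (layer, timeout in seconds), listed from innermost to outermost wall-clock layer
timeout_chain = [
    ("php-fpm request_terminate_timeout", 201),
    ("Apache Timeout / ProxyTimeout", 202),
    ("Varnish/ATS backend timeout (proposed)", 205),
]

def check_chain(chain, headroom=1):
    """Verify each outer timeout exceeds the inner one by at least `headroom` seconds."""
    problems = []
    for (inner_name, inner), (outer_name, outer) in zip(chain, chain[1:]):
        if outer < inner + headroom:
            problems.append(
                f"{outer_name} ({outer}s) leaves no headroom over {inner_name} ({inner}s)"
            )
    return problems

if __name__ == "__main__":
    for line in check_chain(timeout_chain) or ["timeout chain lines up"]:
        print(line)
    # PHP's max_execution_time (210s) counts CPU time rather than wall-clock time,
    # so a request normally hits the 201s php-fpm wall-clock limit long before it
    # can burn 210s of CPU -- which matches the "backstop only" reading in the log.
```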
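For the 14:24 note about disabling HTTP/2 on swift and maps (and next the elasticsearch clusters), here is a small sketch of how one might verify per endpoint that clients now fall back to HTTP/1.1. It assumes the third-party httpx library installed with its http2 extra; the URL is a placeholder, not a real service endpoint.

```python
# Sketch only: offer HTTP/2 via ALPN and report what the server actually negotiates.
# Requires `pip install "httpx[http2]"`; the hostname below is hypothetical.
import httpx

def negotiated_version(url: str) -> str:
    # http2=True means the client *offers* HTTP/2; the server decides whether to
    # accept it, so response.http_version reflects the negotiated protocol.
    with httpx.Client(http2=True) as client:
        response = client.get(url)
        return response.http_version  # e.g. "HTTP/1.1" or "HTTP/2"

if __name__ == "__main__":
    print(negotiated_version("https://swift.example.internal/healthcheck"))
```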