[00:02:48] TimStarling: I'm trying to determine what kind of skew to expect from using Excimer with an interval as small as 0.1ms (not seconds). That seems sufficiently small that depending on how the interval works, it might cause unexpected side-effects. [00:03:08] From reading the code, I believe it would not collect multiple samples from a function call lasting more than 0.1ms [00:04:31] and it would also not trip on itself if the intenral callback (not the flush one) takes more than the interval because it would presuambly just end up setting the same raised flag again which seems like that'd be harmless at least in terms of causing failures or build up beyond merely skewing the data. [00:04:56] I tested it when I first wrote it, to see what the lower limits were. I don't think it'll break, it'll just overrun [00:05:39] but eventually (I think before you get to 0.1ms) all additional samples you're trying to add will turn into overruns, so you won't actually be gathering additional data [00:05:55] just multiplying all sample counts by a factor [00:06:17] ah wait, I think I missed that last part. [00:06:43] so if the thread wakes multiple times before the flag is lowered, it does know and take this into account? [00:07:20] the interface that the C library provides does not guarantee to call you every time the period expires [00:07:36] by C library I mean timer_create() et al [00:08:01] oh, right, that would have a lower limit as well. [00:08:08] it will make some sort of best effort, and if it wakes up and discovers it's too late, it provides an overrun count representing all the events it missed [00:08:09] I was only thinking about the zend mechanism [00:08:34] yes, that's the other kind of overrun [00:09:05] I thought that if the C thread timer callback runs multiiple times it just raises the flag that was already raised, so when the zend callback runs we'd still collect just one sample and not know that we should count it as 2, for example, right? If true, that would potentially overrepresent fast functions and unrepresent slower ones if the interval is too short. [00:09:56] if the timer callback runs multiple times, it adds the overrun counts together [00:10:26] when zend eventually calls it back, it has a total overrun count consisting of all the events either delivered or reported as missed by libc [00:11:04] that overrun count is passed as a parameter to the userspace callback, or in the profiler mode, it is added as if that line of code were hit that many times [00:12:16] so all missed events end up being assigned to whatever line of code it is up to when the PHP VM calls back [00:15:33] Ah I see, so it's neither the theoretical perfect, nor the naive way I described earlier. I haven't understood it quite, but I'm getting there. [00:16:08] Is there a correlation between C missing timer windows and PHP VM encountering a slower function call that lasts multiples of the sample interval? [00:16:20] I woudl expect not but I might be missing some intentional link between the two [00:17:25] I haven't really investigated the cause of C overruns -- I just remember that they happened sooner than I expected [00:17:28] * Krinkle reads man timer_create(2) again meanwhile [00:17:44] I think they are caused by the kernel scheduler [00:21:48] I don't read anything about timers not firing if their handler/thread is still running. [00:22:08] so probably that's not done by defaut, or maybe it's too obvious to document. [00:22:45] my little intuition about C suggests this is not something it would do for you and may very well not be desired anyway, the thread doens't have to be shortlived. [00:26:51] the stateless nature of the two parts actually make this quite easy to reason about now that I know how each works more or less. [00:27:57] the zend half is going to clear whatever has been overrun and succesfully phoned in up to that point, so it seems that at most you'd have 1 unit of "minimum time between C timers" attributed to the wrong function. [00:32:01] if you're interested in the libc internals, have a look at thread_func in htl/timer_routines.c in the glibc source [00:33:34] I looked at it out of curiosity when I first wrote excimer -- I can't say I understand it all, but it has comments [00:36:41] My napkin says we can't profile a 200s POST req with 0.1ms interval without producing a ~ 2TB log file. [00:38:01] I haven't accounting for frame collapsing, but for a single request I don't expect that to collapse more than 10% of frames. [00:38:41] ok, enough talk, I'll get some actual numbers. [01:34:03] Thanks Tim :) [01:34:16] btw, I've got some terrible ideas at T291015 around excimer>flamegraph for mwdebug [01:34:17] T291015: Add per-request flamegraph option to WikimediaDebug - https://phabricator.wikimedia.org/T291015