[01:31:12] I also don't have +2 on the repo
[01:39:52] pprof for node (https://github.com/google/pprof-nodejs) is pretty nice, in case any of you need to profile any node apps
[01:40:04] (and my sympathies if you do)
[01:47:39] I tried node's internal profiler, '0x' (which also uses the internal profiler and tries to make its output more useful), and perf, but in each case the call stacks were a mix of native and javascript frames, with javascript functions often appearing in different guises when they've been recompiled by the JIT
[02:00:43] Is calling it 0x a diss?
[02:02:41] my comma was ambiguous; 0x is a helper tool that wraps the internal profiler. I don't think the name is a diss
[02:04:09] perf + stackcollapse.pl + flamegraph.pl : https://calx.atdt.co/flamegraph.svg
[02:09:24] pprof: https://calx.atdt.co/ui/ (directed graph), https://calx.atdt.co/ui/flamegraph (flame graph)
[03:31:59] ori: he, that's d3-flamegraph, I recognise the animations :)
[03:33:05] I'm curious how this compares to e.g. node.js with v8 perfmap, since afaik it supports that. I'm guessing that's what it uses underneath, with some additional tooling on top to generate a directory of multiple HTML files; or is /ui/ actually an app for multiple profiles potentially?
[03:34:19] https://www.brendangregg.com/blog/2014-09-17/node-flame-graphs-on-linux.html
[03:35:13] it's been a few years since I used d3-flamegraph, but I recall there were some issues where it would sometimes add up nodes and add up numbers twice, so leaf nodes were an order of magnitude smaller/thinner than they should be.
[03:35:43] curious how e.g. nodejs with perf map compares to the pprof one with d3, ideally the same more or less, aside from top-down vs bottom-up
[03:36:15] https://calx.atdt.co/ui/source
[03:36:18] wow, that's quite neat
[03:47:23] I'm updating my PHP pull request from January which finally got a code review a week ago
[03:47:34] ran the tests under valgrind, there was an error
[03:47:45] it comes from this line in libxml:
[03:47:52] for (i = 0;i < 499;i++) {
[03:47:52] upper[i] = toupper(name[i]);
[03:50:51] ok, I missed the second line of that loop body because it looks wrongly indented unless tabs are 8 spaces wide
[03:50:59] if (upper[i] == 0) break;
[03:51:26] still, it is a weird way to convert user input to upper case
[04:16:24] Krinkle: /ui/ is proxied to the built-in pprof web server interface; it has a console-based UI as well. It can also process perf.data files generated by perf. One other option that I didn't explore is using the remote debugging interface, which allows you to collect and analyze profiles using Chrome's dev tools.
[04:18:10] there's also Google Cloud Profiler which is gratis but not libre
[04:21:34] it has some cool features but I think for this use-case (profiling a single invocation rather than a long-running server) it's just pprof with a material design mascara
[04:29:52] btw sorry for the trouble but I can't create the tag either
[04:47:13] ori: ack, it's in the submission pipelines of both mozilla and google now
[05:28:01] great, thank you
[05:30:51] TimStarling: SelectQueryBuilder seems oddly named... it doesn't just build queries but can run them. And JoinGroupBase seems like it should be a trait, not a parent class, if it's just for code reuse. I'm just thinking about https://gerrit.wikimedia.org/r/c/mediawiki/core/+/810856
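Editorial aside on the libxml loop quoted at 03:47:52 to 03:50:59: as quoted, it upper-cases at most 499 characters of `name` into `upper` and stops once the terminating NUL has been copied. Below is a minimal sketch of the same idea, not libxml's actual code. The function name `upper_copy` and its signature are hypothetical, chosen for illustration only. It checks for the terminator before converting and casts to `unsigned char`, since passing a negative `char` value to `toupper` is undefined behaviour. It makes no attempt to reproduce or explain the valgrind error mentioned in the log.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libxml): copy src into dst as upper case.
 * Writes at most dst_size - 1 characters and always NUL-terminates dst,
 * truncating src if it does not fit.
 */
void upper_copy(char *dst, size_t dst_size, const char *src)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    if (dst_size == 0) {
        return; /* no room even for the terminator */
    }

    /* Stop at the end of src or when only the terminator slot is left. */
    for (i = 0; i + 1 < dst_size && src[i] != '\0'; i++) {
        /* Cast first: toupper() requires an unsigned char value or EOF. */
        dst[i] = (char)toupper((unsigned char)src[i]);
    }
    dst[i] = '\0';
}
```

With a 500-byte buffer `upper`, calling `upper_copy(upper, sizeof upper, "utf-8")` leaves `"UTF-8"` in `upper`. Taking the buffer size as a parameter keeps the 499-character limit out of the loop condition.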
[05:53:38] TimStarling: regarding ATS deploy, what exactly needed to "recover", that was taking minutes instead of seconds etc; was there some kind of conflict that caused running instances of the previous Lua code to start crashing or something? (I'm aware the config file you added has a few second cache but AFAICS neither the Lua params nor config file was changed for OAuth additions..)
[05:57:37] Krinkle: https://grafana.wikimedia.org/d/JTAWecXGk/varnish-anomalies?orgId=1&forceLogin&var-datasource=esams%20prometheus%2Fops&var-cache_type=text&var-site=All&from=1659055550743&to=1659064207169&viewPanel=2
[05:58:12] I've seen several examples recently of the principle that a service can't monitor itself
[05:58:38] so there's no point looking at the ATS dashboards to find out if ATS is healthy, if it's sick it's too sick to tell you
[05:58:52] to find out if ATS is healthy you need to look at the Varnish dashboards
[05:59:51] after restarting a few servers, we were hitting 1000 req/s of errors from ATS, visible to Varnish
[06:00:25] at the peak it was also visible in the 5xx graph on the home dashboard
[06:01:35] it took 5 minutes for this error rate to recover, so I did restarts with a 5 minute sleep, and you can see the oscillation in error rate that caused for the next 2 hours
[06:02:19] AaronSchulz: I guess I should have subscribed you to T243051, where this was discussed in January 2020
[06:02:20] T243051: A query builder for MediaWiki core - https://phabricator.wikimedia.org/T243051
[06:03:11] originally it was supposed to be a value object, and the execute methods took a connection as a parameter
[06:03:28] Daniel asked for it to be changed so that Database was the factory and the connection was stored
[06:04:44] Krinkle: btw I prepared that dashboard in advance just for this restart, by porting it to Thanos so that it could show all error rates at once
[07:58:12] TimStarling: ack, confirmed at https://grafana.wikimedia.org/d/-K8NgsUnz/home?orgId=1&viewPanel=8&from=1659051777875&to=1659081400483 - that's interesting. I believe there are a few dashboards similar to that which were built back when we had ats-tls in front of varnish-frontend, for a similar reason, as then we had to use ats to more properly view varnish's health
[07:58:31] I'm guessing SRE hasn't yet revised those for HAProxy
[07:58:38] or perhaps it doesn't have the same kind of metrics
[08:08:17] I'm still unsure as to why ATS was failing requests though, sorry if I missed it or if you explained it already, I guess we're both equally surprised it's that long, but would love to know why, even if it's not a good reason :D
[08:08:43] I did not investigate
[15:35:52] https://wiki.php.net/rfc/deque interesting rfc on adding some more specialized DSes to PHP, although it seems to have gone a bit stale
[15:36:14] makes part of me wonder whether e.g. wikitext parsing could benefit from their use in some places