[10:12:32] does anyone have an up-to-date (read: post-k8s) version of https://wikitech.wikimedia.org/wiki/MediaWiki_Engineering/Guides/Measure_backend_performance#Benchmarking_and_load_testing_in_production ?
[10:13:00] I tried to update those commands with the info from https://wikitech.wikimedia.org/wiki/Debugging_in_production#Directly but ab is reporting errors (“exceptions” – I can’t figure out how to get it to show me the output so I can figure out what it’s doing)
[10:13:58] hmmm, the strace output seems to include `GET http://www.wikidata.org/wiki`, that certainly doesn’t look like correct HTTP to me
[10:25:10] the following command seems to yield somewhat better strace output (it’s writing `GET /wiki/Q42 HTTP/1.0\r\nX-Forwar…` to the socket)
[10:25:10] ab -n 1 -c 1 -H 'X-Forwarded-Proto: https' -H 'Host: www.wikidata.org' http://mwdebug.discovery.wmnet:4444/wiki/Q42
[10:25:10] but still an error (the read just returns 0 and then it closes the connection)
[10:27:22] oh wait
[10:27:28] https://wikitech.wikimedia.org/wiki/Debugging_in_production#Directly is still using TLS?
[10:28:12] yeahhhhh, now I’m getting somewhere:
[10:28:12] ab -n 100 -c 1 -H 'Host: www.wikidata.org' https://mwdebug.discovery.wmnet:4444/wiki/Q42
[10:40:13] Lucas_WMDE: See also "Directly by layer" section. No route to HTTP indeed.
[10:40:13] I updated the wiki page, hopefully it’s not too wrong
[10:41:00] Krinkle: thanks
[10:41:34] actually I guess some of that text should still be rephrased
[10:42:10] this is neither “load-test[ing] an application server” nor “target[ing] an mwdebug pod” – presumably each request to mwdebug.discovery.wmnet may be routed to a totally different pod within the group
[10:42:32] (I briefly looked into mw-experimental but it doesn’t have ab installed. maybe it should?)
[10:43:26] Indeed, you'd probably want this to run on mw-experimental
[10:44:00] you can target that from deploy1003 as well or wherever you found ab
[10:44:22] ok, let me try to find the right host name for that then
[10:44:23] Did -x not work for k8s host?
[10:44:45] I don’t think so, but I might have been doing something wrong
[10:45:58] hm, this one just gave me a timeout:
[10:45:58] ab -n 1000 -c 24 -l -H 'Host: test2.wikipedia.org' https://mw-experimental.eqiad.wmnet:4456/wiki/Special:BlankPage
[10:46:13] (host and port from https://gerrit.wikimedia.org/g/operations/puppet/%2B/7ef71adb4d2dce90cb4dd5ec5f5f566694492e50/modules/profile/files/trafficserver/x-wikimedia-debug-routing.lua#34)
[10:50:50] now I’m back to not understanding ab behavior
[10:50:56] works from deploy1003: curl -i --connect-to ::wikikube-worker-exp1001:4456 'https://test.wikipedia.org/w/load.php'
[10:51:01] doesn’t work: ab -n 1 -c 1 -l -s 2 -H 'Host: test.wikipedia.org' https://wikikube-worker-exp1001:4456/w/load.php
[10:51:13] (-s shortens the timeout from default 30 to 2 seconds)
[10:51:28] Right, I was gonna say, does it work with curl from ^
[10:52:55] Might be a TLS issue. Try FQDN in that second one
[10:53:43] same result with: ab -n 1 -c 1 -l -s 2 -H 'Host: test.wikipedia.org' https://mw-experimental.eqiad.wmnet:4456/w/load.php
[10:54:26] The curl version probably looks for Wikipedia cert, the latter probably hostname cert which is fine but that'd probably be for .eqiad.wmnet or the mw-experimental CNAME.
[10:55:51] strace suggests it tries to connect() (via IPv6), which returns -1 EINPROGRESS, and then the subsequent epoll_wait() times out and ab closes
[10:57:39] aha, if I use curl -6 it also doesn’t seem to work 🤔
[10:57:44] you’ve gotta be kidding me
[10:58:50] and I can’t find anything like a -4 option in ab’s manpage
[10:59:08] (though it does have an option to select “SSL2”… god knows if that’s even still compiled in)
[10:59:57] this… works?
ab -n 1 -c 1 -l -s 2 -H 'Host: test.wikipedia.org' https://10.64.16.131:4456/w/load.php
[11:00:05] don’t ask me what it’s validating the TLS cert against…
[11:00:15] but it says Complete requests: 1, Failed requests: 0
[11:00:40] anyway, meeting and then lunch, maybe afterwards I can keep looking into this and/or file a few phab tasks
[13:17:40] reported the IPv6 issue at T428605
[13:17:41] T428605: deploy1003 sees an IPv6 address for mw-experimental.eqiad.wmnet but cannot connect() to it on port 4456 - https://phabricator.wikimedia.org/T428605
[13:20:47] and updated the docs again, cc Krinkle https://wikitech.wikimedia.org/w/index.php?title=MediaWiki_Engineering/Guides/Measure_backend_performance&diff=prev&oldid=2424840
[13:44:57] also filed T428614
[13:44:58] T428614: Install ab (ApacheBench / apache2-utils) on mw-experimental - https://phabricator.wikimedia.org/T428614
[16:30:39] * Krinkle glad to see https://github.com/molly/wikimedia-timeline/pull/36 landed.
[16:30:52] https://www.mollywhite.net/timelines/wikimedia/ back up with pictures and all
[17:39:42] ah the memories
[17:40:28] Krinkle: could probably also do with a PR that updates brookes name where mentioned
[18:39:45] I am reading that timeline and not seeing why they include Wikipedia in the knowledge engine, but not any of the sister wikis. If, for example, Google search was configured to pick them up (commons for images, wiktionary for words, wikijourney or wikitravel or whatsitsname for locations, wikispecies for species, etc), then these sister wikis could have been developed better.
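
A minimal sketch of the IPv4 workaround found at 10:59:57 above: since no -4 option turned up in ab's manpage and the IPv6 address was unreachable (T428605), resolve the backend to an IPv4 address first and point ab at that. The function names `first_ipv4` and `ab_cmd_for_ip` are hypothetical, `getent ahostsv4` is assumed to be available (glibc systems), and the ab flags are simply the ones used in the log. The command is printed rather than run so it can be checked first.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the wiki page or the log.

# Print the first IPv4 address of a host name.
# Assumes glibc's getent; "ahostsv4" restricts the lookup to IPv4.
first_ipv4() {
    getent ahostsv4 "$1" | awk 'NR == 1 { print $1 }'
}

# Print an ab command line for one request against IP:PORT,
# sending the wiki's domain in the Host header.
# Arguments: IP PORT WIKI_HOST URL_PATH
ab_cmd_for_ip() {
    printf "ab -n 1 -c 1 -l -s 2 -H 'Host: %s' https://%s:%s%s\n" \
        "$3" "$1" "$2" "$4"
}
```

Possible usage from a host that has ab installed: `ab_cmd_for_ip "$(first_ipv4 mw-experimental.eqiad.wmnet)" 4456 test.wikipedia.org /w/load.php`, then review and run the printed line.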