[14:08:17] I need help with a MySQL query for MediaWiki
[14:08:33] Is there somebody good with MySQL?
[14:08:45] don't ask to ask
[14:09:13] I need to get this info from my wiki:
[14:09:14] a list of pages in the wiki with the username of whoever created the page, the username of whoever made the last change, timestamps for these two actions, and the text length.
[14:09:14] I can't figure out how to join the users and the timestamps to the page_title and page_len columns.
[14:09:15] Could anybody help me build the correct query, please?
[14:09:47] Have you looked at the revision table?
[14:10:21] yes, my problem is the JOIN query in MySQL
[14:10:32] because I need all that info with just one query if possible
[14:11:01] You might want to do two queries... One for the first and one for the last... But that's basically the same query
[14:11:34] problem is that then I won't have the user name of the person that made the last change on page_title
[14:11:55] Why won't you?
[14:12:26] because it's in another table than mw_page, right?
[14:12:49] Yeah... But that's a simple join?
[14:13:44] yes exactly, I'm asking for help building that JOIN query because I'm OK at MySQL, but not good enough for this
[14:19:32] could you help me with the join query?
[14:39:20] nobody?
[16:25:26] Hi Guest49.
[16:25:41] They left hours ago
[16:25:56] Their loss, really.
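For reference, the single query Guest49 was after can be built by joining mw_page to the revision table twice (once via page_latest for the newest revision, once via the smallest rev_id for the oldest) and then to the actor table for the usernames. The sketch below is not a definitive answer: it assumes a MediaWiki 1.35+ schema, where revisions reference mw_actor rather than storing usernames directly (older schemas keep the name in revision.rev_user_text), plus the mw_ table prefix mentioned in the conversation; the connection details are placeholders.

    <?php
    // A sketch, not a definitive answer: assumes MediaWiki 1.35+ (actor
    // migration complete) and the mw_ table prefix seen above. Connection
    // details are placeholders.
    $pdo = new PDO( 'mysql:host=localhost;dbname=wikidb;charset=utf8mb4', 'wikiuser', 'secret' );
    $pdo->setAttribute( PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION );

    $sql = <<<'SQL'
    SELECT
        p.page_title,
        p.page_len,
        first_actor.actor_name  AS created_by,
        first_rev.rev_timestamp AS created_at,
        last_actor.actor_name   AS last_edited_by,
        last_rev.rev_timestamp  AS last_edited_at
    FROM mw_page AS p
    -- page_latest points straight at the newest revision
    JOIN mw_revision AS last_rev   ON last_rev.rev_id = p.page_latest
    JOIN mw_actor    AS last_actor ON last_actor.actor_id = last_rev.rev_actor
    -- the oldest revision is the one with the smallest rev_id; imported
    -- history can break that, in which case MIN(rev_timestamp) is stricter
    JOIN mw_revision AS first_rev
        ON first_rev.rev_id = (
            SELECT MIN(r.rev_id) FROM mw_revision AS r WHERE r.rev_page = p.page_id
        )
    JOIN mw_actor AS first_actor ON first_actor.actor_id = first_rev.rev_actor
    SQL;

    foreach ( $pdo->query( $sql, PDO::FETCH_ASSOC ) as $row ) {
        echo implode( "\t", $row ), PHP_EOL;
    }

Two separate queries, as suggested at 14:11, would be simpler; the correlated subquery on MIN(rev_id) is what lets it stay a single statement.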
[16:50:17] Reedy: I'm quite surprised that the patch I proposed apparently uses so much more memory =o
[16:50:50] addshore: heh. Something definitely feels odd, but it could be benchmark weirdness too
[16:51:11] Definitely happy to try and help dig into it
[16:51:26] But that's why I wanted to create the benchmark, so we have some idea what we're actually doing
[16:51:30] oh, haha, today is the day it gets deployed to candidate wikis
[16:51:40] yeah, you're a bit late to the party :P
[16:51:52] I mean, I know an IPSet is much more compact for large ranges and consecutive IPs
[16:52:35] Feel free to double-check my benchmarking tests
[16:52:47] I saw your patch adding the JSON serialization stuff, and thought you might be going to add some caching :P
[16:52:54] :D
[16:53:34] *stares at the benchmark*
[16:53:57] Feel free to merge my other patches if you want, so we've got a merged baseline etc.
[16:54:15] other patches?
[16:54:25] oh, the benchmarks?
[16:55:35] some other cleanup and stuff
[17:20:49] I'm not sure I trust your benchmarks, but also not sure why yet
[17:21:39] well, I didn't write the code :P
[17:21:50] but it's possible the 10 loops aren't having the desired effect
[17:33:58] the benchmark of the old code is doing getIpDenyList, which avoids the IPSet? and the benchmark of my new code does use the IPSet?
[17:34:14] Entirely possible
[17:34:20] Entry points are... limited :)
[17:35:35] https://doc.wikimedia.org/cover-extensions/StopForumSpam/
[17:35:39] look at the awesome test coverage
[17:44:31] Reedy: I don't think I can reliably run the benchmark on my shoddy internet
[17:44:42] heh
[17:44:51] but could you do https://gerrit.wikimedia.org/r/c/mediawiki/extensions/StopForumSpam/+/824249 PS1 vs https://gerrit.wikimedia.org/r/c/mediawiki/extensions/StopForumSpam/+/822727 PS11?
[17:45:00] I mean, that's the other thing... It's not actually differentiating for "time to download" (which is hard to do)
[17:45:12] Maybe we increase testability, and use a file on disk etc.
[17:45:26] yeah, I think that should be possible by just configuring SFS
[17:45:36] can you just stick a file in? I cannae remember
[17:45:42] I think so
[17:45:45] *reads the docs*
[17:45:55] $wgSFSIPListLocation - location of an unzipped IP address blacklist
[17:46:29] if ( is_file( $wgSFSIPListLocation ) ) {
[17:46:29] $ipList = $this->fetchFlatDenyListHexIpsLocal( $wgSFSIPListLocation );
[17:46:31] ah, yeah
[17:46:39] So that probably helps your local testability if you can grab it :)
[17:46:40] *looks at which list it normally uses*
[17:47:03] https://www.stopforumspam.com/downloads/listed_ip_30_ipv46_all.gz
[17:47:28] We're using the 90-day one in prod, fwiw
[17:47:50] moar bigger
[18:04:31] Reedy: "you've spelled recache wrong :P" let's pretend I didn't spend 15 mins investigating that...
[18:05:34] * Reedy invoices addshore for his time
[18:05:50] I've just +2'd the two patches to get them merged
[18:07:42] cool, then I'll rebase my newer code and profile them both
[18:07:45] / benchmark
[18:22:20] Reedy: looks like the profiling makes more sense now, and the new code I wrote = more memory, but less time
[18:41:37] Reedy: also benchmarked with the list you're using on WMF sites now, and I think the patch is good! :)
[18:41:50] Sweet
[18:41:55] I'll have another look myself :)
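Putting the pieces from 17:45-17:47 together, the local-benchmarking setup presumably amounts to something like the sketch below. Only $wgSFSIPListLocation itself is confirmed in the log; the download commands and the path are illustrative.

    <?php
    // LocalSettings.php - a sketch of the local-testing setup discussed
    // above. First fetch and unzip a list, e.g.:
    //   wget https://www.stopforumspam.com/downloads/listed_ip_30_ipv46_all.gz
    //   gunzip listed_ip_30_ipv46_all.gz
    // (production reportedly uses the 90-day variant of this 30-day file)

    // Point StopForumSpam at the unzipped flat file. The is_file() branch
    // quoted at 17:46 then reads the list locally instead of downloading it,
    // so the benchmark stops measuring network time. The path is a placeholder.
    $wgSFSIPListLocation = '/srv/sfs/listed_ip_30_ipv46_all';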