[15:13:41] [1/6] There seems to be a wave of Chinese scrapers on some MH wikis, and someone should probably challenge them on Cloudflare.
[15:13:41] [2/6] Images taken from https://bluearchive.wiki/wiki/Special%3AAnalytics. These bots originate from China, have Linux in their UA, and use a 4k monitor.
[15:13:41] [3/6] Strinova Wiki's [analytics page](https://strinova.org/wiki/Special:Analytics) is publicly accessible and is experiencing a similar problem (though not as bad). There is no way a wiki primarily written in English gets 2x traffic from China compared with the US.
[15:13:42] [4/6] https://cdn.discordapp.com/attachments/1006789349498699827/1416079325362520144/image.png?ex=68c58a24&is=68c438a4&hm=7ecf9bf0e597a2675cee2582288ef8aeebf057c5a472f428fe9e8b0630161e1f&
[15:13:42] [5/6] https://cdn.discordapp.com/attachments/1006789349498699827/1416079325769498804/image.png?ex=68c58a24&is=68c438a4&hm=c890f4121db2050d4deead8d0c7e0193220b1234e05d404d3e4cff13c461e3b1&
[15:13:42] [6/6] https://cdn.discordapp.com/attachments/1006789349498699827/1416079326050390016/image.png?ex=68c58a24&is=68c438a4&hm=9aa3feac6f4b318779a39f99efd278b4a61d6acc068797e1332bf4f2b82d27d9&
[15:14:19] woohoo another bunch of chinese bots to block off
[15:14:39] @rhinosf1 someone whispered to me that you like doing this
[15:14:52] (or are you busy)
[15:19:16] @pskyechology I only see 50K Linux requests from China in 24 hours
[15:21:13] Most from Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/139.0.0.0 Safari/537.36 Edg/139.0.3405.125
[15:21:29] They don't stand out as suspicious tbh @pskyechology @posix_memalign
[15:23:08] [1/2] I'd be much more suspicious of Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 who have such silliness going on
[15:23:08] [2/2]
https://cdn.discordapp.com/attachments/1006789349498699827/1416081703931351051/image.png?ex=68c58c5b&is=68c43adb&hm=c9dc89d8c298e61cc4a36988379598c19e532e53856a6927bd54131e4cf36a80&
[15:26:22] That's a big jump for an old version
[15:45:03] Could be something like https://discord.com/channels/407504499280707585/615786602454581249/1403319099811041425 where the UA is set to Windows but elsewhere the header says Linux.
[15:45:56] Interesting pattern
[15:47:44] Let me try that in Log
[15:49:01] [1/2] I'm still blocking quite a bit of traffic with this pattern. The source region is no longer just Hong Kong and I'm seeing several South American countries.
[15:49:01] [2/2] https://cdn.discordapp.com/attachments/1006789349498699827/1416088219568836619/image.png?ex=68c5926d&is=68c440ed&hm=22f90fdb68c0f67d35ed3f4f51f4f48c65290739535af43d748be3707d2026b8&
[15:50:01] That's had a quick impact
[15:50:26] I'll give it an hour to slurp some data and then maybe throw a challenge
[15:50:55] That looks like bot traffic though
[15:51:12] [1/5] Looks like the requests on my end all have
[15:51:13] [2/5] ```
[15:51:13] [3/5] Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36
[15:51:13] [4/5] ```
[15:51:14] [5/5] The Chrome version really stands out
[15:51:25] I've logged like 8k in 5 minutes
[15:51:50] Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36 is our top UA for that rule
[15:52:07] Which is the one @pskyechology pointed out
[15:52:23] yay
[15:52:39] I think you found something @pskyechology
[15:53:19] I'm going to give it an hour to let the data settle into an average
[16:02:32] The Chinese bots in my experience tend to visit a lot of history, diff and uselang pages
[16:02:53] So perhaps you could challenge them based on that
[16:03:04] If it's really urgent
[16:03:55] Fandom literally stopped rendering diff links on history pages because of bots
[16:04:08]
[1/2] https://discord.com/channels/407504499280707585/1415919653557108810/1415919653557108810
[16:04:08] [2/2] Both of these categories are still giving me "Gateway time-out". May need someone from tech to investigate.
[16:05:16] You can tell bots from the patterns but Cloudflare isn't that smart
[16:06:29] Well, yeah
[16:06:42] But you can make a filter for that
[16:13:59] [1/4] Restricting diffs and older source code to logged-in users sounds like a CC license violation. https://creativecommons.org/licenses/by-sa/4.0/legalcode.en
[16:13:59] [2/4] > No downstream restrictions. You may not offer or impose any additional or different terms or conditions on, or apply any Effective Technological Measures to, the Licensed Material if doing so restricts exercise of the Licensed Rights by any recipient of the Licensed Material.
[16:14:00] [3/4] Only a lawyer might have some ideas whether this clause applies to this situation, though.
[16:14:00] [4/4] Quite a few people were upset when moegirlpedia did this. The response was "fine, then sue us for CC-license violations". Hopefully these restrictions don't apply to the api. Otherwise archival efforts would become much more difficult.
[16:15:27] fandom might be big enough for someone to care
[16:16:01] wish we could borrow some of that huge WMF budget to go about shenanigans like challenging that
[16:23:43] It wouldn't be the first time they've been sued, they probably don't care
[16:24:00] Same way they modify visual editor et al and don't release the source code
[16:24:33] When it was suggested in the MCR server, it was very quickly shot down by us as fundamentally wrong.
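On the earlier point about a UA that claims Windows while another header says Linux, here is a minimal sketch of that check. It assumes the "other header" is the `Sec-CH-UA-Platform` client hint, which the log does not actually name. The function names and the lower-cased `headers` object are hypothetical, not part of Cloudflare or MediaWiki.

```javascript
// Hypothetical helper: map a User-Agent string to a coarse platform name.
// Order matters: Android UAs also contain "Linux", iOS UAs contain "Mac OS X".
function platformFromUserAgent(ua) {
  if (/Windows NT/.test(ua)) return 'Windows';
  if (/Android/.test(ua)) return 'Android';
  if (/iPhone|iPad/.test(ua)) return 'iOS';
  if (/Mac OS X/.test(ua)) return 'macOS';
  if (/Linux/.test(ua)) return 'Linux';
  return null; // unknown platform: do not guess
}

// Returns true only when both signals are present and disagree.
// Assumption: `headers` uses lower-case keys and the client hint is
// Sec-CH-UA-Platform, whose value is a quoted string such as "Linux".
function hasPlatformMismatch(headers) {
  const ua = headers['user-agent'] || '';
  const hint = (headers['sec-ch-ua-platform'] || '').replace(/"/g, '').trim();
  const claimed = platformFromUserAgent(ua);
  if (!claimed || !hint) return false; // missing data is not a mismatch
  return claimed.toLowerCase() !== hint.toLowerCase();
}
```

Requests with no client hint at all are deliberately not flagged, since many legitimate browsers never send one. A mismatch is only one signal and would probably be combined with others before challenging anything.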
[16:25:05] Quite difficult from Cloudflare's dashboard to build a filter on that
[16:25:27] You'd need some complex rules to analyse groupings to find the most narrow that looks off
[16:25:33] And looks off is hard to describe
[16:25:43] It could be sudden changes in times of traffic
[16:25:51] Or sudden starting up of old UAs
[16:26:05] Or an unusual set of top pages
[16:26:16] Or a different ratio of static:wiki traffic
[16:26:52] It's easy to visualise when you know how to explore the data or to manually forensically look at
[16:27:00] But hard for me to put into a rule
[16:41:19] [1/2] VE is licensed under MIT, so that's probably how they get away with it.
[16:41:19] [2/2] At least MH projects are under the GPL, though to prevent Fandom (or other companies) from changing the software and not releasing the source the AGPL might be necessary.
[21:31:38] is anyone here good with JavaScript?!
[21:33:31] javascript's a very widespread language, what specific javascript are you talking about
[21:33:49] API requests mainly, I think I'm either stupid or mediawiki is stupid
[21:34:44] it's just `fetch` or whatever library #5034 decides is the new better `fetch`
[21:35:20] [1/18] If I do a mw.Rest request and it returns an error:
[21:35:20] [2/18] ```
[21:35:20] [3/18] {
[21:35:21] [4/18]   "error": "some string here",
[21:35:21] [5/18]   "httpCode": 400,
[21:35:21] [6/18]   "httpReason": "Bad Request"
[21:35:22] [7/18] }
[21:35:22] [8/18] ```
[21:35:22] [9/18] I can't seem to get access to that if I'm doing something like
[21:35:22] [10/18] ```
[21:35:23] [11/18] api = new mw.Rest();
[21:35:23] [12/18] try {
[21:35:23] [13/18]   api.post(...)
[21:35:24] [14/18] } catch ( error ) {
[21:35:24] [15/18]   console.log( error );
[21:35:25] [16/18] }
[21:35:25] [17/18] ```
[21:35:26] [18/18] it's just throwing "http" as the error every time and it's pissing me off
[21:35:36] devtools network tab?
[21:36:08] ah wait
[21:36:15] ah yeah, i _think_ that's a jquery thing
[21:36:19] i ended up using fetch
[21:36:59] yeah, it's there, like the response is there, which is why i'm getting pissed off lol
[21:37:31] [1/2] https://doc.wikimedia.org/mediawiki-core/master/js/mediawiki.api_rest.js.html#line201
[21:37:32] [2/2] Oh it's here lol
[21:37:35] wmf are fucking idiots bad
[21:38:36] [1/7] the classic ```js
[21:38:37] [2/7] try {
[21:38:37] [3/7]   ...
[21:38:37] [4/7] } catch (e) {
[21:38:38] [5/7]   throw new Error;
[21:38:38] [6/7] }
[21:38:38] [7/7] ```
[21:38:56] it just throws http because of the link above
[21:38:57] Are you doing async/await or then chaining
[21:39:10] one sec I'll post a snippet
[21:39:33] Because in mw.Api the additional info should be the second parameter of catch iirc
[21:39:42] I'm using mw.rest
[21:39:45] The first is just the code
[21:39:51] Well I imagine it's similar
[21:39:58] Can you send
[21:40:43] second... parameter of catch?
[21:40:45] you can do that?
[21:41:02] In jQuery Deferred yes
[21:41:05] [1/17] ```
[21:41:05] [2/17] async function addNewDomain( wikiId, domain, reason ) {
[21:41:05] [3/17]   try {
[21:41:06] [4/17]     const response = await api.post( `.....`, {
[21:41:06] [5/17]       domain,
[21:41:06] [6/17]       reason
[21:41:07] [7/17]     } );
[21:41:07] [8/17]     return response;
[21:41:07] [9/17]   } catch ( error ) {
[21:41:08] [10/17]     if ( error.message ) {
[21:41:08] [11/17]       throw error;
[21:41:08] [12/17]     }
[21:41:09] [13/17]     // api didn't return the error message
[21:41:09] [14/17]     throw new Error( mw.message( 'error-cc-generic-error' ).text() );
[21:41:10] [15/17]   }
[21:41:10] [16/17] }
[21:41:11] [17/17] ```
[21:41:12] In promises no I don't think so
[21:41:18] oh writing js doesn't make it js lol
[21:41:29] it needs to be on the same line
[21:42:06] (api is defined earlier as new mw.Rest() btw)
[21:44:18] [1/6] ```javascript
[21:44:18] [2/6] (new mw.Rest())
[21:44:18] [3/6]   .get('/aaa')
[21:44:19] [4/6]   .then(d => console.log(d))
[21:44:19] [5/6]   .catch((code, info) => console.log(info.xhr.status));
[21:44:19] [6/6] ```
[21:44:28] Prints the error code for me
[21:45:15] let me try this
[21:46:04] Well for 400s it's kind of ass
[21:46:08] [1/5] ```javascript
[21:46:08] [2/5] (new mw.Rest())
[21:46:08] [3/5]   .get('/v1/search/title')
[21:46:09] [4/5]   .then(d => console.log(d))
[21:46:09] [5/5]   .catch((code, info) => console.log(JSON.parse(info.xhr.responseText)));```
[21:46:18] But you can get the body this way
[21:46:27] how cursed, thanks jquery
[21:47:09] what the actual fuck is going on with JavaScript.
[21:47:19] This is why PHP is better.
[21:47:24] Really odd it fails with `http` though, mw.Api would use the error code there
[21:47:54] y'all have `meowmeow\homosexuals::class` be valid even if `meowmeow\homosexuals` is not an existing class
[21:48:01] Though api.php always returns a 200 so that's probably the reason
[21:48:44] [1/11] Yeah, i'm purposefully doing
[21:48:44] [2/11] ```
[21:48:45] [3/11] if ( !$result->isOK() ) {
[21:48:45] [4/11]   $error = Message::newFromSpecifier( $result->getMessages( 'error' )[0] )->inLanguage( 'en' )->plain();
[21:48:45] [5/11]   return $this->getResponseFactory()->createHttpError(
[21:48:45] [6/11]     400,
[21:48:46] [7/11]     [ 'error' => $error ]
[21:48:46] [8/11]   );
[21:48:46] [9/11] }
[21:48:47] [10/11] ```
[21:48:47] [11/11] So I want to be able to grab that error but clearly it's not that easy using mw.Rest()
[21:48:47] at least this way you can do stuff like `class_exists( Html::class )`
[21:48:56] (even though nobody ever does it for some reason)
[21:48:58] okay, true
[21:49:00] Does rest.php have a standardized way of returning error codes, anyways?
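A minimal sketch of the workaround from the snippets above, for pulling the JSON error body out of a failed mw.Rest call. It assumes, as those snippets show, that the rejection handler receives `(code, info)` and that `info.xhr` is the jqXHR object. `restErrorFromRejection`, `path` and `data` are hypothetical names, not part of MediaWiki. As far as I can tell, the reason `await` only yields `"http"` is that a native promise adopting a jQuery thenable keeps just the first rejection argument, so the second one never reaches a `try`/`catch`.

```javascript
// Hypothetical helper, not a MediaWiki API.
// Assumption: `info.xhr` exposes `status` and `responseText` like a jqXHR.
function restErrorFromRejection(code, info) {
  const xhr = info && info.xhr;
  const result = { code: code, status: xhr ? xhr.status : null, body: null };
  if (xhr && typeof xhr.responseText === 'string') {
    try {
      // e.g. { error: '...', httpCode: 400, httpReason: 'Bad Request' }
      result.body = JSON.parse(xhr.responseText);
    } catch (e) {
      // Non-JSON body (for example an HTML gateway error): leave body as null.
    }
  }
  return result;
}

// Usage on a wiki page (commented out because `mw` only exists in the browser):
// new mw.Rest().post(path, data)
//   .then(response => console.log(response))
//   .catch((code, info) => {
//     const err = restErrorFromRejection(code, info);
//     console.log(err.status, err.body && err.body.error);
//   });
```

Chaining with `.then`/`.catch` on the jQuery promise keeps both arguments, which is why the helper is meant to be called there and not from a `try`/`catch` around `await`.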
[21:49:04] okay, true T_T
[21:49:43] Yes either ->createHttpError where you put the status code and your own body (which I used), or ->createLocalizedHttpError which is even worse and returns the error message in every language it's defined in
[21:49:46] dumb as fuck
[21:49:51] y'all have `false` cast to `''` and `true` cast to `'1'`
[21:50:13] php is just so fun i love it
[21:50:15] can we cast yaron to hell
[21:50:24] coming back to php after using something else is such a breath of fresh air
[21:51:02] What's the deal with Yaron anyways I always just hear hate for the guy lol
[21:51:18] I mean sure Extension:Widgets is a horrible idea but
[21:51:24] the insecure code he writes oh my god
[21:51:32] he doesn't even know what a csrf is
[21:51:34] Every code he's touched is littered with security bugs
[21:51:45] i sure hope cargo is a very secure extension with no issues at all
[21:51:54] we need somerandomdeveloper to track him down and teach him how to use git patches
[21:51:54] Security issue: the extension
[21:51:58] i also hope he doesn't try to steal patches from people
[21:52:08] Oh I've heard that one
[21:52:15] ma'am, i was trying to figure out why Special:CargoQuery requires javascript
[21:52:18] instead i stumbled upon xss
[21:52:20] datatables
[21:52:46] and they're ugly as fuck
[21:53:01] foundation-paid trip to kick some ass frfr
[21:53:22] I feel like he knows, he just refuses to
[21:53:28] just as he doesn't like to prefix security commits with `SECURITY:`
[21:53:42] ughhhhhhhhhhhhhhhhhhhhh
[21:54:40] [1/3] he just doesn't listen
[21:54:41] [2/3] https://cdn.discordapp.com/attachments/1006789349498699827/1416180238622261308/image.png?ex=68c5e820&is=68c496a0&hm=f9e4ac44efda4bbdd36704e7f906d7860a62c75696da4a60aa5a8f4d55090f4e&
[21:54:41] [3/3] https://cdn.discordapp.com/attachments/1006789349498699827/1416180239268319313/image.png?ex=68c5e820&is=68c496a0&hm=653e7ab8bb528a1b2ad4fe7ebfc75d0c35b2ffd925798a19e107866945711a87&
[21:55:11] he's like my parents
frfr
[21:55:29] rip
[21:55:31] [1/2] I saw this just today
[21:55:31] [2/2] https://cdn.discordapp.com/attachments/1006789349498699827/1416180451441115307/image.png?ex=68c5e852&is=68c496d2&hm=304bd8a407a4e30c3efefe3667d05f546fd984f31d75be1660f5db80152a747c&
[21:56:01] oh yeah true lol
[21:56:17] even if i personally am not a fan of branching per mw version, i'll still cherry everything
[21:56:17] it took him a bit less than 24 hours to create a tag after he fixed an SQLI
[21:56:48] same but understandable how that might be a bit too much for some
[21:58:06] [1/2] What I do for RobloxAPI is to create a branch for each minor/major release, and when I release a new bugfix patch, I backport it to the branch, so people can pin the repo to minor versions while still receiving bug or security fixes
[21:58:06] [2/2] https://cdn.discordapp.com/attachments/1006789349498699827/1416181102325927967/image.png?ex=68c5e8ee&is=68c4976e&hm=604419404849cfb3e842ac406bdb51cbb64e0283ed31fda6fa7cab0fffa4bc5c&
[21:58:19] this is why trans people are powerful: we have to reject societal norms to be happy; therefore, we can handle a little branching /hj
[21:58:46] insane trans lore
[21:59:14] RAAAAAGH WHICH ONE IS COMPATIBLE WITH MY MEDIAWIKI 1.29 THAT I CAN'T BE BOTHERED TO UPGRADE
[21:59:22] idk check rel notes
[21:59:31] WDYM I HAVE TO READ??????
[21:59:59] sorry, how do i use robloxapi on my mediawiki 1.9 installation?
[22:00:18] that shouldn't matter for RobloxAPI because it always is compatible with all supported MW versions (except for 1.39; I will keep compat with 1.43 until it is EOL though) and there is no reason to clone an older version branch for a new installation
[22:00:39] https://cdn.discordapp.com/attachments/867365805674201091/1329787538704961628/watermark.gif
[22:00:52] why do you have this on the ready lmfao
[22:01:06] because I send it like once every 2 weeks
[22:01:28] and that is a real screenshot from a production wiki btw
[22:01:33] oh god no
[22:01:43] oh lord
[22:01:55] yaron invited me to be on his wiki podcast
[22:01:59] that happens when you clone the repo and don't rename the folder to RobloxAPI
[22:02:00] swiftly ignored that
[22:02:09] would've been funny tbh
[22:02:18] I listened to his podcast recently
[22:02:20] for like 2 mins
[22:02:24] god, the amount of self-constraint i need to avoid being confrontational
[22:02:51] my favorite combination of people, yaron + the bluespice maintainers
[22:02:54] I gotta listen to that tomorrow
[22:03:25] maybe we should move to #offtopic btw
[22:03:27] given how much more expressive i've been recently, i don't think i have much self-constraint lol
[22:54:06] Oh good