[07:38:20] hello, what's the best way to import the wikipedia xml dump? I don't have internet access (only via my VPSes)...
[07:44:19] oh, well, I guess I'll look through the local docs :/
[11:49:37] synfin: MW has an "import dump" maintenance script
[11:49:44] Reedy: it's slow, though
[11:49:46] like, really slow
[11:49:49] I've tried
[11:50:03] Are you importing a big dump?
[11:50:46] And if it's slow... there can be many reasons other than the script specifically
[11:51:21] yeah, wikipedia dump
[11:51:29] it was the PHP import script, last I remember
[11:52:45] Based on what?
[11:53:06] I don't know, but it was like 160GB or so
[11:53:17] it was massive, really
[11:53:26] *the dump
[11:53:34] Wikipedia's are big
[11:53:39] And that was probably compressed too
[11:53:44] yeah, it was
[11:53:47] I had to uncompress it
[11:54:06] pretty sure I downloaded it at the start of 2020
[11:54:31] but then again, think I got rid of it after having storage space issues
[13:07:44] synfin: 160GB dump will take weeks
[13:08:09] I think there's a flag to slightly improve speed
[13:08:43] Try with --no-updates
[13:08:57] It should give you progress
[13:09:03] With --report
[13:09:37] You're talking over a billion edits for a history dump though
[14:43:09] Change on meta.wikimedia.org a page Tech was modified, changed by 80.153.62.175 link https://meta.wikimedia.org/w/index.php?diff=21870469 edit summary: [-258]
[14:48:59] Change on meta.wikimedia.org a page Tech was modified, changed by MdsShakil link https://meta.wikimedia.org/w/index.php?diff=21870495 edit summary: [+258] Reverted to revision 21860954 by ArchiverBot; rv ([[:bn:ব্যবহারকারী:Al Riaz Uddin Ripon/পুনরুদ্ধারকারী|revert]])
[15:04:39] do not import using the mediawiki import script
[15:04:52] find something that converts (in pieces) your xml to sql for import
[15:05:10] then import those directly. it's the only chance you have of getting it done in a reasonable length of time
[16:07:11] https://meta.wikimedia.org/wiki/Data_dumps/Tools_for_importing
[16:08:49] (though all the tools there are too old to work)
[18:32:43] Reedy: we need a namespaceDupes.php run on wikimania wiki https://wikimania.wikimedia.org/w/index.php?title=Talk:2021:Submissions&oldid=95544
[18:33:27] when was the 2021_talk: namespace created? :o
[18:46:11] It was executed: https://sal.toolforge.org/log/_e5s5nkB1jz_IcWuOpCh & https://sal.toolforge.org/log/qkhs5nkB8Fs0LHO5wIPP (see T284442). Why did it not work?
[18:46:11] T284442: Retrieve existing pages created for 2021 Wikimania before creation of 2021 namespace - https://phabricator.wikimedia.org/T284442
[19:12:45] Nemo_bis: unless someone has run it since you asked...
[19:12:45] 0 pages to fix, 0 were resolvable.
[19:12:45] 0 links to fix, 0 were resolvable, 0 were deleted.
[19:15:32] Thanks. Hmm. Maybe having 2 illegal prefixes is too much?
[19:15:56] 2021-08-11 19:15:15 [6478dd24-0750-490a-89fe-6e1dbad6d6fa] mw2380 wikimaniawiki 1.37.0-wmf.17 exception ERROR: [6478dd24-0750-490a-89fe-6e1dbad6d6fa] /w/index.php?title=Talk:2021:Submissions&oldid=95544 Wikimedia\Assert\PostconditionException: Postcondition failed: makeTitleSafe() should always return a Title for the text returned by getRootText(). {"exception_url":"/w/index.php?title=Talk:2021:Submissions&oldid=95544","reqId":"6478dd24-0750-490a
[19:15:56] -89fe-6e1dbad6d6fa","caught_by":"entrypoint"}
[19:16:04] * Reedy dumps the stacktrace in a bug
[19:17:43] thanks
[19:17:57] https://phabricator.wikimedia.org/T288648
[19:18:01] Tagged platform and marked as high for now...
[19:18:17] https://wikimania.wikimedia.org/w/index.php?title=Talk:2021:Submissions&action=info
[19:18:20] >The requested page title refers to a talk page that can not exist.
[19:18:29] https://wikimania.wikimedia.org/w/index.php?title=2021_talk:Submissions&action=info
[19:18:45] Are there pages actually inaccessible? Or just some weird links?
[19:19:48] The original revision of that page is not accessible
[19:19:59] I got the broken link from https://en.wikipedia.org/wiki/Special:Contributions/Nemo_bis
[19:20:24] https://web.archive.org/web/20210602233345/https://wikimania.wikimedia.org/wiki/Talk:2021:Submissions
[19:20:50] lol
[19:20:56] Bad error handling for sure
[19:21:04] Sorry I mean https://wikimania.wikimedia.org/wiki/Special:Contributions/Nemo_bis of course
[19:24:45] How did that even get created? :P
[19:30:20] Probably by clicking the talk page link? :) https://www.mediawiki.org/wiki/Talk:1234:Foo is a totally valid talk page just like https://en.wikipedia.org/wiki/Talk:2001:_A_Space_Odyssey
[19:31:15] (Or am I missing something? I'm getting confused about the chain of events.)
[23:33:00] I just wanted to ask if anyone can and wants to review this patch: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/709787
[23:33:46] It's about Serbian Latin and Ijekavian magic words and special page aliases
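
Editor's note on the dump-import advice earlier in the log (15:04 to 15:05, "converts (in pieces) your xml"): the "in pieces" part could be sketched roughly as below. This is only an illustration of streaming a MediaWiki XML export and grouping pages into fixed-size batches so that memory use stays flat. It does not attempt the XML-to-SQL conversion itself, which depends on the schema of the target MediaWiki version. The names iter_pages, batched and SAMPLE are hypothetical and come from this note, not from any tool mentioned in the channel; SAMPLE is a tiny hand-written stand-in for a real dump.

```python
import io
import xml.etree.ElementTree as ET
from typing import IO, Iterable, Iterator, List, Tuple

# Tiny hand-written stand-in for a real dump (assumption: export-0.10 layout).
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>A</title><revision><id>1</id></revision><revision><id>2</id></revision></page>
  <page><title>B</title><revision><id>3</id></revision></page>
  <page><title>C</title><revision><id>4</id></revision></page>
</mediawiki>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source: IO[bytes]) -> Iterator[Tuple[str, int]]:
    """Yield (title, revision_count) for each <page>, one page at a time."""
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event != "end" or local_name(elem.tag) != "page":
            continue
        title = ""
        revisions = 0
        for child in elem:
            name = local_name(child.tag)
            if name == "title":
                title = child.text or ""
            elif name == "revision":
                revisions += 1
        yield title, revisions
        root.clear()  # drop already-processed children so memory does not grow


def batched(items: Iterable[Tuple[str, int]], size: int) -> Iterator[List[Tuple[str, int]]]:
    """Group items into lists of at most `size` elements."""
    batch: List[Tuple[str, int]] = []
    for item in items:
        batch.append(item)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:
        yield batch


if __name__ == "__main__":
    pages = iter_pages(io.BytesIO(SAMPLE.encode("utf-8")))
    for number, batch in enumerate(batched(pages, 2), start=1):
        # A real converter would write one SQL file per batch here.
        print(f"piece {number}: {batch}")
```

For a real dump the source would be an opened file rather than the in-memory sample, and each batch would be handed to whatever converter produces the SQL for direct import.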