[08:40:46] [[Tech]]; 147.8.16.66; /* Incorporation and automatic reflection */ new section; https://meta.wikimedia.org/w/index.php?diff=22404396&oldid=22404273&rcid=20817208
[08:40:57] [[Tech]]; NguoiDungKhongDinhDanh; Undid edits by [[Special:Contribs/147.8.16.66|147.8.16.66]] ([[User talk:147.8.16.66|talk]]) to last version by ArchiverBot; https://meta.wikimedia.org/w/index.php?diff=22404397&oldid=22404396&rcid=20817209
[08:41:08] [[Tech]]; NguoiDungKhongDinhDanh; Undid edits by [[Special:Contribs/NguoiDungKhongDinhDanh|NguoiDungKhongDinhDanh]] ([[User talk:NguoiDungKhongDinhDanh|talk]]) to last version by 147.8.16.66; https://meta.wikimedia.org/w/index.php?diff=22404398&oldid=22404397&rcid=20817210
[11:31:09] Could someone link me to a website where I can learn how MediaWiki's templates get generated, parsed and rendered? I'm planning to implement a parser for Wikipedia's citations in C++ (as a command-line tool that'd transpile a templated .html file to a generated .html file)
[12:12:21] fentanyl: not sure about a website as such, but a general MW parser is very complex
[12:13:20] implementations of an MW parser include:
[12:15:19] the original internal PHP parser in MediaWiki itself (https://github.com/wikimedia/mediawiki/tree/master/includes/parser), which is a bit Lovecraftian, so a replacement was made....
[12:15:25] https://www.mediawiki.org/wiki/Parsoid
[12:16:01] there is also an implementation in Python called mwparserfromhell (which might tell you what writing a parser feels like)
[12:17:06] there are some more here: https://www.mediawiki.org/wiki/Alternative_parsers
[12:58:55] has someone written a complete Rust one yet? :)
[12:59:56] l.egoktm probably has
[13:00:42] at least bindings for the Parsoid API: https://gitlab.com/mwbot-rs/parsoid
[13:49:36] the author of https://docs.rs/parse_wiki_text/latest/parse_wiki_text/ is seriously grumpy
[14:14:49] the git repository for that package also seems to have vanished from GitHub o_O
[14:15:06] parsing Parsoid HTML with a decent HTML5 parser is probably a better idea than parsing wikitext directly
[14:25:50] Emperor: C++ > Rust
[14:25:52] * fentanyl hides
[14:35:02] * Emperor removes fentanyl from their Christmas card list
[14:36:42] * fentanyl :'(
[16:52:56] [[Tech]]; 77.141.213.156; [none]; https://meta.wikimedia.org/w/index.php?diff=22405668&oldid=22404398&rcid=20820868
[16:53:24] [[Tech]]; Hasley; Reverted edits by [[Special:Contribs/77.141.213.156|77.141.213.156]] ([[User talk:77.141.213.156|talk]]) to last version by NguoiDungKhongDinhDanh: test edits, please use the sandbox; https://meta.wikimedia.org/w/index.php?diff=22405669&oldid=22405668&rcid=20820869
[18:23:44] indeed, these days there's not that much value in writing a parser, just grab your language's preferred HTML parser and use that plus Parsoid HTML
[18:24:05] the first rule of parsing wikitext is "don't parse wikitext"
[18:25:09] well said
[18:26:22] fentanyl: if you haven't found it yet, https://www.mediawiki.org/wiki/Specs/HTML/2.2.0/Extensions/Cite is how citations are marked up in Parsoid HTML.
[18:30:51] AntiComposite: https://bash.toolforge.org/quip/fBmQgX0Ba_6PSCT9HWJo
[20:29:57] legoktm[m]: huh, interesting.
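
To make the "Parsoid HTML plus your language's HTML parser" advice above concrete, here is a minimal sketch that fetches a page's Parsoid HTML from the REST API and pulls out the citation markup described in the Cite spec linked at 18:26. Python with requests and BeautifulSoup is assumed purely for brevity (the C++ plan would work the same way with any HTML5 parser), and the selectors assume the typeof="mw:Extension/ref" / typeof="mw:Extension/references" attributes that the spec documents.

    # Sketch: fetch Parsoid HTML via the REST API and extract Cite markup.
    # Assumes the `requests` and `beautifulsoup4` packages are installed.
    import requests
    from bs4 import BeautifulSoup

    resp = requests.get("https://en.wikipedia.org/api/rest_v1/page/html/El_Tatio")
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")

    # Inline citation markers ([1], [2], ...) emitted by the Cite extension.
    for ref in soup.select('[typeof~="mw:Extension/ref"]'):
        link = ref.find("a")
        print(ref.get_text(strip=True), "->", link["href"] if link else None)

    # The generated references list at the bottom of the page.
    for reflist in soup.select('[typeof~="mw:Extension/references"]'):
        for item in reflist.find_all("li"):
            print(item.get("id"), "|", item.get_text(" ", strip=True)[:80])
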
[20:31:51] legoktm[m]: yeah, but I've never written a parser, so that'd be a good exercise (?), but it seems like Parsoid does what I wanted to do (pass a wikitext file to a command-line tool that'd transpile it and spit out HTML, which I can later copy-pasta)
[20:32:54] well, I've written parsers, but those were mostly protocol parsers and state machines for specifications/RFCs, not source files...
[21:10:23] Ok, it seems like the citation is not quite right. I used the parser, but I wanted it to link things within '[[]]' to en.wikipedia.org, and when I hover over the citation, I wanted it to show a little pop-up box (like Wikipedia), and when I click it, it should take me to the bibliography section. So, it doesn't quite work as I expected. Am I missing something?
[21:10:33] See this: http://net0.sh3ll.ru/foo.html
[21:10:47] I have included two sample citations.
[21:11:58] (a) when I click that, it redirects me to 'Main_page' and (b) when I hover it, it doesn't have any pop-ups
[21:13:15] These are the available options for parse.php: https://paste.debian.net/plain/1221876 But I'm not sure which one gives the behaviour I was looking for
[21:14:23] you're missing the `<base>` tag that Parsoid uses to make relative URLs point to en.wp
[21:14:35] also the hovering popup box is implemented in JavaScript, not the parser
[21:14:53] https://www.mediawiki.org/wiki/Reference_Tooltips
[21:16:54] legoktm[m]: Hmm, not sure where to put that tag though.
[21:17:44] (also I thought the parser also generates JS)
[21:18:06] it goes in the head, see e.g. https://en.wikipedia.org/api/rest_v1/page/html/El_Tatio
[21:19:02] the parser controls some JS that gets loaded, but reference tooltips are separately loaded as a gadget
[21:22:57] legoktm[m]: Yeah, placing that takes me to the en wiki, but when I click the citations, that also takes me to en wiki (and not below). Am I doing something incorrect?
[21:23:50] well there are no reference tags on the main page...
[21:24:26] legoktm[m]: try clicking the [1] at http://net0.sh3ll.ru/foo.html
[21:24:46] I expect it to take me to the "Bibliography" section (below)
[21:25:07] this is my test file: https://termbin.com/xy3h And this is my output: https://termbin.com/ewbe
[21:59:36] seems like that's the effect of the base tag: https://stackoverflow.com/a/1889957
[22:02:37] Or, I should use a middleware server (same server locally, on a different port/vhost) that'd receive the URI, and if it finds a `#` it'd use the current website's URI, and if it doesn't find a `#` it'd redirect that request to en.wiki
[22:03:02] *middleware listener on the same server, I mean
[22:03:34] but this just seems like shenanigans
[22:04:25] probably a perl server with regex would do the job
[22:05:23] so that URL would point to that intermediary
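
As an alternative to both the `<base>` tag and the proposed redirect intermediary, the relative wiki links could be rewritten to absolute en.wikipedia.org URLs when the file is generated, leaving `#...` anchors untouched so citation clicks keep jumping to the local bibliography section. A rough sketch, assuming Parsoid's usual "./Title" relative hrefs and reusing Python/BeautifulSoup from the earlier example; foo.html and foo.out.html are placeholder filenames, not the actual files from the conversation.

    # Sketch: post-process the transpiled HTML so wiki links point at
    # en.wikipedia.org while in-page anchors (#cite_note-..., #Bibliography)
    # keep resolving against the local file.
    from bs4 import BeautifulSoup

    with open("foo.html", encoding="utf-8") as f:      # placeholder input file
        soup = BeautifulSoup(f, "html.parser")

    for a in soup.find_all("a", href=True):
        href = a["href"]
        if href.startswith("#"):
            continue                                   # in-page anchor: leave it local
        if href.startswith("./"):
            # Parsoid emits wiki links as "./Title"; make them absolute instead
            a["href"] = "https://en.wikipedia.org/wiki/" + href[2:]

    with open("foo.out.html", "w", encoding="utf-8") as f:
        f.write(str(soup))

This would sidestep the fragment-resolution behaviour of `<base>` described in the Stack Overflow answer without needing a separate server.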