[03:42:16] What's his name on Telegram? (re @tehreedy: I believe Daniel is)
[07:39:16] Daniel (re @cvictorovich: What's his name on Telegram?)
[07:40:56] @Daniel Shall we talk about parser development?
[07:42:06] I cannot mention him here? (re @siebrand: Daniel)
[07:43:55] Yes, I'd like to talk with him about parser development.
[07:45:12] It's something I would need his assistance with
[07:55:49] @djhartman DJ Hartman told me he can help
[10:34:18] If he's available, feel free to ping me
[11:14:21] 👋 (re @cvictorovich: I cannot mention him here?)
[11:15:23] Good evening (re @Daniel: 👋)
[11:15:38] What’s still missing in Parsoid?
[11:16:03] The ultimate goal for it is to replace the old parser (re @cvictorovich: What’s still missing in Parsoid?)
[11:18:55] There should be some information about this at https://m.mediawiki.org/wiki/Parsoid/Parser_Unification
[11:19:16] Or, what needs the biggest overhaul in MW core?
[11:19:35] Yes, that's the idea.
[11:19:36] The answer to your question is a bit complicated. Parsoid can generate output for everything that the old parser supports.
[11:19:55] I chose restructuring part of MW as my academic project
[11:20:14] But for a lot of things, like parser tags and template substitution, it actually uses the old parser.
[11:20:21] But I cannot revamp it from scratch
[11:20:46] So in order to fully retire the old parser, all the extensions that supply parser tags and such would have to be changed.
[11:21:35] Ah, no you can't... Finding a suitable piece for you to work on depends on how much time you have, and what your level of experience is (re @cvictorovich: I chose restructuring part of MW as my academic project)
[11:22:53] Can you send me an email to dkinzler@wikimedia.org with some more background? Then I can discuss it with some colleagues.
[11:22:57] Again, the current code base is too clunky everywhere!
[11:23:16] Yea, tell me about it... (re @cvictorovich: Again, the current code base is too clunky everywhere!)
[11:23:42] Look at PageTriage and FlaggedRevs
[11:23:58] Incompatibility is severe
[11:24:53] PageTriage is mostly unused I think. Amir recently fixed the worst parts of FlaggedRevs, but it's really broken by design.
[11:26:22] They’re the best extensions (re @Daniel: PageTriage is mostly unused I think. Amir recently fixed the worst parts of FlaggedRevs, but it's really broken by design.)
[11:26:50] Best in what way?
[11:26:51] But not their designs (re @cvictorovich: They’re the best extensions)
[11:27:21] Features (re @Daniel: Best in what way?)
[11:28:28] That would be good; I need to compose an email (re @Daniel: Can you send me an email to dkinzler@wikimedia.org with some more background? Then I can discuss it with some colleagues.)
[11:29:52] To be more precise: they're the most useful tools for large projects (re @cvictorovich: Features)
[11:32:46] It would seem that way, but experience really doesn't show them to be very useful in practice. (re @cvictorovich: To be more precise: they're the most useful tools for large projects)
[11:33:03] But if you are interested in improving FlaggedRevs, I have a couple of ideas
[11:33:30] Send me that email, I'll have a closer look.
[11:35:58] Which one is more valuable, parser or FR?
[11:38:08] I think there is a lot of value where they meet. E.g. FlaggedRevs doesn't support Parsoid output at the moment. Fixing that in a good way requires a good deal of redesign.
[11:39:33] That seems like a nice project. It's hard to predict the traps you will find along the way, though. So it's hard to know how long it would take, or how far you would get in a certain period of time.
[11:39:48] How much time do you have? Weeks? Months? Years?
[11:39:54] Months
[11:40:09] Full time? Or as a side project?
[11:40:21] Have you contributed to MW code before?
[11:40:22] Nearly full-time
[11:40:34] Yes: HotCat (re @Daniel: Have you contributed to MW code before?)
[11:41:01] Ah nice! That is a very useful tool!
[11:41:02] (See the HotCat page on mediawiki.org)
[11:41:13] It's all client side JS though, right?
[11:41:31] But TBH it also needs major rework as an extension
[11:43:25] https://gerrit.wikimedia.org/g/mediawiki/extensions/HotCat
[11:43:41] My uploads are in this repo
[11:49:59] In my mind I'm already having you redesign ParserCache 😁
[11:50:02] I'll have a look at your mail and then we'll see. The big question really is what level of support we can commit to as a team.
[11:50:53] Afaik we have a round of internships coming up, so we may be spread thin wrt time for code reviews and discussing design.
[11:51:08] But we'll figure something out
[12:12:48] Email sent
[13:50:30] How is PageTriage unused? The open letter asking the WMF for better support of it had 400+ signatories a year ago. (re @Daniel: PageTriage is mostly unused I think. Amir recently fixed the worst parts of FlaggedRevs, but it's really broken by design.)
[13:56:12] The codebase of FlaggedRevs is very old and could use improvement in many ways. I don't think it's particularly problematic at the design level. It does some things that could probably be done more cleanly if MediaWiki core had a proper extension point for them (e.g. splitting the parser cache by the permissions of the reader). (re @Daniel: PageTriage is mostly unused I think. Amir recently fixed the worst parts of FlaggedRevs, but it's really broken by design.)
[13:57:03] if I'm understanding the wikimedia config correctly, the only non-test wiki that has it enabled is enwiki (re @gtisza: How is PageTriage unused? The open letter asking the WMF for better support of it had 400+ signatories a year ago.)
[13:58:26] It's used by enwiki, yes. There are other wikis which would like to use it but the extension would need significant changes for that.
[13:59:18] Enwiki uses it heavily and it's our biggest wiki so it's technically heavily used
[13:59:50] I disagree that FR is good at the design level; it needs a lot of rethinking (re @gtisza: The codebase of FlaggedRevs is very old and could use improvement in many ways. I don't think it's particularly problematic at t...)
[13:59:58] It also has a Codex front end and we did some refactoring of the backend earlier this year. It’s in decent shape imo
[14:00:41] Its design could use some love 😞 (re @kostajh: It also has a Codex front end and we did some refactoring of the backend earlier this year. It’s in decent shape imo)
[14:07:15] Daniel, did you see my email?
[14:08:18] Then which one needs the most rework?
[14:09:19] It's the weekend, please give him a break
[14:09:41] Sure, but I must know what to work on first
[14:10:04] He will respond when he can
[14:10:05] I never demand someone work regardless of time
[14:10:07] it just depends how you define "used", I don't think daniel was being inaccurate by saying an extension that isn't used by most wikis is "mostly unused", even if you can also say it's heavily used by the one place that does use it
[14:10:47] Your ping is an implicit demand (re @cvictorovich: I never demand someone work regardless of time)
[14:11:00] Even if it’s only used on en, it’s still very very heavy usage (re @Nikki: it just depends how you define "used", I don't think daniel was being inaccurate by saying an extension that isn't used by most ...)
[14:12:00] He asked me to drop him an email; I just need to know he received it. I don’t demand immediate details on this issue (re @Ladsgroup: Your ping is an implicit demand)
[14:13:43] Besides, details can hardly be given without due consideration
[14:16:44] It's very inaccurate. ~20% of new article creations happen on English Wikipedia. An extension that's used to review 20% of new pages is not "mostly unused". (re @Nikki: it just depends how you define "used", I don't think daniel was being inaccurate by saying an extension that isn't used by most ...)
[14:18:05] If we did the math of what fraction of reader pageviews end up being affected by PageTriage, it would probably be an even larger number.
[14:30:00] Even at the moment I'm making use of PT
[14:30:20] It's much better than RC
[14:34:26] as I said, it depends how you define "used". you wanted to know how it's mostly unused, and I explained how it can be considered mostly unused (99% of wikimedia wikis don't use it). yes, if you define it in a different way, it's also highly used, but both interpretations are true and I don't think it's productive to argue about which is the "right" meaning, it would be better to suggest a clearer way of wording it (re @gtisza: It's very inaccurate. ~20% of new article creations happen on English Wikipedia. An extension that's used to review 20% of new p...)
[14:36:10] I would say this tool can be deployed elsewhere
[14:48:51] You can define "mostly unused" to mean "mostly used", and then you'd say very different things about what is "mostly unused", sure. Some definitions are more useful than others though. Treating an extension being used at English Wikipedia and being used at Afar Wikipedia as equally important is an unhelpful way of thinking that was way too common in the past and led the WMF to make some very bad prioritization decisions (which it has been fortunately moving away from for a while).
[14:48:54] Some 20-30% of new content creation and of content consumption happens on English Wikipedia. Enwiki content curation workflows are very central to the Wikimedia mission. Extensions that the enwiki community feels are very important to enwiki content curation workflows are mission critical. (re @Nikki: as I said, it depends how you define "used". you wanted to know how it's mostly unused, and I explained how it can be considered...)
[14:53:26] Well, if it isn't used on 80% of new article creations, then at least it is not used in the majority of cases. Perhaps "majority" and "mostly" have slightly different meanings, but I think they are mostly synonymous. (re @gtisza: It's very inaccurate. ~20% of new article creations happen on English Wikipedia. An extension that's used to review 20% of new p...)
[15:03:42] "Mostly unused" has the implication that if something bad happens to it, it's not a huge problem. DPL is arguably mostly unused. Flow is mostly unused. PageTriage isn't mostly unused.
[15:03:44] 20% of new articles would be significant even if the importance of new articles were evenly distributed, but it isn't. (For one thing, new articles on English Wikipedia are going to attract massively more pageviews than new articles on a random wiki. For another, new articles on enwiki typically represent extending coverage to topics which haven't been covered by the Wikimedia community before. New articles on other wikis often don't. That means both lower impact (there is added value in being able to read information written by a native speaker instead of using Google Translate, but not that much added value) and lower effort (a lot of new content is fully or partially translated from larger wikis or generated from structured data that was parsed from larger wikis).)
[15:53:20] Are there available stats on where content first pops up? From a personal point of view, it seems like most articles on Swedish Wikipedia about Swedish related content originate there. If I remember correctly, there are over 20 million unique articles across all language versions, and enwiki covers only about one third of them so my anecdote seems to be supported by stats (albeit in some roundabout way).
[15:53:24] From another aspect, it would be interesting to see how many users use this tool. If it is like any other power tool, then only a small proportion of the users on enwiki use it. (re @gtisza: "Mostly unused" has the implication that if something bad happens to it, it's not a huge problem. DPL is arguably mostly unused....)
[16:18:53] I still don't think it's productive to argue about which meaning is the "correct" one. I answered your question, if you don't like the answer, that's not something I can help with. please argue with someone else, I'm not interested (re @gtisza: You can define "mostly unused" to mean "mostly used", and then you'd say very different things about what is "mostly unused", su...)
[16:30:17] I'm not aware of such stats. You could probably look at where Wikidata properties are imported from most often. (My recollection of the overlap statistics was that enwiki covers about half but I might be misremembering.) I'd guess that most "local" content is first created on the wiki of that country/language (for languages which have a mid-sized wiki), but local content is a fairly small part of any given wiki.
[16:30:21] There's a dedicated userright for PageTriage, some 700-800 users have it. That's about 1% of users I think. But that again is not a very useful way of measuring importance - I don't think we should consider e.g. Checkuser mostly unused, even though there are far fewer checkuser actions than e.g. edits. (re @Jan_ainali: Are there available stats on where content first pops up? From a personal point of view, it seems like most articles on Swedish ...)
[16:32:38] I think checkuser is a good comparison as it is probably used by 100% of the users who have access to it, clearly illustrating the difference of what good usage stats are. (re @gtisza: I'm not aware of such stats. You could probably look at where Wikidata properties are imported from most often. (My recollection...)
[16:35:22] (There are 250M Wikipedia pages in total. So 90-95% of Wikipedia articles recreate content that already exists in another language. There isn't any way to measure which edits add original content vs. content that already exists elsewhere, but the largest few wikis would probably dominate such a metric even more than they dominate in edits generally.)
[16:35:45] My source for the 20M: https://commons.wikimedia.org/wiki/File:Wikidata_Day_NYC_2022_-_Wikifunctions_and_Abstract_Wikipedia.pdf?page=14
[16:45:14] That's the consequence of functionary inactivity policies, a very arbitrary choice that doesn't really say anything about the extension's importance. The new pages patroller right does have an inactivity policy, I don't know how much it's enforced. But while you'd of course want to know how many active users an extension has, I don't think it makes sense in general to say that an extension that's accessible by 100 users and all 100 use it is more important than an extension that's accessible by 1000 users but only 100 use it. The latter might indicate a UX problem (or might not; everyone can use the template sandbox for example but most people don't, that's totally fine). But that's a different issue than how much we are relying on the extension for our content curation or distribution workflows. (re @Jan_ainali: I think checkuser is a good comparison as it is probably used by 100% of the users who have access to it, clearly illustrating t...)
[18:19:16] templatesandbox not being used as much is actually a UX problem: it only works on preview for the main template or on template sandboxes in userspace, while most template development is done in subpages
[18:59:42] What is hemostasis
[19:17:04] [[w:en:Hemostasis]]? (re @Debeli Birhanu: What is hemostasis)