[10:45:16] In the context of my query on the Unicode normalization of the Arabic script, I found out that Unicode recommends the use of NFD instead of NFC for Arabic.
[10:45:53] https://www.unicode.org/reports/tr53/tr53-3.html
[10:48:27] although that page says it's a draft and doesn't imply endorsement by unicode
[10:50:51] Yes, I know this. However, when I asked people working on Arabic NLP, they redirected me to this page. (re @Nikki: although that page says it's a draft and doesn't imply endorsement by unicode)
[10:51:47] I asked people from QCRI (Qatar), New York University Abu Dhabi (UAE) and University of Jordan (Jordan).
[10:54:07] I'm not saying there's anything wrong with it, I'm just saying "Unicode recommends" doesn't seem to be true
[10:54:25] +1 (re @Nikki: I'm not saying there's anything wrong with it, I'm just saying "Unicode recommends" doesn't seem to be true)
[10:55:22] That is quite true. But this is what Arabic NLP specialists have been saying. It is absolutely wrong, but this is just what they are using.
[18:39:53] Z10222 (https://notwikilambda.toolforge.org/wiki/Z10222) lacks a Z10222K1, so I can't actually run this function at all, and attempts to rectify this are failing. How can I edit the ZObject JSON manually?
[18:42:46] Even attempting an edit using the action API is failing with "Direct editing via API is not supported for content model zobject used by Z10222." smh
[21:25:20] Newsletter Nr 53: What license should we use? - https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-11-22
[21:47:55] Gut reaction is that output should not be CC BY-SA, but will have to think about it a bit before writing it on the talk page. :)
[21:48:24] yeah, I was expecting cc0
[21:49:11] Who should be attributed for it? If I write something and can only check it in Swedish and English, should I be attributed for output in all languages?
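Returning to the normalization question raised at 10:45: a minimal sketch, using only Python's standard unicodedata module, of what canonical normalization does to Arabic marks. The sample strings are chosen here for illustration and do not come from the linked report. It shows that both NFC and NFD apply the same canonical reordering of combining marks (by combining class), and that the two forms differ only in whether precomposed letters are kept.

```python
import unicodedata

BEH, FATHA, SHADDA = "\u0628", "\u064E", "\u0651"

# Illustrative input: shadda typed before fatha on the letter beh.
typed = BEH + SHADDA + FATHA

# Canonical combining classes decide the normalized order of marks.
print(unicodedata.combining(FATHA), unicodedata.combining(SHADDA))

nfc = unicodedata.normalize("NFC", typed)
nfd = unicodedata.normalize("NFD", typed)

# Both forms reorder the marks, so fatha now precedes shadda.
print([f"U+{ord(c):04X}" for c in nfc])
print(nfc == nfd)

# The forms differ for precomposed letters such as alef with madda above.
ALEF_MADDA = "\u0622"
decomposed = unicodedata.normalize("NFD", ALEF_MADDA)
print([f"U+{ord(c):04X}" for c in decomposed])
print(unicodedata.normalize("NFC", decomposed) == ALEF_MADDA)
```

Whether the reordered sequence is acceptable for a given Arabic NLP pipeline is a separate question that this sketch does not settle.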
[21:56:54] and I'm not keen on any license for the functions that has requirements for using it
[21:57:05] Make one edit in the renderer of a language (fix a typo in a comment perhaps) and get attributed in all output of that language?
[21:58:45] agreed (re @Jan_ainali: Gut reaction is that output should not be CC BY-SA, but will have to think about it a bit before writing it on the talk page. :))
[22:19:48] The abstract content would be the primary content licensed under CC BY SA. It would be the authors of the abstract content that are attributed.
[22:20:33] In the same way, if someone were taking PD abstract content, the output would be PD too. Or proprietary content, the output would be proprietary.
[22:21:07] If you run a copyrighted picture through a Photoshop filter, the filter doesn't get copyright on the output.
[22:33:19] And if I run an image filter over copyrighted content, in a way that makes the picture essentially unrecognizable from the original, who gets the copyright of the new image? : https://tools-static.wmflabs.org/bridgebot/236889bc/file_9411.jpg
[22:33:38] And if I run some image filters over copyrighted content, in a way that makes the picture essentially unrecognizable from the original, who gets the copyright of the new image? : https://tools-static.wmflabs.org/bridgebot/236889bc/file_9411.jpg
[22:38:34] I found that the problem of the double diacritics for Arabic languages is old.
[22:38:37] https://phabricator.wikimedia.org/T23429
[22:38:38] Uh, that's fascinating! I'll take a closer look to see what bugs to file about this, thanks (re @mahir256: Z10222 (https://notwikilambda.toolforge.org/wiki/Z10222) lacks a Z10222K1, so I can't actually run this function at all, and attempts to rectify this are failing. How can I edit the ZObject JSON manually?)
[22:39:43] But it's not unrecognizable. It's a very tightly guided text generation, similar to a translation, no? (re @Jan_ainali: )
[22:40:54] Do you want me to turn the crank to max to make it unrecognizable? :D
[22:45:42] Original: https://meta.wikimedia.org/wiki/File:Wikifunction_architecture_for_text_generation.png (7 filters applied) : https://tools-static.wmflabs.org/bridgebot/5966a7f7/file_9412.jpg
[22:52:35] And sure, in the example of
[22:52:37] Superlative(
[22:52:38] subject: Jupiter,
[22:52:40] quality: large,
[22:52:41] class: planet,
[22:52:43] location constraint: Solar System)
[22:52:44] you can actually recognize some letters if you render it in Croatian. But if you render it in a language with a different script it would be unrecognizable.
[22:52:46] Sure, if you know how the "filters" work, it's perhaps deducible, but in theory, there must be examples of different Abstract content that produce the same output in some languages.
[22:53:13] :D I meant, the goal of the renderer functions is not to make it an unrecognizable translation, but to be rather faithful to the abstract content
[22:53:23] (I'm not really sure what I'm arguing for or against here :) )
[22:55:45] But that's like saying by translating Adichie's books into Chinese she wouldn't have a copyright in the result anymore because it's unrecognizable for me or you, no?
[22:57:29] I think my point is that if it is truly abstract content, there cannot be any literary copyright in it, but only the idea of a concept, which cannot be copyrighted
[22:58:04] is this semi-abstract Wikipedia? ;)
[22:59:18] Ah, so the question is whether the abstract content can be copyrighted at all?
[22:59:53] Yes, but I arrived there from the output before realizing this was my question!
[23:01:52] if code is copyrightable then surely abstract content is copyrightable as well
[23:03:07] Yes, the form of the abstract content, but not the content of it! (Our terms are getting confusing now)
[23:08:41] An idea cannot be copyrighted, but if you express that idea in any language, the form you captured it in can be copyrighted.
[23:08:41] But since the idea here is to capture the idea in an abstract way, without language, one cannot claim copyright on whatever forms get generated from the idea as such.
[23:11:31] We are not generating output by following the form of the Abstract content, but from the idea it represents. At least that's the hypothesis I put forward.
[23:16:03] First example of it, I hope these two different forms of Abstract content would generate the same output, i.e. the exact form is not important:
[23:16:04] 1.
[23:16:05] Superlative(
[23:16:07] subject: Jupiter,
[23:16:08] quality: large,
[23:16:10] class: planet,
[23:16:11] location constraint: Solar System)
[23:16:13] 2.
[23:16:14] Superlative(
[23:16:16] subject: Jupiter,
[23:16:17] location constraint: Solar System,
[23:16:19] class: planet,
[23:16:20] quality: large)
[23:17:36] In Ninai, if you permute the arguments to an "Action()" constructor -- see https://gitlab.com/mahir256/ninai/-/blob/main/demonstrations.md for examples -- the output will be the same (re @Jan_ainali: First example of it, I hope these two different forms of Abstract content would generate the same output, i.e. the exact form is not important:
[23:17:37] 1.
[23:17:38] Superlative(
[23:17:40] subject: Jupiter,
[23:17:41] quality: large,
[23:17:43] class: planet,
[23:17:44] location constraint: Solar System)
[23:17:46] 2.
[23:17:47] Superlative(
[23:17:49] subject: Jupiter,
[23:17:50] location constraint: Solar System,
[23:17:52] class: planet,
[23:17:53] quality: large))
[23:18:13] (I see that I need to more aggressively promote the dissemination of concrete examples)
[23:21:49] As a matter of fact, with as few requirements for argument placement as possible, I'd like to see the inputs of *any* constructor be freely permutable
[23:25:37] on the talk page, I wonder if it would make sense to split the support recommendations by content category? I can already support CC0 for function signatures, but I’ll have to think more about the other categories
[23:42:15] I’ve started translating the page; should the “please comment on the talk page” translation have a stance on whether comments in languages other than English are welcomed, tolerated, discouraged?
[23:46:53] An idea cannot be copyrighted, but if you express that idea in any language, the form you captured it in can be copyrighted.
[23:46:55] But since the purpose here is to capture the idea in an abstract way, without language, one cannot claim copyright on whatever forms get generated from the idea as such.
[23:48:01] @lucaswerkmeister Re: comments in any language - yes. I've added a new string for that at the bottom of the main page.
[23:48:26] Re: talkpage headings - any tweaks you think will help, are welcome. :)
[23:57:45] Good idea! (re @lucaswerkmeister: on the talk page, I wonder if it would make sense to split the support recommendations by content category? I can already support CC0 for function signatures, but I’ll have to think more about the other categories)
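On the 23:16 to 23:21 exchange about permuted constructor arguments: a minimal sketch of how a renderer that reads named roles, not written order, gives identical output for both forms of the Superlative example. The Superlative class, the render_en function and the tiny lexicon are hypothetical stand-ins invented for this illustration; they are not the Wikifunctions or Ninai API.

```python
from dataclasses import dataclass


# Hypothetical stand-in for an abstract-content constructor.
@dataclass(frozen=True)
class Superlative:
    subject: str
    quality: str
    class_: str
    location_constraint: str


# Toy English lexicon, assumed for illustration only.
SUPERLATIVE_FORMS = {"large": "largest"}


def render_en(content: Superlative) -> str:
    """Render from the named roles, never from the order they were written in."""
    form = SUPERLATIVE_FORMS[content.quality]
    return (
        f"{content.subject} is the {form} {content.class_} "
        f"in the {content.location_constraint}."
    )


# Form 1 and form 2 from the chat, with the arguments permuted.
first = Superlative(subject="Jupiter", quality="large",
                    class_="planet", location_constraint="Solar System")
second = Superlative(subject="Jupiter", location_constraint="Solar System",
                     class_="planet", quality="large")

print(first == second)
print(render_en(first) == render_en(second))
print(render_en(first))
```

Under these assumptions the written order of the arguments leaves no trace in the generated sentence, which is the property the hypothesis at 23:11 relies on.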