[00:41:52] Need some help on choices of lexicographical IR ... the background is simply this:
[00:41:53] I have data from several lexicons of the Cajun language. Some of them map to French (Français). All of them map to English.
[00:41:54] > faire (v.t.) 1. to make. 2. to do. Il fait, ça fait + weather term. The weather is-----. Ça fait chaud. (The weather is hot.) Il fait beau. (The weather is nice.) Ça fait (du) soleil. (It's sunny.)
[00:41:55] Ça fait = It is, or It's, or it's a fact that
[00:41:57] This combines the similar pronunciation and potentially some of the Senses (but not always for Cajun, since it's kind of a sloppy loose slang French 😉 and my ancestral language) that would be found on both French lexemes:
[00:41:59] https://www.wikidata.org/wiki/Lexeme:L24349
[00:42:00] https://www.wikidata.org/wiki/Lexeme:L12842
[00:42:02] Rather than shoehorn the lexicons into Wikidata Lexemes...
[00:42:03] I'm thinking of just storing them in an IR or intermediate language, with the idea that Wikifunctions will eventually allow me to map through this IR.
[00:42:05] There are +'s and -'s (pros and cons) to that idea, as I'm well aware, but still... wondering what others think about how to externally store translation mappings when you may not want to use the Wikidata Lexeme namespace just yet.
[00:42:06] The only thing close to a decent IR that I have seen is using the MOLTO tools from Grammatical Framework?
[00:42:08] http://www.molto-project.eu/sites/default/files/MOLTO_D2.3.pdf#subsection.5.4
[00:42:09] I found "ça" in the RGL https://github.com/GrammaticalFramework/gf-rgl/search?q=%C3%A7a
[00:42:10] https://github.com/GrammaticalFramework/gf-rgl/blob/master/src/french/ExtraFre.gf#L60
[00:44:14] What issues do you envision encountering if you were to "shoehorn" information about Cajun French into lexemes? (re @thadguidry: Need some help on choices of lexicographical IR ...)
[00:45:12] The Senses... that's the secret sauce for Wikidata Lexemes no matter what, right? Without them, lexemes are pretty much useless for lots of the nicer use cases.
[00:45:38] Sorry, what about the senses is a problem? (re @thadguidry: The Senses... that's the secret sauce for Wikidata Lexemes no matter what, right? Without them, lexemes are pretty much useless for lots of the nicer use cases.)
[00:46:01] (Not sure what couldn't be added to the lexemes for "ça" and "faire")
[00:46:36] (or even "ça faire" if that is in fact an expression separately definable from its components or something)
[00:47:04] Even if a gloss is copyrighted, the meaning which that gloss is intended to signify surely isn't
[00:47:42] It's a Lexeme problem... not a Sense problem.
[00:47:42] Maps would always need to be at the Sense level.
[00:47:58] Right, and? (re @thadguidry: It's a Lexeme problem... not a Sense problem.
[00:47:59] Maps would always need to be at the Sense level.)
[00:48:26] I.e. I wouldn't map L24349 to some new Cajun language Lxxxxx
[00:48:59] I'd map the Senses. We discussed this as a best practice before, no?
[00:49:00] You wouldn't, or you can't? (you could certainly propose a new property if you can't) (re @thadguidry: I.e. I wouldn't map L24349 to some new Cajun language Lxxxxx)
[00:49:45] Yes, you certainly would. So why aren't you keen on mapping French senses to their Cajun equivalents? What drives you to consider something else? (re @thadguidry: I'd map the Senses. We discussed this as a best practice before, no?)
[00:50:20] as you can see... hardly any Senses. 😊
[00:51:01] ? these words certainly *mean* something, no? you'd make connections based on these meanings (re @thadguidry: as you can see... hardly any Senses. 😊)
[00:51:41] you're not getting it... 9 out of 10 of the French lexemes I quickly inspected (about 20) have zero Senses.
[00:52:29] right, and what is preventing you from adding senses to them (or from asking someone to help you with them)? (re @thadguidry: you're not getting it... 9 out of 10 of the French lexemes I quickly inspected (about 20) have zero Senses.)
[00:52:35] So I'm thinking of just using GF and the RGL to apply quick mappings of the pronouns, verbs, etc., using MOLTO, Eclipse, etc.
[00:53:56] Ah, I'm getting your subtle point...
just push my lexicons over to WDL, create new Lexemes for the Cajun language as needed (probably all needed)... and mapping to French Senses can come later?
[00:54:17] Yes! (re @thadguidry: Ah, I'm getting your subtle point... just push my lexicons over to WDL, create new Lexemes for the Cajun language as needed (probably all needed)... and mapping to French Senses can come later?)
[00:54:43] well, the reason I didn't want to do that... is because 3 months ago you basically said not to. 😊
[00:55:14] please pardon my newfound amnesia, but when did I say that? (re @thadguidry: well, the reason I didn't want to do that... is because 3 months ago you basically said not to. 😊)
[00:55:23] Make your Cajun lexemes with senses, and link them to Wikidata items via P5137 and to English senses via P5972!
[00:55:41] halfway joking, but you did basically say that it wasn't a good idea... and then I think it was Nikki or Jan who said not to do it yet.
[00:55:48] if you're referring to making lexemes without senses, yes, don't do that
[00:56:34] but since you make a claim to having some knowledge of this language, surely adding senses to the lexemes you create is possible?
[00:56:36] exactly... but this lexicon does have all the Senses... so I'd be good to go to load it into WDL?
[00:57:16] if the glosses can be added to Wikidata, sure go right ahead; if not, you will need to do some paraphrasing (re @thadguidry: exactly... but this lexicon does have all the Senses... so I'd be good to go to load it into WDL?)
[00:57:28] Unfortunately, it's a dying language... the whole reason for this effort is to help save a portion of its history.
[00:58:15] I'm sure linkages to the world of Wikidata lexicographical data will help them then (re @thadguidry: Unfortunately, it's a dying language... the whole reason for this effort is to help save a portion of its history.)
[01:02:55] Anyways, thanks for the heads up and the OK to push over as long as I have the Senses. I still think that dying languages (at least the ones that might have some written form) are not well supported enough. But there's nothing that Unicode can do to help them, so it takes folks like SIL to help at least a little. https://www.sil.org/about/endangered-languages
[01:04:41] Here's mine: https://www.ethnologue.com/language/frc
[01:06:22] My plan is to get most of the lexicon into WDL by Christmas. (a present to the world)
[01:08:34] And then with the availability of the Constructors and Renderers, well, frc might just have a few Wikipedia pages finally appear. 😊