[03:31:48] However indeed I like instant messaging channels much much better
[03:32:07] Project chat isn’t an efficient place
[04:41:27] I've created [[Template:Function review candidate]] for this. It likely needs improvements. For the nominator it looks like {{Function review candidate|Zxxxxx|message ~~}} for the first nom and {{Function review candidate|Zxxxxx|message |n}} for following noms. For the reviewer it looks like {{Function review candidate|Zxxxxx|message |n|Doing|name}} when starting and {{Function review candidate|Zxxxxx|message ~~|n|Done|name|result}} with a full message in a reply. Hopefully this should streamline the process and make it more consistent if we do decide to go with this system (re @MolecularPilot: I think a 2-tier system like they have on commons and you proposed would be perfect, I just put down some thoughts about the pro...)
[05:01:58] Example on talk page (re @Feeglgeef: I've created [[Template:Function review candidate]] for this. It likely needs improvements. For the nominator it looks like {{Fu...)
[06:47:23] Similarly I've created [[Wikifunctions:Function_of_the_Week/submissions]] so that we can draft suggestions for FOTW. No guarantees that they will be picked by editors!
[07:43:05] @vrandecic Concerning your question about a or an, there is a rule.
[07:43:52] Sure, but the rule is based on pronunciation, which isn't derivable from the written word, and we usually only have the written word available
[07:43:54] In fact, choosing between a and an depends on how the noun after it is pronounced and not how the noun is written.
[07:44:28] Yes, there are several rule-based English to IPA Script Converters. (re @vrandecic: Sure, but the rule is based on pronunciation, which isn't derivable from the written word, and we usually only have the written ...)
[07:46:14] Which was partly due to me not looking enough at the Project Chat. That's why I am saying that my goal is to have that higher on my radar this year. 
(re @hogue_456: I wanted to discuss how to proceed with Abstract Wikipedia after reading a newsletter. This is the section where I asked. https:...)
[07:47:10] https://github.com/rhasspy/gruut-ipa/tree/master/gruut_ipa is the code for English to IPA.
[07:47:11] Oh, we should have that as a function! (re @Csisc1994: Yes, there are several rule-based English to IPA Script Converters.)
[07:47:41] We have that function for Tunisian in Wikifunctions. It can be implemented for English. (re @vrandecic: Oh, we should have that as a function!)
[07:51:22] https://github.com/vpnry/english-to-ipa/blob/main/to-ipa.py is better code that can be easily implemented in Wikifunctions.
[07:54:50] That... is just calling another library...
[08:04:15] Sorry. The right code. (re @vrandecic: That... is just calling another library...)
[08:04:52] https://tools-static.wmflabs.org/bridgebot/90608e62/rhymes.py
[08:04:52] https://tools-static.wmflabs.org/bridgebot/bd8500f4/stress.py
[08:04:54] https://tools-static.wmflabs.org/bridgebot/da07f3a4/syllables.py
[08:05:03] https://tools-static.wmflabs.org/bridgebot/77f6c442/transcribe.py
[08:05:11] https://tools-static.wmflabs.org/bridgebot/f4ef764e/transcriber.py
[08:05:17] stop, stop, we don't need to send
[08:05:25] a whole python library through the telegram chat
[08:05:54] This is not the whole library. This is only the five files including the transcription. (re @vrandecic: a whole python library through the telegram chat)
[08:06:08] There are other files.
[08:06:32] We can summarize all of it in one function.
[08:06:44] that would be great
[08:09:43] For a vs an, according to the code, a beginning vowel sound always occurs with [a, e, o, i]. (re @vrandecic: that would be great)
[08:10:12] The problem is with abbreviations and [u, w, y].
[08:11:11] Abbreviations beginning with [A, E, I, L, M, N, O, R, S, X] have an.
[08:12:30] We can also directly implement that in an a or an function. 
[08:13:47] Here, all uppercase words are considered abbreviations. This is not always true. (re @Csisc1994: Abbreviations beginning with [A, E, I, L, M, N, O, R, S, X] have an.)
[08:14:08] We can do better.
[08:18:20] I processed the code using Copilot and this is what it generated.
[08:18:24] def choose_article(word):
[08:18:25]     vowels = ['a', 'e', 'i', 'o']
[08:18:27]     letters_for_an = ['A', 'E', 'I', 'L', 'M', 'N', 'O', 'R', 'S', 'X']
[08:18:28]     # Check the first letter of the word
[08:18:30]     first_letter = word[0]
[08:18:31]     # Check if the word is an abbreviation
[08:18:33]     if len(word) == 1 or (word[0].isupper() and word[1].isupper()):
[08:18:34]         if first_letter in letters_for_an:
[08:18:36]             return "an"
[08:18:37]         else:
[08:18:39]             return "a"
[08:18:40]     # Check for regular words
[08:18:42]     if first_letter in vowels:
[08:18:43]         return "an"
[08:18:45]     else:
[08:18:46]         return "a"
[08:18:48] # Examples
[08:18:49] print(choose_article("apple")) # Output: an
[08:18:51] print(choose_article("banana")) # Output: a
[08:18:53] print(choose_article("MRI")) # Output: an
[08:18:54] print(choose_article("USB")) # Output: a
[08:18:55] What do you think, @vrandecic?
[08:18:57] let's keep that off the chat, this is better suited to do on-wiki
[08:19:41] I will implement the function after solving the U problem. Then, we will go from there.
[09:10:34] Just a brief question.
[09:10:57] How to state that I adapted the code from a Python Package.
[09:19:10] eu– and ew– imply /j/ implies “a” (re @Csisc1994: For a vs an, according to the code, a beginning vowel sound always occurs with [a, e, o, i].)
[09:21:20] I see. I will be implementing this. (re @Al: eu– and ew– imply /j/ implies “a”)
[09:21:23] Thank you. 
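The rules mentioned so far (a, e, i, o take "an"; abbreviations go by the name of their first letter; eu- and ew- begin with /j/ and take "a") can be combined in a minimal sketch. The name `choose_article_sketch` is hypothetical, treating every all-uppercase word as an abbreviation is an assumption, and F and H are added to the letter list quoted above because their names ("eff", "aitch") also begin with a vowel sound. This is not the implementation on Wikifunctions.

```python
# Letters whose spoken names begin with a vowel sound ("eff", "em", "ar", ...).
# Assumption: F and H are added to the list quoted in the chat.
LETTERS_FOR_AN = set("AEFHILMNORSX")
VOWELS_FOR_AN = set("aeio")


def choose_article_sketch(word: str) -> str:
    """Guess "a" or "an" from spelling alone (hypothetical helper)."""
    if not word:
        raise ValueError("word must not be empty")
    # Assumption: an all-uppercase word is an abbreviation read letter by letter.
    if word.isupper():
        return "an" if word[0] in LETTERS_FOR_AN else "a"
    lower = word.lower()
    # eu- and ew- start with the consonant sound /j/ ("a European", "a ewe").
    if lower.startswith(("eu", "ew")):
        return "a"
    return "an" if lower[0] in VOWELS_FOR_AN else "a"


print(choose_article_sketch("apple"))
print(choose_article_sketch("MRI"))
print(choose_article_sketch("European"))
```

Words beginning with u or h fall through to the default "a", so a spelling-only rule like this stays an approximation.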
[09:23:30] Ewok will fail :) (re @Al: eu– and ew– imply /j/ implies “a”)
[09:25:10] I can see many words where this would fail (including - but not limited to - homographs that are not homonyms) (re @Toby: Ewok will fail :))
[09:29:22] I can’t think of any, but they would be exceptions to the rule 🤷‍♂️ (re @Nicolas: I can see many words where this would fail (including - but not limited to - homographs that are not homonyms))
[09:31:46] not relevant for the article but very common, lead can be 'liːd' or 'lɛd'
[09:31:46] very rare but relevant here : unionized (union+ized or un+ionized), an unionized worker but a unionized particle (re @Al: I can’t think of any, but they would be exceptions to the rule 🤷‍♂️)
[09:32:45] Ah, yes… absolutely! (re @Nicolas: not relevant for the article but very common, lead can be 'liːd' or 'lɛd'
[09:32:46] very rare but relevant here : unionized (union+ized or...)
[09:34:52] (it would be “a unionized worker” but “an unionized particle”… but u– is just a hard case.) (re @Nicolas: not relevant for the article but very common, lead can be 'liːd' or 'lɛd'
[09:34:52] very rare but relevant here : unionized (union+ized or...)
[09:36:09] oops, corrected, thanks (re @Al: (it would be “a unionized worker” but “an unionized particle”… but u– is just a hard case.))
[09:40:04] anyway, pronunciation is complicated and not always logical
[09:40:04] that said it shouldn't stop us from building a function (with some caveats), in most cases it's still a good approximation
[09:41:17] I'm thinking about doing the same for French "elision" : la + école = l'école
[09:41:18] (but again there are traps, like aspirated 'h', le héros but l'héroïne...)
[10:18:26] I would say that is definitely an on-wiki question. Basically, it depends… and it is the contributor’s responsibility to ensure that all requirements are met for release under the Apache 2.0 License. (re @Csisc1994: How to state that I adapted the code from a Python Package.)
[10:31:52] I did a first try. 
Check Z21874.
[10:42:01] @vrandecic What do you think?
[10:51:18] about the code, yes, you have to make sure the licenses are compatible
[10:53:58] Not the same code. Just inspired by the rules there. (re @vrandecic: about the code, yes, you have to make sure the licenses are compatible)
[10:56:12] I developed the code from scratch.
[10:57:54] code conversion
[11:08:19] why is it not part of Z21739 but its own function?
[11:27:23] Missed from my side. I am very sorry. (re @vrandecic: why is it not part of Z21739 but its own function?)
[11:27:40] However, the other code is quite obsolete.
[11:28:10] There is only that "hour" exception that I will add to my code.
[11:28:49] But, I will have to figure out the rule for the silent h before I proceed.
[11:30:10] It’s not the only h– exception and the rule depends partly on stress: “a history” but “an historian”. (re @Csisc1994: There is only that "hour" exception that I will add to my code.)
[11:44:42] But but but why is it all needed at all? Can't it be based on lexemes, which have IPA pronunciation properties, at least in theory? "unionized" would be two lexemes with different pronunciations.
[11:45:11] Thanks. I still think that was a correct and important fix, however it hasn't solved the problem of the Python failure at Z21413. Because of the debug statement, it is clear that the python implementation returns "return -0.0" (which is the line after the debug). But if it actually gets into code conversion as -0.0, all my tests in demos say that it should get the right answer. I've documented this at [[Talk:Z21413]]. It would help to have someone else check my analysis before I submit this as a phab bug. (re @vrandecic: I did the suggested change: https://www.wikifunctions.org/wiki/Z20885?uselang=en&diff=prev&oldid=159275)
[11:48:19] true but
[11:48:19] not everything is in the lexeme (especially pronunciations) (re @amire80: But but but why is it all needed at all? 
Can't it be based on lexemes, which have IPA pronunciation properties, at least in theo...)
[11:48:48] we are also in the middle of the river, relying on Lexemes but needing to plan if there are no Lexemes :/
[11:51:11] “in theory”, as you say… and even if the pronunciation is available, it is useful to see whether it is regular (according to the rules specified for the function). (re @amire80: But but but why is it all needed at all? Can't it be based on lexemes, which have IPA pronunciation properties, at least in theo...)
[11:54:59] Yes… and perhaps the collective focus for lexemes should be documenting exceptions? (re @Nicolas: we are also in the middle of the river, relying on Lexemes but needing to plan if there are no Lexemes :/)
[11:57:59] not only but yes, exceptions are a priority (re @Al: Yes… and perhaps the collective focus for lexemes should be documenting exceptions?)
[11:58:28] (and "exception" is fun for me speaking French, English and Breton, 3 highly irregular languages :P )
[12:02:08] which reminds me, I still need to finish the functions for Breton conjugation...
[12:25:59] agreed. The function should basically check lexemes for exceptions, and only if there are none, use the regular function as a fallback and hope it works. And if it gets something wrong, people can go to the lexemes and add the exception.
[12:33:26] For the time being, we still have the string-to-lexeme and homograph challenges, however. (re @vrandecic: agreed. The function should basically check lexemes for exceptions, and only if there are none, use the regular function as a fal...)
[12:33:54] absolutely
[12:42:20] yeah, string-to-lexeme seems almost impossible...
[12:42:21] you have to look at the statements and the senses and even then, there could be pitfalls (re @Al: For the time being, we still have the string-to-lexeme and homograph challenges, however.)
[12:43:10] at least for generating sentences, it would make more sense to find the lexeme from the item, no? 
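The lookup order agreed on here (check recorded exceptions first, and only fall back to the regular rule if none is found) might look like the sketch below. The `EXCEPTIONS` table merely stands in for lexeme data, and the names `regular_article` and `article_for` are hypothetical; resolving a string to the right lexeme, the hard part discussed above, is not attempted.

```python
# Stand-in for pronunciation exceptions recorded on lexemes (hypothetical data).
EXCEPTIONS = {
    "hour": "an",  # silent h
    "one": "a",    # begins with the consonant sound /w/
}


def regular_article(word: str) -> str:
    """Spelling-only rule: "an" before a, e, i, o; otherwise "a"."""
    return "an" if word[:1].lower() in ("a", "e", "i", "o") else "a"


def article_for(word: str) -> str:
    """Prefer a recorded exception; otherwise fall back to the regular rule."""
    return EXCEPTIONS.get(word.lower(), regular_article(word))


print(article_for("hour"))
print(article_for("banana"))
```

With this split, a wrong result is corrected by adding an entry to the exception data rather than by changing the rule.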
[12:48:23] As a very general rule, yes… (re @Nicolas: at least for generating sentences, it would make more sense to find the lexeme from the item, no?)
[13:05:45] It's a historian (at least in 'murica) (re @Al: It’s not the only h– exception and the rule depends partly on stress: “a history” but “an historian”.)
[13:07:05] Do you drop the h?
[13:09:05] No. The traditional rule is that “an” is used before aspirated h when the syllable is unstressed. (re @Feeglgeef: Do you drop the h?)
[13:11:05] I've always heard "a historian." I don't see a good reason to code an exception if it's the reasonable answer somewhere (re @Al: No. The traditional rule is that “an” is used before aspirated h when the syllable is unstressed.)
[13:12:19] Of course, no objection if you do if it's just my area thing
[13:16:45] 😏 Maybe I’m just showing my age and Britishness… What about “an horrendous…”? (re @Feeglgeef: I've always heard "a historian." I don't see a good reason to code an exception if it's the reasonable answer somewhere)
[13:18:56] No, I'd always say it "a horrendous..." Perhaps we need to split varieties of English? One article using multiple varieties would really be annoying (re @Al: 😏 Maybe I’m just showing my age and Britishness… What about “an horrendous…”?)
[13:19:57] Ah, yes… that would be the en.Wikipedia way… (re @Feeglgeef: No, I'd always say it "a horrendous..." Perhaps we need to split varieties of English? One article using multiple varieties woul...)
[13:20:49] Btw, could we have a tool to merge functions?
[13:20:50] The more time passes, the more it will be needed I think (re @vrandecic: why is it not part of Z21739 but its own function?)
[13:21:38] A bot with a simple interface and admin could do that (re @Nicolas: Btw, could we have a tool to merge functions? 
[13:21:39] The more time passes, the more it will be needed I think)
[13:22:13] Question is who to authorize
[13:24:04] Could be yes
[13:24:05] But I was more thinking about a simpler tool like the one to merge entities on Wikidata (no bot, no admin, just two parameters and a "merge" button) (re @Feeglgeef: A bot with a simple interface and admin could do that)
[13:24:27] How would that work exactly (re @Nicolas: Could be yes
[13:24:28] But I was more thinking about a simpler tool like the one to merge entities on Wikidata (no bot, no admin, just two...)
[13:24:38] Would any functioneer be able to do it?
[13:25:28] On Wikidata, everyone can do it, it works very well (re @Feeglgeef: Would any functioneer be able to do it?)
[13:26:06] Uh, doesn't that mean anyone can delete anything they want? (re @Nicolas: On Wikidata, everyone can do it, it works very well)
[13:27:43] Technically yes, but anyone can already empty any function
[13:27:43] And it's merge, not delete (re @Feeglgeef: Uh, doesn't that mean anyone can delete anything they want?)
[13:28:42] "anyone can already empty any function," no, they can delete at most about 10% of it (re @Nicolas: Technically yes, but anyone can already empty any function
[13:28:42] And it's merge, not delete)
[13:29:04] You can remove labels but nothing else
[13:29:14] Unless you have more rights
[13:29:47] Anyone with an account* (re @Feeglgeef: "anyone can already empty any function," no, they can delete at most about 10% of it)
[13:29:59] You can remove the description, the code, etc. everything except the structure (re @Feeglgeef: You can remove labels but nothing else)
[13:30:44] the code?? That requires functioneer when connected, so almost always (re @Nicolas: You can remove the description, the code, etc. 
everything except the structure)
[13:30:56] The description is a label in the technical sense
[13:39:36] Still, a merge tool will be needed soon enough
[13:40:17] It can be done manually, I don't know why we'd need that (re @Nicolas: Still, a merge tool will be needed soon enough)
[13:41:59] How do you do it manually?
[13:42:00] And is it in more than one click? if so, a tool would be an improvement (re @Feeglgeef: It can be done manually, I don't know why we'd need that)
[13:44:17] That's a curious one. I am going to file a bug, the solution to this isn't immediately obvious to me. (re @u99of9: Thanks. I still think that was a correct and important fix, however it hasn't solved the problem of the Python failure at Z21413...)
[13:45:08] I created a parallel test that passes. It is still curious, however. (re @vrandecic: That's a curious one. I am going to file a bug, the solution to this isn't immediately obvious to me.)
[13:45:26] By moving the labels and implementations (re @Nicolas: How do you do it manually?
[13:45:27] And is it in more than one click? if so, a tool would be an improvement)
[13:48:39] Oh, maybe the converter is fixed but has somehow used a cached result from before it was fixed? (re @Al: I created a parallel test that passes. It is still curious, however.)
[13:49:46] That’s the curious bit 😏 (re @u99of9: Oh, maybe the converter is fixed but has somehow used a cached result from before it was fixed?)
[13:51:35] That's not a merge at all, it breaks both the link (no redirect) and the license (re @u99of9: Oh, maybe the converter is fixed but has somehow used a cached result from before it was fixed?)
[13:52:01] Where's the parallel test? One can usually poke the cache by making a change to the test (e.g. changing the label) (re @Al: I created a parallel test that passes. It is still curious, however.)
[13:52:55] Z21883 (re @vrandecic: Where's the parallel test? One can usually poke the cache by making a change to the test (e.g. 
changing the label))
[13:55:42] T344973
[13:55:51] yep. I added a German label, and now Z21413 is passing
[22:17:00] should we delete this one now that they are both working identically? (re @Al: Z21883)
[22:18:18] Not opposed (re @u99of9: should we delete this one now that they are both working identically?)
[22:20:05] Could do. I was going to change it to the smallest successful value, but I didn’t get it to pass 🤷‍♂️ (re @u99of9: should we delete this one now that they are both working identically?)
[22:21:17] like this one? Z21408 (re @Al: Could do. I was going to change it to the smallest successful value, but I didn’t get it to pass 🤷‍♂️)
[22:26:13] 😱 I thought 1022 was the maximum exponent… Yeah, go ahead and delete, then 👍 (re @u99of9: like this one? Z21408)
[22:27:31] Well yes, if you only consider "normal" floats, then this is the minimum: Z21410 (re @Al: 😱 I thought 1022 was the maximum exponent… Yeah, go ahead and delete, then 👍)
[22:32:15] Now you see why I'm so invested in getting every test working, even if the code converter was causing it. so... many... edge... cases! (re @u99of9: Well yes, if you only consider "normal" floats, then this is the minimum: Z21410)
[22:40:08] Speaking of which. Z21527 fails in javascript because the converter to code Z19702 can't deal with (invalid) 0/0, because one of the first things it does is "const gcd = (a, b) => (b ? gcd(b, a % b) : a);". We could consider letting it pass a/0 through unsimplified which would give the implementations the opportunity to do the error handling (which Z20855 is already ready to do).
[22:40:09] Or we could put in a validator. Or both? (re @u99of9: Now you see why I'm so invested in getting every test working, even if the code converter was causing it. so... many... edge... ...)
[22:43:18] Maybe this is a general issue we should document on wiki? I've copied here: [[Talk:Z21527]] (re @u99of9: Speaking of which. 
Z21527 fails in javascript because the converter to code Z19702 can't deal with (invalid) 0/0, because one of...)
[22:45:25] Yeah, good plan! (re @u99of9: Maybe this is a general issue we should document on wiki? I've copied here: [[Talk:Z21527]])
[22:54:02] Optimal behavior (re @u99of9: Speaking of which. Z21527 fails in javascript because the converter to code Z19702 can't deal with (invalid) 0/0, because one of...)
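One reading of the first option raised above, letting a/0 pass through unsimplified, is sketched here in Python rather than in the JavaScript of the actual converter. The function `simplify_fraction` is hypothetical and does not reproduce Z19702; it only shows a zero-denominator guard placed before the gcd step, so that the implementation receives the invalid value and can do its own error handling.

```python
from math import gcd


def simplify_fraction(numerator: int, denominator: int) -> tuple[int, int]:
    """Reduce a fraction, but return a/0 untouched (hypothetical helper)."""
    if denominator == 0:
        # Invalid input: leave it for the implementation's own error handling.
        return numerator, denominator
    divisor = gcd(numerator, denominator)
    return numerator // divisor, denominator // divisor


print(simplify_fraction(6, 4))
print(simplify_fraction(0, 0))
```

A validator, the second option mentioned, would instead reject the zero denominator before any conversion takes place; the two approaches can coexist.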