[07:34:37] Solving the timeout issues instead of trying to create ugly hacks? (re @u99of9: How should we put the spaces in to separate sentences? I don't want to combine these into a paragraph call, because that will t...)
[07:37:04] I'm thinking the orchestrator needs to be improved so it can recognize that a sentence in a paragraph had already been rendered and cached so it doesn't have to render all sentences from scratch.
[07:37:05] I'm wildly guessing here, but that would be the first thing I would explore to improve the issues with paragraphs.
[07:40:04] Some of my sentences are always going to be costly. To me they are usually the logical unit to bundle. I'll never expect the system to render an entire article or even paragraph within the timeout. In fact I quite like the current gradual rendering of partial AW articles. (Although I wish most of them were pulled from cache hits.) (re @Npriskorn: I suggest solving the timeout issues instead of trying to create hacks that are hard to maintain. WDYT?)
[07:42:34] That would be nice (though I think it comes after cache hits from complete calls). But there would still be situations (e.g. after a label change) where the whole paragraph would need to be recalculated, and I don't want that to time out either. (re @Npriskorn: I'm thinking the orchestrator needs to be improved so it can recognize that a sentence in a paragraph has already been rendered ...)
[08:25:44] What I propose here is punctuation with spacing.
[08:25:46] For example,
[08:25:47] "full stop", English --> ". ".
[08:25:49] "colon", French --> " : "
[08:25:50] "colon", English --> ": ".
[08:25:52] You can see the difference here between French and English. (re @u99of9: I guess this is what you mean. Yes, language configuration of symbols may prove useful. But my main question is should an HTML f...)
[08:27:11] Of course. We need a community consensus on this.
I provided my references here so that NLG people can discuss it. I do not have any updates. Let us see what happens.
[08:27:40] Then I take it you favour including a trailing space in an English sentence construction? (re @Csisc1994: What I propose here is punctuation with spacing. For example, "full stop", English --> ". ". "colon", French --> " : " "colon", ...)
[08:28:57] Yes, the advantage of my proposal is that this can cover languages where a full stop is not followed by a space. It is all resolved in the function. (re @u99of9: Then I take it you favour including a trailing space in an English sentence construction?)
[08:31:03] Punctuation in most production LLMs is always transcribed according to English conventions. We can customize it by language.
[08:31:50] It's not too hard to write a sentence constructor for each set of languages. It will be faster than going by QID, but sure, your method can be used in a general composition to compete with a language configuration. (re @Csisc1994: Yes, the advantage of my proposal is that this can cover languages where full stop is not followed by spaces. It is all resolved...)
[08:33:48] The technical error in English is that the last sentence of a paragraph should not have a trailing space. (re @Csisc1994: Yes, the advantage of my proposal is that this can cover languages where full stop is not followed by spaces. It is all resolved...)
[08:34:06] That is accurate. (re @u99of9: The technical error in English is that the last sentence of a paragraph should not have a trailing space.)
[08:34:37] That is why I did not implement it yet. I have to figure out how to solve this problem.
[08:37:02] One option would be to wrap the final sentence in a paragraph-ending trimmer. (re @Csisc1994: That is why I did not implement it yet. I have to figure out how to solve this problem.)
[08:40:41] That would be a lot fewer function calls than including spacers in between all sentences.
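To make the proposal above concrete, here is a minimal Python sketch of punctuation-with-spacing resolved per language, plus a paragraph-ending trimmer. Everything named here (the `PUNCTUATION` table, `punctuate`, `trim_paragraph_end`, `paragraph`, the language codes) is hypothetical and only for illustration; none of it is an existing Wikifunctions function. The Japanese entry is an assumed example of a language where the full stop is not followed by a space.

```python
# Hypothetical per-language punctuation table (illustrative names only).
# The English and French entries follow the examples given in the chat;
# the Japanese entry is an assumption added for illustration.
PUNCTUATION = {
    ("full stop", "en"): ". ",
    ("colon", "en"): ": ",
    ("colon", "fr"): " : ",
    ("full stop", "ja"): "\u3002",  # ideographic full stop, no trailing space
}


def punctuate(text: str, mark: str, lang: str) -> str:
    """Append the language-specific punctuation (with its spacing) to text."""
    return text + PUNCTUATION[(mark, lang)]


def trim_paragraph_end(text: str) -> str:
    """Paragraph-ending trimmer: drop trailing whitespace from the last sentence."""
    return text.rstrip()


def paragraph(sentences: list[str], lang: str) -> str:
    """Join sentences using only the spacing carried by their punctuation."""
    joined = "".join(punctuate(s, "full stop", lang) for s in sentences)
    return trim_paragraph_end(joined)


if __name__ == "__main__":
    print(repr(paragraph(["A cat sat", "It slept"], "en")))
    print(repr(punctuate("Note", "colon", "fr")))
```

In this sketch the trimmer is applied once per paragraph rather than placing a spacer between every pair of sentences, which is the trade-off in function calls mentioned above.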
[09:15:10] Another problem is that references go immediately after the full stop. Since they are generated by different functions, a sentence-reference pairing function would also have to move the trailing space. To me this is too problematic. So now I'm back to favouring sentence spacers everywhere in AW :-( (re @u99of9: The technical error in English is that the last sentence of a paragraph should not have a trailing space.)
[12:21:16] I’m reluctant to see sentence structure made explicit in abstract content. But once we fragment content units, for whatever reason, we can’t really avoid the need for a connector function between successive fragments. That function needs to know what sort of things it’s joining, and we shouldn’t assume that that is straightforward, but, yes, the English sentence-to-sentence connector should normally be realised as a space.
[12:21:19] It may similarly be that the sentence-to-reference connector should be realised as a non-breaking space before the note-marker, and that the resulting unit is treated, where relevant, either as a sentence or as a sentence-with-note-marker. (Note, for example, that the note-markers go before the 。 on jawiki, so English and Japanese “agree” that sentence-with-note-marker can be connected in the same way as a sentence.)
[12:21:22] The crucial point is that fragmentation implies a shared framework constraint, rather than an explicit realization or even (necessarily) an explicit declaration. Then each content unit can be realized as a sentence, a list item or part of a sentence, according to the common evaluation (repeated re-evaluation, that is) of the framework constraint.
[12:21:23] Just to be clear, I’m not sure we want to go down this path, but I think it is an unavoidable consequence of the current architecture. (re @u99of9: Another problem is that references go immediately after the full stop.
Since they are generated by different functions, a senten...)
[12:24:21] I think that this is a huge thing to be discussed in online meetings or here. I think that an in-person meeting is needed in Paris during the summer. (re @Al: I’m reluctant to see sentence structure made explicit in abstract content. But once we fragment content units, for whatever reas...)
[12:27:17] The question is how far we want to go. For example, we can code a sentence for every Wikidata property. This is simple to do, but it will generate an overwhelming number of functions. We can create a universal language that can be converted to any natural language, but this will require a lot of commitment. It is up to the community to decide.
[12:27:56] I don’t plan to be there, but I agree that plenty of discussion is required. (re @Csisc1994: I think that this is a huge thing to be discussed in online meetings or here. I think that an in-person meeting is needed in Par...)
[13:00:07] I doubt it is as simple as it seems, but some sort of mapping from Wikidata property types does seem necessary. It is the assumption that the result is a ‘sentence’ or even a ‘sentence pattern’ that I find problematic. A property type characterizes a relation, and natural languages typically offer multiple ways of expressing any given relation. The chosen form depends less on the individual relation itself than on the constellation of relations in which it appears, if I may generalize perhaps too much. (re @Csisc1994: The question is how far we want to go. For example, we can code a sentence for every Wikidata property. This is simple to do. Bu...)
[13:02:45] I understood this apart from "The crucial point..." !! (re @Al: I’m reluctant to see sentence structure made explicit in abstract content. But once we fragment content units, for whatever reas...)
[13:13:43] Ah, sorry about that.
What I’m driving at is that [paragraph from sentences] (for example) is not a good abstraction in the first place. But it is coherent, if you go on to fragment the paragraph, that the paragraph fragments share the sentence-within-paragraph framework constraint. Then [sub-paragraph from sentences] becomes a coherent propagation of a sub-optimal abstraction when applied to the constituent fragments. (re @u99of9: I understood this apart from "The crucial point..." !!)
[13:15:30] I agree with this in general, but rather than compute some context from the constellation surrounding the 'sentence' (which sounds too hard), I still prefer that the AW writer is provided an optional way of supplying the context to influence the form they would like to choose. Even if some of those options only meaningfully impact one language at a time, they can be added to by other editors who can shape how the abstract takes form in their language. The default may be a full-form restatement of triples, but it would be tuneable and connectable via context flags. Hopefully some of those context flags would be useful in multiple languages, so that not every sentence needs to be annotated in every language to make it sound better (which would essentially return us to writing language wikipedias). (re @Al: I doubt it is as simple as it seems, but some sort of mapping from Wikidata property types does seem necessary. It is the assump...)
[13:46:51] Yes, it depends what you mean by “compute”. If, for example, some function outputs an inline or bulleted list according to the length of the list, the result depends on the length. But that length can be provided directly by the contributor or computed from the number of relevant statements. If it turns out that the list is too long to evaluate, the framework constraint is still provided from the overall length argument, not the length of each sublist (fragment).
That is why the “common evaluation” is in practice a “repeated re-evaluation” (with the same result for each fragment). (re @u99of9: I agree with this in general, but rather than compute some context from the constellation surrounding the 'sentence' (which soun...)
[18:51:55] I've been wondering. Can I use Wikifunctions to generate localisation files for my browser extension?
[18:51:56] https://github.com/fuddl/wikibase-for-web/blob/main/_locales/en/messages.json
[19:06:25] Interesting idea! And kind of creates another notability question. Do we want people to do this for any arbitrary app or website? (re @Shi: I've been wondering. Can I use Wikifunctions to generate localisation files for my browser extension? https://github.com/fuddl/w...)
[19:42:33] I wrote a Python package for this purpose, https://pypi.org/project/wikifunctions/ (re @Shi: I've been wondering. Can I use Wikifunctions to generate localisation files for my browser extension? https://github.com/fuddl/w...)
[19:42:58] It allows you to make calls to WF pretty easily.
[19:57:31] I guess it’s a bit of a grey area, in the abstract. Notability isn’t really an applicable criterion for functions. Any function that seeks to support the projects somehow would be fine in principle. Conversely, any function that seems to exist for purely private purposes is doubtful, although we have no explicit policy that applies here. But the scope of Wikifunctions is extended to “…functions to support the Wikimedia projects and beyond…”, and “beyond” is nowhere qualified (yet). (re @Jan_ainali: Interesting idea! And kind of creates another notability question. Do we want people to do this for any arbitrary app or website...)
[21:24:04] This draft is relevant: https://www.wikifunctions.org/wiki/Wikifunctions:Valuable (re @Al: I guess it’s a bit of a grey area, in the abstract. Notability isn’t really an applicable criterion for functions. Any function ...)
[21:42:05] Indeed.
I’m not sure it should become policy, but “valuable” makes more sense than “notable”, however we define it. I’d probably prefer “useful”, still distinguishing between “useful for the projects” and “generally useful”, with a lower utility threshold for functions that directly support the projects (always passing while the function is in fact in use). (re @u99of9: This draft is relevant: https://www.wikifunctions.org/wiki/Wikifunctions:Valuable)