Module talk:la-noun/data
2nd declension, Greek type, masculine
The module generates accusatives in both -on and -um for a nominative in -os. Properly, only an accusative in -on belongs to a nominative in -os, while the accusative in -um belongs to a noun in -us. There might be cases where only an accusative in -um and no nominative in -us is attested, but that's a matter of attestation and not of inflection.
Allen & Greenough (p. 24) have mythos only with mython, while Delos has Delon (-um). Judging by L&S, the accusative Delum should be rarer; that is, it could properly belong to a classically unattested *Delus. Furthermore, for mythos neither A&G nor L&S mentions a mythum. That is, the template would generate too many forms. This means the accusative is -on, and in the case of Delos one could use |acc_sg=Dēlon/Dēlum to add further forms. But having an entry Delus (as already mentioned in Delos) makes more sense, and de.pons.com/%C3%BCbersetzung?q=Delos&l=dela&in=&lf= for example mentions Delus too.
As an addition: in dialectos, "|acc_sg=dialecton" was used to remove a dialectum which the module would have created. But that's not a good approach: one shouldn't have to remove forms (apart from some rare exceptions); rather, one should add additional forms, like a second genitive or here a second accusative, if attested and belonging to the word. -84.161.56.213 02:25+02:37, 1 June 2017 (UTC)
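The "add, don't remove" behaviour requested above could be sketched roughly as follows. This is only an illustration in Python, not the module's actual code (the module itself is not quoted in this discussion); the function names acc_sg_forms and parse_override are hypothetical, and the only detail taken from the comment is the slash-separated |acc_sg= value.

```python
def parse_override(value):
    """Split a slash-separated parameter value such as 'Dēlon/Dēlum'.

    The slash syntax follows the |acc_sg= example in the comment above;
    everything else in this sketch is a hypothetical illustration.
    """
    return [part for part in value.split("/") if part]


def acc_sg_forms(stem, extra=()):
    """Accusative singular forms for a Greek-type nominative in -os.

    By default only the form in -on is generated. Further attested forms
    are appended on request; nothing is generated that would later have
    to be removed again.
    """
    forms = [stem + "on"]
    for form in extra:
        if form not in forms:  # avoid listing the default form twice
            forms.append(form)
    return forms
```

With this arrangement acc_sg_forms("myth") yields only mython, while acc_sg_forms("Dēl", parse_override("Dēlon/Dēlum")) yields Dēlon and Dēlum, so an entry has to opt in to the -um form rather than suppress it.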
4th and 5th declension fixes
So I've been meaning to point out several things that need fixing in the Latin declension/conjugation; why not start with the 4th declension. Weiss 2009, Outline of the Historical and Comparative Grammar of Latin, around page 252, is the general reference; for quotations of Latin grammarians see the intro to TLL genū, pages 1874-5.
- the neuter nom./acc. currently don't accept the macron-breve -ū̆ despite this ending being short either eventually or throughout;
- the currently optional and exclusive (sole form) -ubus should be made automatic and inclusive (alternative) with a note "Probably analogical, rare except in New Latin";
- there existed a number of different strategies for declining these in the genitive (the original -ūs, the analogical -uis advocated for by analogists like Caesar, as well as -ū) and the dative (-ū, -uī). That is, the -ū that is currently automatic only for the neuters was also an option for other genders, and when combined with the genitive in -ū the neuters ended in a graphic V in all cases, but as explained in {{P:la:4decl-neut}} the vowel length probably strictly differed in this case;
- additionally, only the second-declension-borrowed genitive in -ī seems to be found in Plautus (with 1 exception) and is frequent in inscriptions.
All of these forms ought to be given for all nouns in my opinion, and not just the single forms stipulated by modern, sociolinguistically-blind school-grammars.
- the 5th decl. genitive had 4 forms. The diēī (after vowel) vs reī (after consonant) rule applies only Classically; in Plautus and Terence monosyllabic -ei are everywhere; there also existed a probably archaising genitive in -ēs, and Gellius finds forms even in -ī (diī, famī) - perhaps analogical to the use in tribūnus plēbī. The dative in Plautus has diē, rē, fidē alongside monosyllabic -ei.
Do y'all think it's more appropriate to add these as well, with footnotes, than to create a usage template explaining that these forms were found and/or possible? The number of forms in the 5th declension genitive in particular does seem like it'll hurt readability (especially since two or more short forms per line aren't currently possible) and confuse the reader. A possible solution could be giving the rarer forms with footnotes in a separate table, ideally with cells given only for these number-form combinations, as opposed to duplicating everything else. On adopting the use of separate tables I'll write below. Mentioning @Benwing2 as being the template overlord - I can add the forms and notes myself, but a macron-breve solution is beyond my abilities. Brutal Russian (talk) 02:31, 7 June 2021 (UTC)
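To make the "alternative forms with footnotes" option concrete, here is a rough Python sketch of how the 4th-declension endings listed above might be stored and rendered. It is an assumption-laden illustration, not the module's data format: the names Form, FOURTH_DECL_ALTERNATIVES and render are hypothetical, the note wording is paraphrased from the points above, and the regular dative/ablative plural -ibus is filled in as the default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Form:
    ending: str
    note: str = ""  # an empty note means no footnote is attached


# Hypothetical slot names; endings and notes follow the list in the comment above.
FOURTH_DECL_ALTERNATIVES = {
    "gen_sg": [
        Form("ūs"),
        Form("uis", "Analogical"),
        Form("ū"),
        Form("ī", "Second-declension ending; Plautus and inscriptions"),
    ],
    "dat_sg": [Form("uī"), Form("ū")],
    "dat_pl": [
        Form("ibus"),
        Form("ubus", "Probably analogical, rare except in New Latin"),
    ],
}


def render(stem, slot):
    """Return (cell text, footnotes) for one slot, numbering notes in order."""
    cells = []
    footnotes = []
    for form in FOURTH_DECL_ALTERNATIVES[slot]:
        text = stem + form.ending
        if form.note:
            footnotes.append(form.note)
            text += f"[{len(footnotes)}]"  # footnote marker after the form
        cells.append(text)
    return ", ".join(cells), footnotes
```

For example, render("man", "dat_pl") gives the cell text "manibus, manubus[1]" together with the single -ubus note. Keeping the notes next to the endings would let the same data feed either one crowded table or a separate table of rarer forms.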
Alternative declension tables instead of alternative forms
As I mention in this reply to -sche, there's currently an issue of mixing alternative declension classes with alternative forms, which in the case of Greek borrowings turns into an issue of code-mixing on the part of the dictionary. Basically my proposal is to create separate, strictly Greek declension tables instead of the confusion seen at synthesis. Accordingly, we need to explain to the reader that using a Greek declension type goes hand in hand with using a Greek pronunciation, and is basically an instance of code-switching. The nom. plural -eis vs. -īs is also purely graphical in both Greek and Latin (after ~150 BCE), spelling the same vowel /ī/, and it's not clear why this is currently given for the nominative only.
This directly relates to what an IP writes at the top of this discussion page, as -os/-us and -on/-um are directly equivalent in Greek and Latin; the first one was probably even pronounced nigh-identically. This creates a bit of a problem with those words, like dialectus, because in truth it's the same word adapted and unadapted. By the way, the whole apparently established declension of dialectos is peculiar and looks dubious. It might be a graphic device to distinguish 2nd declension Greek-derived feminines, since the spelling dialectus would be likely to represent a nativised masculine variant (cf. the masculine in all of Romance); but most of its attestations in TLL are in the nominative, and not a single genitive dialectī is quoted there. I would not be surprised if the genitive corresponding to the -os variant was really the unadapted dialectov.
Apart from Greek, words like vultus and vīrus could use separate tables. I've opted for this when editing the former entry; meanwhile VABritto created a whole declension class just for that one word in the module, which doesn't seem optimal next to my solution. I would like to apply it to vīrus as well.
Finally, would it be possible and/or desirable to give two plural columns inside one table specifically for cases like these, with a title like "Fourth-declension noun with a second-declension variant in the plural only"? If implemented, alternative forms for the 5th declension could be given in the same manner without having to encumber the page with separate tables. And this could be used where the 4th declension has 3rd declension variants, like genoris from genu. Brutal Russian (talk) 03:01, 7 June 2021 (UTC)
Leveraging existing Latin templates and modules for new page or data forms
I have a development background but not in Wikimedia. I have long searched for a leverageable word list for use in drilling Latin vocabulary and declensions but have not been successful in finding a list that meets the requirements, those being: identifiable nominative form, with macrons where appropriate, including gender, plurality, declension number/pattern, stem, alternative stem if not discernible from the nominative, and perhaps a hundred other considerations, maybe even a thousand. The closest dataset I find for that is either Lewis and Short or the data included in or behind the Latin language pages on Wiktionary, such as [[1]]. The Lewis and Short dataset I found from the Perseus project would be extremely complicated to parse since dictionaries are written for human consumption and interpretation.
I know that all of you who have created and supported these Latin pages, templates, and modules have put a lot of work into them, and I thank you for that. I hope you don't mind that I would like to leverage that work to make datasets of computer-readable data, and I hope some of you might participate and help. In other words, I want to extract large amounts of the Latin vocabulary into tables that can feed a program to help me in Latin drills. My interest is primarily in doing this for my own benefit, but my work would be freely available as pure open source for any other non-commercial use.
What I'd like to do is figure out how to extract a list of all those noun lemmas from the category pages and use the templates/modules to turn HTML page data into table-form vocabulary lists. Since my interest is primarily in creating the drill/practice software, I'm hoping to avoid duplicating what appears to be years of effort in developing those objects, with some help from someone who understands them already. For example: is there underlying data, or did 30,000 or more Latin word pages actually get created manually, one at a time?
Any suggestions on how I might go about finding that kind of help? It's not my intention to spam this request, but I might post it on one or two of the Latin word pages as well, in the hopes of getting it in front of those actively supporting those Latin word pages. Diprestonus (talk) 10:50, 4 September 2023 (UTC)
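As one possible starting point for the lemma list, the standard MediaWiki API can enumerate category members. The Python sketch below only builds the request parameters and parses one JSON batch, so it runs without network access; the HTTP call itself is left to whatever client you prefer. The category name "Category:Latin nouns" and the helper names build_params and parse_batch are assumptions made for illustration, and this covers listing page titles only, not extracting declension data.

```python
API_URL = "https://en.wiktionary.org/w/api.php"  # standard MediaWiki API endpoint
CATEGORY = "Category:Latin nouns"  # assumed category name; check it on the site


def build_params(category, cont=None, limit=500):
    """Build query parameters for one list=categorymembers request."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmnamespace": "0",  # main namespace only, i.e. dictionary entries
        "cmlimit": str(limit),
        "format": "json",
    }
    if cont:
        params["cmcontinue"] = cont  # resume where the previous batch ended
    return params


def parse_batch(response):
    """Return (page titles, continuation token or None) from one JSON response."""
    members = response.get("query", {}).get("categorymembers", [])
    titles = [member["title"] for member in members]
    cont = response.get("continue", {}).get("cmcontinue")
    return titles, cont
```

A driver loop would send build_params(CATEGORY, cont) to API_URL, pass the decoded JSON to parse_batch, and repeat until the returned token is None.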