Wiktionary talk:About Kashubian
Things to take care of
- verb conjugations - four present tense types, past is mostly analytic with gendered past forms
- nouns - declensions (what if the suffixal pronouns were part of the declension table?)
- adjectives - declensions (looks easily automatable)
- adverbs - done?
- pronouns declension tables - Will we be dealing with the suffixal pronouns (-ów, bratów, bratow-égò, -in, Anin, Anin-égò), should these be part of the noun declension table? Or perhaps a different table, like Hungarian's possessive tables?
- numerals - declension
- prepositions - shouldn't really need anything special
- conjunctions - shouldn't really need anything special
- interjections - shouldn't really need anything special
- affixes - shouldn't need anything special
Vininn126 (talk) 17:44, 2 January 2022 (UTC)
Palatalization
@Thadh, @Luxtaythe2nd How are we gonna handle palatalized consonants? With /Cʲ/ or /Cj/? Vininn126 (talk) 07:32, 3 January 2022 (UTC)
- I'd prefer j. Palatalised vowels are often spelled more 'modestly' and are not fully spelled /j/, sometimes even as far as /i/ in some dialects. Luxtaythe2nd (talk) 07:39, 3 January 2022 (UTC)
- @Luxtaythe2nd: I think the question is: how often does one actually palatalise the preceding consonant (so, double articulation occurs) rather than just stick a glide after it (compare Russian and Polish). AFAICT, the latter is more often the case. Thadh (talk) 10:43, 3 January 2022 (UTC)
- It might be that it's both, /Cʲj/. If it is, we might wanna do like we do on Polish and just mark the /j/. Vininn126 (talk) 10:51, 3 January 2022 (UTC)
Derived and Related terms
@Thadh @Luxtaythe2nd What sort of formatting is gonna be preferred for Kashubian? col4, der boxes, or what? Vininn126 (talk) 18:01, 4 January 2022 (UTC)
Declension Modules
@Benwing2 I know you're probably flooded with tasks (between Catalan and also the Polish pronunciation module!), but could I ask for your help making Kashubian declension modules? I recently wrote up w:Kashubian grammar. I don't believe everything exists there, namely a few alternations, such as in adjectives where we have a situation like brzëdczi/brzëdkô but człowieczi/człowieczô (where there is no k/cz alternation). If not I can try my hand but I'm not the best programmer. Vininn126 (talk) 11:24, 3 February 2024 (UTC)
- @Vininn126 I'll take a look. It's probably possible to leverage the Czech declension module I wrote. I'm basically done with the Catalan stuff and finishing up some changes to Module:en-headword. Haven't forgotten about the Polish pronunciation module :) Benwing2 (talk) 21:34, 3 February 2024 (UTC)
- @Benwing2 Yeah, and there should be some common features such as the same alternations (but I'm not sure if we want a common module...) Vininn126 (talk) 21:37, 3 February 2024 (UTC)
- @Vininn126 There's already Module:inflection utilities and Module:parse utilities that are used by (almost) all the inflection modules I've written, so I don't think we need to share the Czech and Kashubian modules themselves given that there are surely a zillion little differences. Benwing2 (talk) 21:46, 3 February 2024 (UTC)
- @Benwing2 By common I meant something like MOD:cs-common but for Kashubian. Not as in shared. Vininn126 (talk) 23:42, 3 February 2024 (UTC)
- @Vininn126 I see. That module exists mostly because of stuff shared among nouns, verbs and adjectives (although the verb module still isn't finished ...). Benwing2 (talk) 01:07, 4 February 2024 (UTC)
- @Benwing2 Yeah, I was looking - Kashubian has some regular vowel/consonant alternations but aside from that I'm not sure there's much else. Vininn126 (talk) 09:02, 4 February 2024 (UTC)
- @Vininn126 Is there a website containing inflected Kashubian nouns, verbs, adjectives? That would help a lot in developing the modules. If not that, is there an online Kashubian dictionary? I notice there's a grammar of Kashubian you've linked (in Kashubian, ugh ... Kashubian is not one of the supported languages in Google Translate), but there are usually tons of special cases where you need a good inflecting tool to know how to inflect them properly. Benwing2 (talk) 09:16, 4 February 2024 (UTC)
- BTW I am going to start with adjectives because they are the easiest. I notice however that one of the statements on the linked Wikipedia page is that Kashubian has a lot of short adjectives; do you have a reference on how they work and/or a list of the most common ones? Benwing2 (talk) 09:17, 4 February 2024 (UTC)
- @Benwing2 sloworz.org exists but it has mixed quality and very few adjectival declensions - we have a lot of nouns with inflections already. It's somewhat easier than Polish - feminine genitive plural always as -ów (and we would give a zero inflection manually), and ki/gi goes to czi/dżi, and I'm not sure if it's in w:Kashubian language, but ko/go/mo/wo and ku/gu/mu/wu go to kò/gò/mò/wò and kù/gù/mù/wù. You also see vowel alternations which I described in Kashubian language.
- In the case of ki/czi - we would need a way to turn off this alternation as there are some adjectives that keep -cz- in the stem throughout, compare głãbòczi/głãbòką vs człowieczi/człowieczą.
- As for adjectives and my article - I used Treder as a base and he included much more dialectal information. The Kashubian grammar I linked describes a standard. I can explain anything, as I can make out Kashubian when reading (and tables transcend most barriers).
- For short form adjectives I think these should be supplied manually, but they're formed much the same way as Polish - the final vowel is removed and if the final consonant is voiced, you see "lengthening" (i.e. the vowel alternations I mentioned). Vininn126 (talk) 09:24, 4 February 2024 (UTC)
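The k/cz and g/dż handling described above could be sketched roughly as follows. This is only an illustration in Python, not the module code: the names `stem_of`, `attach` and the `velar` flag are hypothetical, and the rules are limited to what is stated in this thread (k/g become cz/dż before -i, with a switch for adjectives like człowieczi that keep -cz- throughout).

```python
# Hypothetical sketch; the rules are taken only from the discussion above.
SOFT_TO_HARD = (("dż", "g"), ("cz", "k"))
HARD_TO_SOFT = {"k": "cz", "g": "dż"}


def stem_of(lemma: str, velar: bool = True) -> str:
    """Return the stem of a long-form adjective lemma ending in -i or -y.

    With velar=True, a final cz/dż before -i is read as softened k/g.
    Pass velar=False for lemmas that keep cz/dż in the stem throughout.
    """
    stem = lemma[:-1]  # drop the final -i / -y
    if velar and lemma.endswith("i"):
        for soft, hard in SOFT_TO_HARD:
            if stem.endswith(soft):
                return stem[: -len(soft)] + hard
    return stem


def attach(stem: str, ending: str) -> str:
    """Join stem and ending, softening a final k/g before an ending in i-."""
    if ending.startswith("i") and stem[-1:] in HARD_TO_SOFT:
        return stem[:-1] + HARD_TO_SOFT[stem[-1]] + ending
    return stem + ending


print(attach(stem_of("głãbòczi"), "ą"))
print(attach(stem_of("człowieczi", velar=False), "ą"))
```

Keeping the hard consonant in the stored stem means the switch is only needed once, when reading the lemma; every later ending goes through the same `attach` step.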
- @Vininn126 Do the short-form adjectives exist only in the masculine nominative singular or (like Czech) do they exist in other genders and numbers? Benwing2 (talk) 09:51, 4 February 2024 (UTC)
- @Benwing2 Makùrôt says they are indeclinable, along with some German-origin adjectives (page 72). (This reminds me, could we add a parameter indecl=1 to {{pl-adj}}?) Vininn126 (talk) 09:54, 4 February 2024 (UTC)
- Also there might be some particular verb/noun forms with alternations outside of the ones I mentioned, I'll see if I can't find out what conditions them. Vininn126 (talk) 13:47, 4 February 2024 (UTC)
- @Vininn126 I am trying to implement an adjective declension module. The first thing I notice is that the Wikipedia page you wrote isn't really accurate about hard adjectives (or at least is confusing). You give młodi as an example of a hard adjective but p. 73 of Makùrôt gives bëlny with -y. The next page gives mòdri as an example of a hard adjective with -i but doesn't explain why some hard adjectives have -y and some have -i. Even more important is how to distinguish hard from soft adjectives by the lemma; there's a comment at the top of p. 73 that maybe says soft adjectives end in ń, cz, dż, sz, or ż, but I'm not sure. Also your Wikipedia table lists four possible instrumental plural endings for hard adjectives, and two possible endings for genitive plural, locative plural and masculine personal accusative plural (what about dative plural, should that also have two endings)? What is the difference in usage between these endings? Benwing2 (talk) 02:38, 5 February 2024 (UTC)
- I also have a question about the possessive declensions in -ów (p. 78) and -in (p. 79). There are two masculine accusative forms listed; I take it the first is inanimate, the second is animate. But there are also two forms listed for neuter nom/acc/voc. Are these simply alternants, or is there a meaning difference? Also what is the difference between endings like -owa, -owò and -òwa, -òwò? Are these dialectal differences? Finally, the text on p. 79 says this:
- W òtmianie dosebnëch znankòwnikach wszëtczich ôrtów zakùńczonëch na -ów, -owa (-òwa), -owò (-òwò), jeżlë sufiksë te nachôdają sã pò tilnojãzëkòwëch spółzwãkach k, g, h abò pò lëpnëch spółzwãkach p, b, f, w, m, zachôdô labializacjô.
- W òtmianie dosebnëch znankòwnikach wszëtczich ôrtów zakùńczonëch na -in, -ina, -ino, jeżlë sufiksë te nachôdają sã pò spółzwãkach s, z, c, dz, n, pò tëch spółzwãkach pisze sã -y (nié: -i);
- W òtmianie dosebnëch znankòwnikach zakùńczonëch na -in, -ina, -ino mòże dochadac do zmianë fòneticzny pòstacë sufiksów, tj. pòjôwiają sã fòrmë sufiksów -ëna, -ëno (fòneticzny zmianie nie pòdlégô leno sufiks -in).
- What does this mean, esp. the references to different sorts of ending consonants, e.g. k/g/h, p/b/f/w/m, s/z/c/dz/n? Thanks! Benwing2 (talk) 08:11, 5 February 2024 (UTC)
- @Benwing2 -y is used only after c, dz, s, and z (but not cz) and n to prevent a "soft" reading (i.e. not ći, dźi, ńi, etc), as <y> is read the same as <i>. -imi is the preferred instrumental plural ending.
- -o- vs -ò- is the result of the alternation I mentioned above, i.e. occurring after the mentioned letters (labials and velars). So you can probably set up a table of regular endings and then give rsubs, potentially, unless you think that's a bad way of doing it. Vininn126 (talk) 09:39, 5 February 2024 (UTC)
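To make the "table of regular endings plus rsubs" idea concrete, here is a rough Python sketch. It is not how any existing module does it; `RSUBS` and `join_with_rsubs` are made-up names, the consonant lists are copied from the comments above, and treating sz like cz (keeping -i) is my own assumption.

```python
import re

# Each pair is (pattern, replacement), applied to "stem|ending", where "|"
# marks the stem/ending boundary. The rule set is an assumption based only
# on the comments in this thread.
RSUBS = [
    # o -> ò after k, g, h and after the labials p, b, f, w, m
    (re.compile(r"(?<=[kghpbfwm])\|o"), "|ò"),
    # i -> y after c, dz, s, z, n, but not after the digraphs cz, sz
    (re.compile(r"(?:(?<=[csn])|(?<=z)(?<![cs]z))\|i"), "|y"),
]


def join_with_rsubs(stem: str, ending: str) -> str:
    """Join stem and ending, then apply the boundary substitutions in order."""
    form = stem + "|" + ending
    for pattern, replacement in RSUBS:
        form = pattern.sub(replacement, form)
    return form.replace("|", "")


print(join_with_rsubs("bëln", "i"))
print(join_with_rsubs("młod", "i"))
```

The ending table itself stays regular (-i, -owa, -owò and so on), and anything spelling-related lives in one ordered list, so a new rule is a one-line addition.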
- Btw I'm trying to figure out certain alternations - you have for kazac: kôżã/kôzôł. Gołąbk's Polish-Kashubian dictionary gives forms but it can be frustrating because it's Polish-Kashubian not Kashubian-Polish. A general rule is that infinitives and imperatives will have a/e/o/ë, and present and past forms will have ô/é/ó/u (of course exceptions yada yada, but that should be the default).
- I think most noun vowel alternations are similar to Polish... Vininn126 (talk) 10:07, 5 February 2024 (UTC)
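The default verb rule above could be written down as a small lookup. This is a hedged Python sketch only: `LENGTHENED` and `lengthen_last_vowel` are invented names, and pairing the vowels as a/ô, e/é, o/ó, ë/u is an assumption read off the order of the two lists, with kazac/kôz- as the only attested example here.

```python
# Assumed pairing, read off the order of the two lists in the comment above.
LENGTHENED = {"a": "ô", "e": "é", "o": "ó", "ë": "u"}
VOWELS = set("aąãeéëioòóôuùy")


def lengthen_last_vowel(stem: str) -> str:
    """Swap the last vowel of the stem for its present/past counterpart.

    If the last vowel is not one of a/e/o/ë, the stem is returned unchanged.
    """
    for i in reversed(range(len(stem))):
        if stem[i] in VOWELS:
            return stem[:i] + LENGTHENED.get(stem[i], stem[i]) + stem[i + 1:]
    return stem


print(lengthen_last_vowel("kaz") + "ôł")
```

Exceptions would still have to be supplied by hand, so this only makes sense as the default that an override can switch off.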
- I also realized I forgot to mention that possessive adjectives do have a different declension, sorry about that... Added them to 'pedia, at least. Vininn126 (talk) 10:12, 5 February 2024 (UTC)
- @Vininn126 see User:Benwing2/test-csb-adecl for the implementation so far. As for kazac having kôżã, if this is the pres 1sg and you're referring to the consonant alternation, then this is similar to class 6 of Russian verbs, e.g. сказа́ть (skazátʹ); the last consonant iotates in the present tense. If however you're referring to the vowel alternation, then I'm not so sure but I imagine it has to do with stress movement at an earlier stage; cf. the aforementioned Russian verb, which is class 6c, where "c" means it has a mobile accent with end stress in the infinitive, present 1sg and outside the present tense, and root stress otherwise in the present. (The particular way that the accent moves in Russian may not reflect Proto-Slavic. In Bulgarian, for example, the accent movement in mobile-accent verbs is I think opposite, with root stress in pres 1sg and end stress elsewhere in the present.) Benwing2 (talk) 02:47, 6 February 2024 (UTC)
- I have added support for dwa. Please check the endings and let me know if they need tweaking (and if there are multiple endings please let me know what the difference is so I can footnote the endings appropriately). Thanks, Benwing2 (talk) 08:00, 6 February 2024 (UTC)
- I believe all these forms so far are correct, but I'll run it by some natives with linguistic knowledge that I've been talking to. How do you feel about using virile/non-virile instead of masculine plural and non-masculine plural, as that's part of Kashubian grammar like in Polish? I also wonder if it's better to put plural at the bottom or on the right. I'm referring to the vowel alternation with that form. Vininn126 (talk) 08:25, 6 February 2024 (UTC)
- @Vininn126 Personally I think Polish should adopt 'personal masculine plural' and 'non-personal plural' in place of 'virile' and 'non-virile', because (a) it's more explanatory (how are people going to know that "virile" means "personal masculine plural"?), and (b) I'd like to eliminate "virile" and "non-virile" because they're ad-hoc genders created only for Polish. As for where to put the plural (bottom or right), it's currently on the bottom because that's how the Czech tables do it, but I have no problem putting it on the right; probably makes more sense that way since there are only two columns (vs. 4 for Czech, which line up with the singular columns). Benwing2 (talk) 08:32, 6 February 2024 (UTC)
- @Benwing2 While it is something used mostly about Lechitic languages, it is used in literature and we have an entry for it as well. We do include some terminology specific to some languages. Vininn126 (talk) 08:36, 6 February 2024 (UTC)
- I could perhaps be swayed, but I'd like to check some various other grammars to see how they talk about it. Vininn126 (talk) 08:48, 6 February 2024 (UTC)
- OK, that's fine with me. Benwing2 (talk) 08:49, 6 February 2024 (UTC)
- If you look at Module:gender and number/data you'll see that only "virility" is truly language-specific. I don't like it for a lot of reasons, not just the language-specificity. An additional issue is that it is intersecting two categories (gender and animacy) and creating a third ad-hoc one ("virility"), which seems wrong to me; all other intersectional categories are specified just by listing the components that go into the intersection. I do see that it is used in some of the literature but IMO that doesn't mean we need to adopt it given its liabilities. Benwing2 (talk) 08:48, 6 February 2024 (UTC)
- @Benwing2 I suppose both your proposal and virile/non-virile are used, and some people have complained about the use of virile before (though I'm sure once we change some other people will complain as well). This all comes back to wanting to revamp Polish declension on the whole. If you want to change the headwords/categorization, I suppose there'd be some sense in it (by the way, any word on giving the (categorizing) indeclinable parameter to {{pl-adj}}?) Vininn126 (talk) 08:57, 6 February 2024 (UTC)
- (BTW the recently split-off Pannonian Rusyn also has the virile/nonvirile distinction adopted, following the pattern in Polish and other Lechitic languages) Thadh (talk) 10:43, 6 February 2024 (UTC)
- Yes, if we make this change, we'd have to change a lot of West Slavic languages. Is that something you have an opinion on, @Thadh Vininn126 (talk) 10:45, 6 February 2024 (UTC)
- Not really but I have to add that "non-personal masculine plural" is a) a mouthful and b) probably often incorrectly parsed (it's non-[personal masc. pl.], not [non-personal] [masc. pl.]). Thadh (talk) 10:50, 6 February 2024 (UTC)
- @Benwing2: I get your point about virile being the only thing Polish-specific in the module but… some grammatical categories are language-specific (maybe we should consider having more language-specific terminology in the modules than we do now, then).
- I’d be generally against using non-personal plural (as it doesn’t suggest that eg. feminine personal plurals fall under this form), or the direct translation from Polish, ie. non-masculine personal plural as it can be parsed many different ways (while Polish niemęskoosobowy is fairly clear it’s anything that either isn’t masculine or isn’t personal). Here non-virile is self-explanatory if you understand virile. // Silmeth @talk 22:00, 6 February 2024 (UTC)
- @Silmethule @Thadh I have used "other than masculine personal" in the headers, which I think works fairly well, see User:Benwing2/test-csb-adecl for examples. The problem I see is not only that virile is Polish-specific but that it's not very clear to anyone who doesn't already know Polish grammar, while masculine personal is self-explanatory. Essentially we're introducing additional terminology, which will be opaque to the average reader, simply to make the headers shorter. (An additional issue is that sometimes "virile" seems to imply plural, which I find especially problematic, and sometimes it doesn't.) Benwing2 (talk) 22:24, 6 February 2024 (UTC)
- @Benwing2 As for the errors I'm not really sure forms like -enże exist in Kashubian, I can't find anything for tenże. Otherwise ten should just start with t- and take adjectival declension from there. Vininn126 (talk) 09:05, 7 February 2024 (UTC)
- @Vininn126 Yeah those errors are just placeholders, unmodified from the Czech test page. Benwing2 (talk) 09:32, 7 February 2024 (UTC)
- @Benwing2 Well if that's the case, adjectives might be ready already aside from the pronoun. Are possessive pronouns also handled? Vininn126 (talk) 09:38, 7 February 2024 (UTC)
- @Vininn126 Do you mean mój, twój, swój, naj/naji/nasz, waj/waji/wasz and Wasz? They aren't handled yet, nor jeden nor niżóden. BTW what does niżóden mean and are there others like these pronouns/numerals? I see mention of taczi, chtëren, jaczi but I don't know what these mean nor how they are declined. It shouldn't be too hard to add support for these terms; it's just a case of adding some special cases so that they work with <irreg>. I also see things in Sloworz like jaczikòlwiek and jacziż that appear to be declined as jaczi + fixed kòlwiek or ż; these can be handled using e.g. {{csb-adecl|jaczi<irreg>kòlwiek}} and {{csb-adecl|jaczi<irreg>ż}} once I implement support for jaczi.
- Something of note is the noun jaczel [1], which declines with a stem jakl-. Here we have not only a mobile e but some sort of palatalization before the mobile e, which would need to be handled. Is there a special Kashubian palatalization that occurs post-Proto-Slavic, and if so how does it work?
- One more thing is with jãczmiéń [2] to take a random example, which declines with stem jãczmieni-. This has both a spelling alternation ń ~ ni and a vowel alternation é ~ e. I think you mention the latter down below, and for the former there is a close analogy with Czech d t n (occurring before i or ě) and ď ť ň (occurring anywhere else). To handle these alternations sanely, the underlying stem is always of the ď ť ň form, and changes like ňe -> ně happen at the very end of declension generation. Here there would be something analogous, where underlying stems use ń ś ź ć dź (did I miss anything?), which is converted to ni si zi ci dzi as needed at the end of declension generation. Benwing2 (talk) 10:05, 7 February 2024 (UTC)
- @Benwing2 I do. niżóden is the same lexeme as Polish żaden ("no" "none"). We have entries for the others, they are "such", "which", "what" respectively and taczi and jaczi have -k- stems.
- As for jaczel; I believe I wrote about this palatalization in w:Kashubian language in features, where ki/gi and ke/ge palatalize, so I wonder if the best solution would simply be to provide the stem? There are many other such examples, and, I believe, some rare cases; compare articzël. Also Nôdżel - Nagiel, cerczew - cerkwiô, marchew (obsolete marchiew) - marchwiô (the latter two are examples of feminine nouns with mobile e and old soft final -w, like in Polish and Russian).
- You have ń/n, there is no ś/ź - s/z in Kashubian, palatal sibilants and affricates merged with their dental counterparts. You should also have something like ł/l. Vininn126 (talk) 10:17, 7 February 2024 (UTC)
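A rough Python sketch of the "underlying stem, respell at the very end" approach, restricted to ń/ni since the palatal sibilants do not apply to Kashubian. The function name `respell` is hypothetical, and writing plain n (not ni) before an i-ending is an assumption by analogy with Polish spelling; ł/l is left out because its treatment was not spelled out above.

```python
import re

VOWELS = "aąãeéëioòóôuùy"
# Underlying ń directly followed by a vowel letter.
_SOFT_N = re.compile(f"ń([{VOWELS}])")


def respell(form: str) -> str:
    """Convert underlying ń to the spelling ni before a vowel.

    Before i the result is just ni (not nii); elsewhere ń is kept.
    """
    def fix(match: re.Match) -> str:
        vowel = match.group(1)
        return "n" + ("" if vowel == "i" else "i") + vowel

    return _SOFT_N.sub(fix, form)


print(respell("jãczmień" + "a"))
print(respell("jãczmiéń"))
```

With this, endings are attached to a stem that always contains ń, and the spelling change happens once, after all forms have been generated.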
- Finally, can you say something about velar stems like anczerk [3] and how their declensions modify? I see for example that they tend to take vocatives in -ù rather than in -e (same as in Czech); there may be other differences as well. In addition, there is definitely some weird palatalization going on, e.g. abstinent [4] has locative and vocative singular abstinence where the -t gets palatalized to -c before -e (also before -ë in the nom plural), something I've not seen before and something I'll need to handle. Benwing2 (talk) 10:17, 7 February 2024 (UTC)
- @Benwing2 As you mentioned below from my article, velars usually take -ù on the whole in the locative, and the vocative is usually the same as the locative for masculine. I think some might not, but by default it should be -ù after -k/g/(c)h-.
- As for this, this is the so-called "softening" -e, so from Proto-Slavic -te you get Lechitic -cie -> Kashubian -ce (which is the merger I mentioned in my above message). So you get t/c alternation and d/dz alternation (as opposed to Polish t/ć (ci) and d/dź (dzi)). Vininn126 (talk) 10:22, 7 February 2024 (UTC)
- @Benwing2 -u is also used after z, s, dz, c, sz, ż, dż in masculine declensions. For vocative, it can match either locative or nominative. Also having checked, the situation with genitive singular for masculine nouns is similar to Polish, and also a similar situation with nominative plural endings. Vininn126 (talk) 13:53, 7 February 2024 (UTC)
- @Vininn126 OK. I don't know the specific situation in Polish with genitive singular and nominative plural endings, can you fill me in or point me to the appropriate resources? Benwing2 (talk) 21:09, 7 February 2024 (UTC)
- @Benwing2 I think our Polish noun module should have some clues - i.e. I think -owie is used more after r for example, and -e after rz, etc. I'm not sure about genitive singular as much - sometimes it defaults to -u or -a, and I'm still not sure what the conditions provided are. In Polish it can be somewhat unpredictable, as words ending with -(e)k can take either, for example. I'm not sure if we shouldn't just give -u as the default supplying -a if necessary for inanimate... Vininn126 (talk) 21:21, 7 February 2024 (UTC)
- @Benwing2 Also as to the vowel alternations before voiced consonants that I mentioned - I can't recall if I said there should be a way to turn that off somehow. Perhaps that's already taken care of. Vininn126 (talk) 21:23, 7 February 2024 (UTC)
- @Vininn126 Yeah there will be a way of turning off the vowel alternations. In general I think the defaults should be simpler unless it saves a lot of manual keying. I'll take a look at the Polish noun module. Benwing2 (talk) 21:53, 7 February 2024 (UTC)
- @Benwing2 How would you simplify the defaults? And I think it will save a lot of typing - there are more cases that I've described as opposed to fewer, if that makes sense. Vininn126 (talk) 21:54, 7 February 2024 (UTC)
- @Vininn126 What I mean is, e.g. defaulting to -a for animate but -u for inanimate or something rather than having more complex defaults for inanimate. The indicator for overriding the genitive singular would look like gena to specify that the genitive singular ends in -a so it's only a few keystrokes to specify it. OTOH the defaults for plurals in Czech are somewhat involved and work in most circumstances. Benwing2 (talk) 21:59, 7 February 2024 (UTC)
- BTW I'm not sure what you mean by "there are more cases that I've described as opposed to fewer". Benwing2 (talk) 22:00, 7 February 2024 (UTC)
- @Benwing2 Sorry, I see what you mean now! Well, I think if we look at the Polish module we can see where -a is given for inanimate, and I can confer and confirm. Vininn126 (talk) 22:01, 7 February 2024 (UTC)
- @Vininn126 I see, thanks. Can you give me a full declension for jaczi and taczi (or at least indicate what declensions they follow and how the stem alternations jacz- vs. jak- work)? Benwing2 (talk) 10:26, 7 February 2024 (UTC)
- @Benwing2 They take the long declensions and show the same alternations as głãbòczi/głãbòką (i.e. in the nominative masc. and similar forms as it's before -i). Vininn126 (talk) 10:29, 7 February 2024 (UTC)
- @Vininn126 I have added the various missing irregular terms to Module:User:Benwing2/csb-adjective. Benwing2 (talk) 01:36, 8 February 2024 (UTC)
- @Vininn126 Can you help me understand the alternations on p. 66 of Makùrôt's grammar? We have dôka with gen dôczi and dat dôce and wiérzta with gen wiérztë and dat wiérzce. I am guessing that the dative of dôka is due to the Slavic second palatalization but the others are due to a later Kashubian-specific "softening" you mentioned above. If this is the case, then this softening presumably occurs before e and i but not ë, and turns k -> cz, g -> ż (or dż?), t -> c, d -> dz, n -> ń, r -> rz, is that right? I say "Kashubian-specific" because AFAIK the Polish softening that produced ń, ś, ź, ć, dź, rz operated before Proto-Slavic e, ě, ẽ, i and ь, but this Kashubian softening seems a later process that operated after the merger of Proto-Slavic y and i (hence the gen dôczi before Proto-Slavic -y). Actually, reading the Wikipedia page on Kashubian language, it seems maybe that the Polish-type softening did occur, followed by a separate process that palatalized k and g before newly developed front vowels. Benwing2 (talk) 03:06, 8 February 2024 (UTC)
- @Benwing2 The new test cases look good!
- dôka -> dôczi is the Kashubian palatalization, and the shift of k/g->c/dz is indeed Slavic. Your understanding is absolutely right (I will clarify that it's g->dż). Is the ł/l alternation not part of this? Kashubian palatalization of k/g before i is a later process that didn't affect Polish, but Polish type softening did affect Kashubian (I believe it might even be cross-Lechitic), but then Kashubian changed how the palatal sibilants and affricates work, hardening them. Vininn126 (talk) 08:26, 8 February 2024 (UTC)
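For the consonant changes before the "softening" -e, something like the following Python sketch may help pin down what needs handling. The table only covers the pairs named so far in this thread (t/c, d/dz, r/rz, ł/l, and k/c, g/dz from the second palatalization); n and the labials are left out because their spelling was not settled here, and `soften_and_add_e` is a made-up name.

```python
# Consonant changes before the "softening" -e, as listed in the thread.
# n and the labials are deliberately omitted (their spelling is not settled).
SOFTEN_BEFORE_E = {
    "t": "c",
    "d": "dz",
    "r": "rz",
    "ł": "l",
    "k": "c",   # Slavic second palatalization
    "g": "dz",  # Slavic second palatalization
}


def soften_and_add_e(stem: str) -> str:
    """Soften the final consonant of the stem and add the ending -e."""
    last = stem[-1:]
    return stem[:-1] + SOFTEN_BEFORE_E.get(last, last) + "e"


print(soften_and_add_e("abstinent"))
print(soften_and_add_e("dôk"))
print(soften_and_add_e("wiérzt"))
```

Whether a given noun actually takes -e (rather than -ù, as velar masculines do by default in the locative) is a separate decision; this only covers the stem change once -e has been chosen.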
- @Vininn126 I forgot about the ł/l alternation. It seems labials also alternate, e.g. w/wi, m/mi, p/pi, b/bi, f/fi. Also ch/sz? WARNING: The more I look into this the more complex it seems. I will need a lot of help from you translating parts of the Makùrôt document, as I don't speak Kashubian and it's daunting trying to make sense of it. A part I need translating, for example, is p. 68 (and the first two lines of p. 69) that is the commentary on feminine nouns, as I'm trying to work on them. Also do you know if it's possible to search in sloworz.org for all words ending in a particular ending? I'm trying to find example declensions of different feminine nouns whose stem ends in various different consonants, and it's difficult just looking through the contents as a lot of the words have no declension and it's not possible (AFAICT) to determine which ones have declensions until I click through to them. Benwing2 (talk) 08:44, 8 February 2024 (UTC)
- It might make more sense to rewrite the Polish noun module first according to the way I normally write modules; after that the Kashubian stuff might make more sense. Not sure. Benwing2 (talk) 08:47, 8 February 2024 (UTC)
- @Benwing2 the alternations you mentioned occur mostly in the dative/locative ending -(i)e but also the nominative plural for virile nouns, that's right (but ch/sz is only in the nom pl except in rare cases).
- "Comments about Feminine nouns ending in a consonant"
- Syncretism of the accusative singular and nominative singular, with a zero ending (like Polish)
- Nominative, accusative, and vocative use -e or -ë (this will likely depend on the ending consonant)
- Genitive plural uses -ów, but sometimes -i/-y (I think this would be manual)
- For all genders dative plural always uses -óm, instrumental plural always -ama, and locative plural always -ach
- And as to rewriting the Polish module - yes, there might be some sense in that. Polish has a lot more materials, and both languages are Lechitic. What's more is that due to the (historically nationalistic) view that Kashubian is a dialect, very often Kashubian grammar is presented from a Polish point of view, with modifications. But that aside, there are still sound changes that affect alternations that might make sense to start from Polish - but also I worry about how complex Polish might be.
- With those last irregular words, does this mean the adjective module is ready? All the forms look correct to me. Vininn126 (talk) 08:54, 8 February 2024 (UTC)
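Since the plural oblique endings are the same for all genders, they can be generated from one small function. This Python sketch is illustrative only: `plural_oblique` and the dictionary keys are hypothetical, -ów is used as the genitive plural default described above, and the `gen_pl` argument stands in for whatever manual override the real module ends up using.

```python
from typing import Optional


def plural_oblique(stem: str, gen_pl: Optional[str] = None) -> dict:
    """Return the plural forms whose endings are shared by all genders.

    gen_pl, if given, is a complete manually supplied genitive plural
    (e.g. a form in -i/-y or a zero ending) replacing the default -ów.
    """
    return {
        "gen_pl": gen_pl if gen_pl is not None else stem + "ów",
        "dat_pl": stem + "óm",
        "ins_pl": stem + "ama",
        "loc_pl": stem + "ach",
    }


# "noc" is a placeholder stem used only to show the shape of the output.
for slot, form in plural_oblique("noc").items():
    print(slot, form)
```

Stem alternations are not applied here; they would need to run on the stem before it is passed in.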
- I also realized that there might be one more alternation between i/ã, so indeed working on a Polish module might be a good idea... Vininn126 (talk) 16:46, 8 February 2024 (UTC)
- @Vininn126 OK, let me start looking into the Polish modules. Thanks for the translation. I do think the adjective module is ready, modulo the issue with virile vs. masculine personal, which we can deal with later. Benwing2 (talk) 21:39, 8 February 2024 (UTC)
- @Benwing2 I think there's a preference for virile based on the discussion here. Perhaps in the future we can switch, but maybe sticking to the status quo for now will be best. If you agree, perhaps we could move the module and template to mainspace. Vininn126 (talk) 15:21, 9 February 2024 (UTC)
- @Vininn126 There's another issue I found with the virile/non-virile classification, which is that it doesn't work for things like dwa and òba, which have a four-way masculine personal, masculine non-personal, feminine and neuter distinction in the plural (or at least three ways; the masculine non-personal and neuter are syncretic). This is yet another reason why I think the virile/non-virile concept needs to be discarded. Another issue I see is that the Polish tables have 'virile' in the plural but 'masculine personal/animate' in the singular when in fact 'virile' = 'masculine personal'. To avoid inconsistencies and confusions like this I'd still rather stick with what I have at least for Kashubian; we can leave Polish as-is for now and try to change it later. What do you think? Benwing2 (talk) 20:43, 9 February 2024 (UTC)
- @Benwing2 dwa and oba are true outliers within Lechitic, resulting from the tradition of distinguishing the dual. I'm not sure we should consider them when it comes to terminology (but I do see what you mean). (Where do you see these terms used in Polish? It seems a bit odd to me). Vininn126 (talk) 20:48, 9 February 2024 (UTC)
- @Vininn126 See acetylenowy and other adjectives. Benwing2 (talk) 21:25, 9 February 2024 (UTC)
- @Benwing2 I see what you mean. I think that singular/plural is a major distinction in Polish. This is also the result of some terms being singularia tantum vs. pluralia tantum. I very much understand where you're coming from. I think that there will be a discussion regarding these genders, but since all the languages mentioned so far are so entrenched in Polish linguistics and there's a status quo, it might make sense to continue with that for now. I understand this might add another step to changing the status quo for all languages in this region, but I think that to continue in that direction we should recognize what is currently established. I myself am somewhat split on the issue, leaning towards keeping virile, but I also understand the issue of keeping family-specific terms.
- As a result I see regularizing regionalisms as a stepping stone into a larger system of regularization. I hope this isn't too ramble-y. Vininn126 (talk) 21:41, 9 February 2024 (UTC)
- @Vininn126 How about as a compromise, for now we use virile (= masculine personal) and non-virile in the headers? That way at least we're not using an opaque word like "virile" without properly glossing it. Benwing2 (talk) 08:57, 10 February 2024 (UTC)
- @Benwing2 Sounds fine. Vininn126 (talk) 09:06, 10 February 2024 (UTC)
- It might make more sense to rewrite the Polish noun module first according to the way I normally write modules; after that the Kashubian stuff might make more sense. Not sure. Benwing2 (talk) 08:47, 8 February 2024 (UTC)
- @Vininn126 I forgot about the ł/l alternation. It seems labials also alternate, e.g. w/wi, m/mi, p/pi, b/bi, f/fi. Also ch/sz? WARNING: The more I look into this the more complex it seems. I will need a lot of help from you translating parts of the Makùrôt document, as I don't speak Kashubian and it's daunting trying to make sense of it. A part I need translating, for example, is p. 68 (and the first two lines of p. 69) that is the commentary on feminine nouns, as I'm trying to work on them. Also do you know if it's possible to search in sloworz.org for all words ending in a particular ending? I'm trying to find example declensions of different feminine nouns whose stem ends in various different consonants, and it's difficult just looking through the contents as a lot of the words have no declension and it's not possible (AFAICT) to determine which ones have declensions until I click through to them. Benwing2 (talk) 08:44, 8 February 2024 (UTC)
- @Vininn126 Can you help me understand the alternations on p. 66 of Makùrôt's grammar? We have dôka with gen dôczi and dat dôce and wiérzta with gen wiérztë and dat wiérzce. I am guessing that the dative of dôka is due to the Slavic second palatalization but the others are due to a later Kashubian-specific "softening" you mentioned above. If this is the case, then this softening presumably occurs before e and i but not ë, and turns k -> cz, g -> ż (or dż?), t -> c, d -> dz, n -> ń, r -> rz, is that right? I say "Kashubian-specific" because AFAIK the Polish softening that produced ń, ś, ź, ć, dź, rz operated before Proto-Slavic e, ě, ẽ, i and ь, but this Kashubian softening seems a later process that operated after the merger of Proto-Slavic y and i (hence the gen dôczi before Proto-Slavic -y). Actually, reading the Wikipedia page on Kashubian language, it seems maybe that the Polish-type softening did occur, followed by a separate process that palatalized k and g before newly developed front vowels. Benwing2 (talk) 03:06, 8 February 2024 (UTC)
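The softening hypothesis posed in the comment above can be written down as a small Python sketch, purely for illustration. The names `soften`, `SOFTENING` and `TRIGGERS` are hypothetical, and the table encodes only the guess as stated (k → cz, g → ż, t → c, d → dz, n → ń, r → rz before e and i, but not before ë). It has not been checked against Makùrôt, and it deliberately does not cover the dative dôce, which would need a separate k → c rule.

```python
# Hypothetical sketch of the softening guessed at in the comment above.
# This is an unconfirmed assumption, not an established rule set.
SOFTENING = {"k": "cz", "g": "ż", "t": "c", "d": "dz", "n": "ń", "r": "rz"}
TRIGGERS = ("e", "i")  # assumed triggers; ë is assumed NOT to soften


def soften(stem: str, ending: str) -> str:
    """Attach ending to stem, softening the stem-final consonant before e/i."""
    if stem and ending.startswith(TRIGGERS) and stem[-1] in SOFTENING:
        return stem[:-1] + SOFTENING[stem[-1]] + ending
    # No trigger (e.g. ending in ë) or no softenable consonant: plain concatenation.
    return stem + ending
```

With the stems taken from the forms quoted above, `soften("wiérzt", "e")` yields wiérzce and `soften("dôk", "i")` yields dôczi, while `soften("wiérzt", "ë")` leaves the stem untouched.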
- @Vininn126 I have added the various missing irregular terms to Module:User:Benwing2/csb-adjective. Benwing2 (talk) 01:36, 8 February 2024 (UTC)
- @Benwing2 They take the long declensions and show the same alternations as głãbòczi/głãbòką (i.e. in the nominative masc. and similar forms as it's before -i). Vininn126 (talk) 10:29, 7 February 2024 (UTC)
- @Vininn126 Do you mean mój, twój, swój, naj/naji/nasz, waj/waji/wasz and Wasz? They aren't handled yet, nor jeden nor niżóden. BTW what does niżóden mean and are there others like these pronouns/numerals? I see mention of taczi, chtëren, jaczi but I don't know what these mean nor how they are declined. It shouldn't be too hard to add support for these terms; it's just a case of adding some special cases so that they work with
- @Benwing2 Well if that's the case, adjectives might be ready already aside from the pronoun. Are possessive pronouns also handled? Vininn126 (talk) 09:38, 7 February 2024 (UTC)
- @Vininn126 Yeah those errors are just placeholders, unmodified from the Czech test page. Benwing2 (talk) 09:32, 7 February 2024 (UTC)
- @Vininn126 see User:Benwing2/test-csb-adecl for the implementation so far. As for kazac having kôżã, if this is the pres 1sg and you're referring to the consonant alternation, then this is similar to class 6 of Russian verbs, e.g. сказа́ть (skazátʹ); the last consonant iotates in the present tense. If however you're referring to the vowel alternation, then I'm not so sure but I imagine it has to do with stress movement at an earlier stage; cf. the aforementioned Russian verb, which is class 6c, where "c" means it has a mobile accent with end stress in the infinitive, present 1sg and outside the present tense, and root stress otherwise in the present. (The particular way that the accent moves in Russian may not reflect Proto-Slavic. In Bulgarian, for example, the accent movement in mobile-accent verbs is I think opposite, with root stress in pres 1sg and end stress elsewhere in the present.) Benwing2 (talk) 02:47, 6 February 2024 (UTC)
- I also have a question about the possessive declensions in -ów (p. 78) and -in (p. 79). There are two masculine accusative forms listed; I take it the first is inanimate, the second is animate. But there are also two forms listed for neuter nom/acc/voc. Are these simply alternants, or is there a meaning difference? Also what is the difference between endings like -owa, -owò and -òwa, -òwò? Are these dialectal differences? Finally, the text on p. 79 says this:
- @Vininn126 I am trying to implement an adjective declension module. The first thing I notice is that the Wikipedia page you wrote isn't really accurate about hard adjectives (or at least is confusing). You give młodi as an example of a hard adjective but p. 73 of Makùrôt gives bëlny with -y. The next page gives mòdri as an example of a hard adjective with -i but doesn't explain why some hard adjectives have -y and some have -i. Even more important is how to distinguish hard from soft adjectives by the lemma; there's a comment at the top of p. 73 that maybe says soft adjectives end in ń, cz, dż, sz, or ż, but I'm not sure. Also your Wikipedia table lists four possible instrumental plural endings for hard adjectives, and two possible endings for genitive plural, locative plural and masculine personal accusative plural (what about dative plural, should that also have two endings)? What is the difference in usage between these endings? Benwing2 (talk) 02:38, 5 February 2024 (UTC)
- Also there might be some particular verb/noun forms with alternations outside of the ones I mentioned, I'll see if I can't find out what conditions them. Vininn126 (talk) 13:47, 4 February 2024 (UTC)
- @Benwing2 Makùrôt says they are indeclinable, along with some German-origin adjectives. (page 72) (This reminds me, could we add a parameter
- @Vininn126 Do the short-form adjectives exist only in the masculine nominative singular or (like Czech) do they exist in other genders and numbers? Benwing2 (talk) 09:51, 4 February 2024 (UTC)
- @Vininn126 Is there a website containing inflected Kashubian nouns, verbs, adjectives? That would help a lot in developing the modules. If not that, is there an online Kashubian dictionary? I notice there's a grammar of Kashubian you've linked (in Kashubian, ugh ... Kashubian is not one of the supported languages in Google Translate), but there are usually tons of special cases where you need a good inflecting tool to know how to inflect them properly. Benwing2 (talk) 09:16, 4 February 2024 (UTC)
- @Benwing2 Yeah, I was looking - Kashubian has some regular vowel/consonant alternations but aside from that I'm not sure there's much else. Vininn126 (talk) 09:02, 4 February 2024 (UTC)
- @Vininn126 I see. That module exists mostly because of stuff shared among nouns, verbs and adjectives (although the verb module still isn't finished ...). Benwing2 (talk) 01:07, 4 February 2024 (UTC)
- @Benwing2 By common I meant something like MOD:cs-common but for Kashubian. Not as in shared. Vininn126 (talk) 23:42, 3 February 2024 (UTC)
- @Vininn126 There's already Module:inflection utilities and Module:parse utilities that are used by (almost) all the inflection modules I've written, so I don't think we need to share the Czech and Kashubian modules themselves given that there are surely a zillion little differences. Benwing2 (talk) 21:46, 3 February 2024 (UTC)
- @Benwing2 Yeah, and there should be some common features such as the same alternations (but I'm not sure if we want a common module...) Vininn126 (talk) 21:37, 3 February 2024 (UTC)
Nouns
@Vininn126 I am looking into what is needed to create a noun declension module. I am going to start with the Czech module, but allow gender to be omitted since it seems largely predictable from the ending of the noun (it's mandatory in Czech because there are so many words with exceptional or ambiguous endings that auto-predicting the gender leads to lots of mistakes). In regards to the following text from the Wikipedia article (my comments are in red):
Comments about the singular:
- Masculine nouns that end in a voiced consonant show regular vowel alternations of: ô:a, ó:o, é:e, ą:ã, i/u:ë. Mobile e (e:∅) also appears in some stems.
- Can you give me some examples of nouns that have vowel alternations, and nouns with mobile e? How predictable are the vowel alternations, how predictable is the mobile e, and is the latter similar to Polish? In Czech, vowel alternation is unpredictable when it occurs and mobile e is semi-predictable. To deal with this, I have a symbol # to request vowel alternations and various defaults to decide whether to generate a mobile e; if this is wrong, it can be enabled using * or disabled using -*.
- Like in Polish, there is irregularity with genitive singular -u/-a, where animal/personal nouns always get -a, but inanimate nouns may get both. In northern dialects, -u may be replaced with -ë.
- Czech and Ukrainian also have irregular case endings, and Ukrainian in particular has a comparable issue with -u vs. -a in the genitive. To deal with this, there will be defaults as well as case overrides.
- Dative singular shows two endings, -ewi (for soft nouns)/-owi (for hard nouns) and -u. It has been suggested (BY LORENTZ gram 872) that personal/animal nouns have a preference for -ewi/-owi. Rarely an ending -owiu has been used by combining both endings (compare Masurian -oziu). An ending -ë (from a short /u/) exists in North-East Kashubia. Finally, the adjectival ending -omù (hard)/-emù (soft) is also used in the North-East.
- Same here except that I'm not sure how much dialectal info we should include. My instinct is to start with only the standard.
- The instrumental singular ending -ã is used in the North-West for stylistic reasons or for rhymes.
- As above concerning dialectal info.
- The locative singular ending -(i)e is for hard stems and -(i)u is for soft stems or nouns whose stems end with -k/-g/-ch, as well as -s/-z. An ending -ë (from a short /u/) exists in North-East Kashubia.
- As above concerning dialectal info.
- Masculine nouns ending in -a decline femininely in the singular and masculinely in the plural.
Comments about the plural:
- The nominative plural has multiple endings, including -owie, -ë, -e, -i.
- Can you give some rules of thumb (i.e. defaults) for when the different plural endings occur? We will need to have defaults in combination with case overrides.
- The dative plural ending -ama may occasionally be seen.
- Use case overrides if this is part of the standard.
- The instrumental plural endings -mi (without -a-) and -i are rare.
- Use case overrides if this is part of the standard.
- The locative plural ending -ech can be seen in some names of countries, but is falling out of use and being replaced by -ach.
- Use case overrides if this is part of the standard (or omit if it's falling out of use).
Benwing2 (talk) 05:21, 7 February 2024 (UTC)
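The vowel alternations listed in the comments about the singular (ô:a, ó:o, é:e, ą:ã, i/u:ë) could be prototyped roughly as follows. This is a minimal Python sketch, not the module itself: the function name `oblique_stem` and the flag `alternate_high` are hypothetical, the i/u:ë alternation is made opt-in purely as an assumption, and the code only looks at the last vowel of a consonant-final nominative without checking whether the final consonant is voiced.

```python
import unicodedata

# Nominative vowel -> oblique vowel, following the list ô:a, ó:o, é:e, ą:ã.
REGULAR = {"ô": "a", "ó": "o", "é": "e", "ą": "ã"}
# i/u:ë is kept separate so a caller can opt in (hypothetical design choice).
HIGH = {"i": "ë", "u": "ë"}
VOWELS = set("aãąeéëiòoóôuùy")


def oblique_stem(nominative: str, alternate_high: bool = False) -> str:
    """Return the stem used before endings, alternating the last vowel."""
    word = unicodedata.normalize("NFC", nominative)
    table = {**REGULAR, **HIGH} if alternate_high else REGULAR
    # Walk backwards to the last vowel and swap it if the table lists it.
    for i in range(len(word) - 1, -1, -1):
        if word[i] in VOWELS:
            return word[:i] + table.get(word[i], word[i]) + word[i + 1:]
    return word  # no vowel found; nothing to alternate
```

Under these assumptions `oblique_stem("brzég")` gives brzeg and `oblique_stem("dąb")` gives dãb, while `oblique_stem("lud")` stays lud unless `alternate_high=True` is passed.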
- @Benwing2
- These alternations most frequently occur before word-final voiced consonants, and a few others such as -rz, and usually appear in the masculine nominative singular, so the example of brzég/brzegu I have in the article, but also lód/lodu, sôd/sadu, dąb/dãbù, lud/lëdu; i/ë is rarer, and grzib/grzëba would be an example. I think with -i- alternations a better approach might be to not have them on by default, showing the alternation manually as you suggest. Other vowel alternations would be good to have on by default. Mobile e is much rarer in Kashubian, but you have for example len/lnu, wesz/wsza; notably, you don't get it as much with -k- based diminutives (which generally won't have it in the nominative). Some number do exist with -ek, and they'd work like the examples I gave. You also get it less in the feminine/neuter genitive plural, but by default those should have -ów and a zero morpheme could be supplied manually.
- With -u/-a, I think we should assume -u for inanimate and -a for animate, refer to Gołąbk's dictionary for confirmation, and generally not give -ë. There might be cases where -a is used more by default, but I'm not sure if it's restricted by ending or instead lexically restricted.
- We should give -owi by default and be able to supply -u in special cases - otherwise -ewi is restricted and I agree we don't need to give those dialectal forms.
- -ã should actually be the default for all instrumental singular, and I should update that in the article.
- Agree, we shouldn't give -ë.
- The plural is like Polish and can be incredibly difficult to predict for masculine virile/personal nouns. Some regular patterns exist, like -ôrz takes -arzë. Another thing I don't discuss in the article is how -i/-y/-ë can cause regular consonant "softening", similar to the ones we see in Polish. We might want to take a look at the Polish module for some clues.
- Let's not give -ama for dative plural.
- For the instrumental plural, we should give -ama by default, overriding when necessary.
- Let's give -ach by default. Vininn126 (talk) 08:39, 7 February 2024 (UTC)
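The defaults agreed on in this reply could be captured in a table with per-slot overrides, along the following lines. This is only a Python sketch under stated assumptions: the slot names, `default_endings` and `decline` are hypothetical and do not reflect any existing module's interface. It covers only the slots discussed here (with -óm for the dative plural, as noted earlier in the thread), and it applies no consonant softening and no u/ù spelling adjustment.

```python
from typing import Dict, List, Optional


def default_endings(animate: bool) -> Dict[str, str]:
    """Default masculine endings as proposed in this thread (standard only)."""
    return {
        "gen_sg": "a" if animate else "u",  # -a for animate, -u for inanimate
        "dat_sg": "owi",                    # -u only via override
        "ins_sg": "ã",
        "dat_pl": "óm",
        "ins_pl": "ama",
        "loc_pl": "ach",
    }


def decline(stem: str, animate: bool,
            overrides: Optional[Dict[str, List[str]]] = None) -> Dict[str, List[str]]:
    """Build forms from the defaults, then let overrides replace whole slots."""
    forms = {slot: [stem + ending]
             for slot, ending in default_endings(animate).items()}
    for slot, values in (overrides or {}).items():
        forms[slot] = list(values)  # an override replaces the default outright
    return forms
```

For example, `decline("lod", animate=False)["gen_sg"]` gives `["lodu"]`, matching lód/lodu above, and a dative singular in -u would be supplied as `overrides={"dat_sg": ["lodu"]}`.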
Patterns
@Benwing2 BTW, I've finished what I can of Module:User:Vininn126/csb-noun-examples.lua and I'm having a native with some linguistics experience check it. There's a lot more leveling in forms, I'd say. Other than that I've been working on completely rewriting the Wikipedia article on Slovincian (it turns out accent is probably a misanalysis on Lorentz's part, so we might end up changing the headword template. Need to finish researching.) Vininn126 (talk) 07:31, 14 March 2024 (UTC)
- @Vininn126 Interesting. I am surprised to hear that there's no free accent whatsoever in Slovincian; I thought it was accepted wisdom that Slovincian (as well as adjacent Northern Kashubian dialects up until 1945 or so) had free and mobile accent. I would definitely believe though that Lorentz's claims of phonemic length and of multiple types of accents are wrong. Apologies that the work on Polish noun declensions has stalled a bit; there have been some RL issues recently (e.g. a death in the family) that are getting in the way and distracting me a bit, so I have been focusing on things requiring less thought. Should be getting back to Polish declensions soon. In the meantime I'll take a look at what you've done with Module:User:Vininn126/csb-noun-examples.lua; thank you very much for creating it and getting it cross-checked with a native speaker. Benwing2 (talk) 07:39, 14 March 2024 (UTC)
- @Benwing2 Sorry, what I meant to say was there is no tone, there is definitely mobile accent. And I'm sorry to hear that. Vininn126 (talk) 07:40, 14 March 2024 (UTC)