Module talk:sa-utilities

Internal sandhi


@JohnC5 User:SodhakSH is interested in extending {{sa-decl-noun-m}} and similar templates to work with consonant-stem nouns. It looks like internal_sandhi needs some work to support them, e.g. for nouns in -c it doesn't properly convert c to g before bh and to k before ṣ. Since you created this code, do you know what work remains and approximately how to do it? Thanks! Benwing2 (talk) 21:13, 2 May 2021 (UTC)
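For illustration, the two conversions mentioned above can be sketched as follows. This is a minimal Python sketch, not the module's Lua code: `join_c_stem` is a hypothetical helper, it works on unaccented IAST, and it covers only stems in final -c before bh-, s- and ṣ-initial endings (the s of the ending is assumed to become ṣ after k).

```python
def join_c_stem(stem: str, ending: str) -> str:
    """Join a stem in final -c to an ending (unaccented IAST).

    Hypothetical helper for illustration only; it is not part of
    Module:sa-utilities and handles just the two cases named above.
    """
    if not stem.endswith("c"):
        return stem + ending
    base = stem[:-1]
    if ending.startswith("bh"):
        # c becomes voiced g before the voiced ending: vāc + bhis
        return base + "g" + ending
    if ending.startswith("ṣ"):
        # c becomes k before an ending that already begins with ṣ
        return base + "k" + ending
    if ending.startswith("s"):
        # c becomes k, and the following s is retroflexed to ṣ: vāc + su
        return base + "k" + "ṣ" + ending[1:]
    # Vowel-initial endings leave the palatal untouched: vāc + ā
    return stem + ending


print(join_c_stem("vāc", "bhis"))
print(join_c_stem("vāc", "su"))
print(join_c_stem("vāc", "ā"))
```

Stems in -j and other finals would need their own branches, which this sketch deliberately leaves out.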

User:SodhakSH Also, do you know the rules for accent movement in Vedic? I am looking at Wikipedia's entry on Vedic Sanskrit grammar and it appears the accent moves around in different ways for different nouns, but I don't have a good reference for all the different classes/types of accent movement. Benwing2 (talk) 21:17, 2 May 2021 (UTC)

@Benwing2: No, I'm not that knowledgeable about Vedic accents and stuff. When I add a manual declension, though, I keep the accent on the same vowel unless Monier Williams mentions something different. For example, at क्षुभ्#Declension I kept the accent on the same u except in the instrumental singular, where MW specifically mentions kṣubhā́. 🔥शब्दशोधक🔥 02:41, 3 May 2021 (UTC)

@SodhakSH BTW you should not create multiple categories for things like id-stems, uc-stems, etc. All of them are simply consonant stems. Benwing2 (talk) 02:59, 3 May 2021 (UTC)

@Benwing2: I'll write up the accent rules in detail soon. And yes, the id-stems, ubh-stems, etc. are simply consonant stems. -- 𝓑𝓱𝓪𝓰𝓪 𝓭𝓪𝓽𝓽𝓪 (𝓽𝓪𝓵𝓴) 03:17, 3 May 2021 (UTC)

The same should go for category:Sanskrit_as-stem_nouns and category:Sanskrit_is-stem_nouns, then. 🔥शब्दशोधक🔥 03:24, 3 May 2021 (UTC)

@SodhakSH: No, the s-stem nouns behave differently depending on gender, and their accents also behave differently. For instance, the instrumental of vácas is vácasā, with the accent on the first syllable instead of the final syllable. There are other differences too. -- 𝓑𝓱𝓪𝓰𝓪 𝓭𝓪𝓽𝓽𝓪 (𝓽𝓪𝓵𝓴) 03:38, 3 May 2021 (UTC)

@Bhagadatta, SodhakSH Agreed. s-stems, n-stems and nt-stems are special cases that are not the same as consonant stems. We should follow Whitney in the division of declensions. Benwing2 (talk) 04:03, 3 May 2021 (UTC)

@SodhakSH, Bhagadatta, Benwing2: Sorry for the delay in responding to this. Yeah, I intended this module to be able to handle all of the verbal and nominal morphology but sort of got sidetracked by other work and stopped editing, well, anything. I'm happy to explain any of the code, though it may take me some time to refamiliarize myself with it. Also, I'm generally more responsive on Discord if you DM me. My general roadmap was to encode all of the nominal declensions in Whitney's Sanskrit Grammar and then start adding irregular and heteroclitic nouns before attempting to tackle verbs. Also, full disclosure: I am mostly a Vedicist, so that material generally had my priority.

The accentual alternations for the nouns are not so elaborate as people often fear (especially from the standpoint of a Kiparskyan model). They may involve somewhat more powerful regular expressions than what this module currently uses, but I'm not too worried. In terms of guiding principles for this code, I wanted to use SLP1 for generation because of its one-sign-one-sound property, which makes manipulating it much easier and more efficient. In the meantime, it might make sense for someone to create some test cases so that we know all of the sandhi behaviors that this complicated module should have and can also detect the many and varied bugs that can occur (e.g. the non-application of retroflexion of ṇ across compound boundaries like in वृत्रहन् (vṛtrahán)). It's not likely that I'll have time to develop this soon, but I'm certainly happy to advise and troubleshoot. -- *i̯óh₁n̥C[5] 03:24, 4 May 2021 (UTC)
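To illustrate the one-sign-one-sound property mentioned above, here is a minimal Python sketch of an IAST-to-SLP1 conversion. It is not the module's Lua code: `iast_to_slp1` is a hypothetical helper, it assumes unaccented input, and it greedily reads "ai" and "au" as diphthongs, so a genuine a + i hiatus would be mis-converted.

```python
import unicodedata

# IAST letters and digraphs whose SLP1 sign differs; everything else
# (a, i, u, e, o, k, g, c, j, t, d, n, p, b, m, y, r, l, v, s, h) is unchanged.
_RAW = {
    "ā": "A", "ī": "I", "ū": "U", "ṛ": "f", "ṝ": "F", "ḷ": "x",
    "ai": "E", "au": "O", "ṃ": "M", "ḥ": "H",
    "kh": "K", "gh": "G", "ṅ": "N", "ch": "C", "jh": "J", "ñ": "Y",
    "ṭh": "W", "ṭ": "w", "ḍh": "Q", "ḍ": "q", "ṇ": "R",
    "th": "T", "dh": "D", "ph": "P", "bh": "B", "ś": "S", "ṣ": "z",
}
# Normalize keys so precomposed and decomposed input compare equal.
IAST_TO_SLP1 = {unicodedata.normalize("NFC", k): v for k, v in _RAW.items()}


def iast_to_slp1(text: str) -> str:
    """Convert unaccented IAST to SLP1, trying two-letter units first."""
    text = unicodedata.normalize("NFC", text)
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in IAST_TO_SLP1:
            out.append(IAST_TO_SLP1[pair])
            i += 2
        else:
            out.append(IAST_TO_SLP1.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(iast_to_slp1("kṣubh"))
print(iast_to_slp1("vṛtrahan"))
```

After conversion each sound is exactly one character, so a sandhi rule can test or replace a single position in the string without first checking whether an h belongs to a preceding aspirate.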

@JohnC5 Thank you. I'm not on Discord and have never used it; I may have to join, though. For now, maybe you can explain the purpose of the various arguments passed as input_table in internal_sandhi: ambig_hint, j_to_z (= lemma j converts to z instead of something else?), final, has_accent (= there is an accent in the stem or ending? If so, why can't this be figured out automatically?), accent_override, mono (= the lemma is monosyllabic?), recessive. Benwing2 (talk) 03:52, 4 May 2021 (UTC)

@Benwing2: I have reviewed and commented the code (with numerous potential FIXMEs). It's clear that what is there now is neither sufficient nor correct, though I can't fix it at the moment. Please tell me if there are more questions. -- *i̯óh₁n̥C[5] 08:25, 4 May 2021 (UTC)