Northern Sámi IPA module
I'm not sure I understand your suggestion. Could you give some examples?
In any case, stress is more or less predictable once the foot boundaries are known (i.e. the first syllable of every foot is stressed), but I don't think an algorithm can accurately determine where those boundaries are, even for words with only four syllables. Take for instance buotveagalaš "all-mighty", which has four syllables but divides them between its feet as 1-3 rather than 2-2.
Loanwords like studeanta add further complications, such as having two feet instead of the expected one.
I think the best solution is just to allow the user to input how many syllables there are per foot (e.g. inputting something like 1,3 in the template for buotveagalaš or 1,2 for studeanta). Template:fi-pronunciation uses dashes for compounds to indicate where the stresses are (so something like spelling out buot-veagalaš with the template).
Maybe extra inspiration can be found in other templates (though I haven't looked through all of them myself).
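To make the syllables-per-foot idea concrete, here is a minimal Python sketch. It is not how the module currently works: the function names `parse_feet` and `mark_stress` are hypothetical, the syllables are supplied by hand rather than computed, and the output is orthographic with stress marks rather than real IPA. Following the comment below, every stress after the first is marked as secondary.

```python
PRIMARY = "ˈ"
SECONDARY = "ˌ"


def parse_feet(param):
    """Turn a template-style parameter such as "1,3" into [1, 3]."""
    return [int(part) for part in param.split(",")]


def mark_stress(syllables, feet):
    """Place a stress mark before the first syllable of every foot.

    syllables: list of already-split syllables (an assumed input).
    feet: number of syllables in each foot, in order.
    The first foot gets primary stress, all later feet secondary stress.
    """
    if any(count < 1 for count in feet):
        raise ValueError("every foot needs at least one syllable")
    if sum(feet) != len(syllables):
        raise ValueError("foot sizes do not add up to the syllable count")

    result = []
    position = 0
    for index, count in enumerate(feet):
        mark = PRIMARY if index == 0 else SECONDARY
        # Syllables inside one foot are separated by dots.
        result.append(mark + ".".join(syllables[position:position + count]))
        position += count
    return "".join(result)


print(mark_stress(["buot", "vea", "ga", "laš"], parse_feet("1,3")))
print(mark_stress(["stu", "dean", "ta"], parse_feet("1,2")))
```

The check that the foot sizes add up to the syllable count is there so that a mistyped parameter fails loudly instead of silently producing a wrong transcription.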
Oh, that's actually a pretty neat solution! I hadn't thought of that yet.
My question was rather one of notation. Take the word álbmotjienasteapmi as an example: it's a compound of two words, one of two syllables and one of four syllables. How would you denote the pronunciation of this word in IPA, including stress markers?
If I'm reading your comments correctly, you want to distinguish between secondary and tertiary stress for long words like this. I would just transcribe all of the stresses besides the first one as secondary stresses, as I'm not sure how else to transcribe them.
For Western dialects (e.g. Kautokeino), I'd transcribe it as /ˈaːlːpmohˌjie̯naˌstea̯pmiː/ in broad transcription and [ˈɑːlɑ̯pmohˌjie̯nɑˌsteæ̯pmiː] in narrow transcription. The narrow transcription may not be relevant here, but I've added it for completeness' sake. There are a few things to comment on:
- How my broad transcription differs from the generated output:
  - <t> is pronounced like /h/ at the end of the first lexeme
  - The consonant cluster <st> is completely after the syllable boundary instead of being split by it
- How my narrow transcription differs from both:
  - The vowel qualities: [ɑ] instead of /a/; [eæ̯] instead of /ea̯/
  - The glide vowel (typically transcribed as /ə/) in the cluster <lbm>
The stress marking as transcribed by the module is mostly correct in this instance, except that the cluster <st> is split by the syllable boundary.
I was not aware of the syllabification of <st>. Are there any other consonant clusters that behave that way?
In general I've had some difficulty finding detailed sources on Northern Sami phonology, which is why the template only describes the Kautokeino dialect. That's the only one described by a source I could find. If you know more, the Wikipedia article about Northern Sami could definitely do with some love.
The Oxford Guide to the Uralic Languages, chapter 10 "North Sami" (https://doi.org/10.1093/oso/9780198767664.003.0010), is the most detailed work on Northern Sámi phonology I've found. It is not entirely comprehensive: it says very little about glide vowels (which it calls "subglottal pulses"), nothing about preglottalised nasals, and offers no analysis of coastal dialects or Torne dialects. Still, it covers a lot. Another comprehensive work is "An analysis of North Saami gradation" in the journal Phonology (https://doi.org/10.1017/S0952675712000115), though it is mostly about the phonetic realization of consonant grades. My technical understanding comes mostly from the first work, with a smattering of other works to fill in some gaps, including "Samisk grammatikk" by Klaus Peter Nickel (ISBN 82-7374-201-6). "The Saami languages : an introduction" by Pekka Sammallahti (ISBN 82-7374-398-5) also goes into great detail on pronunciation, though it uses the Uralic Phonetic Alphabet instead of the IPA.
Also, it seems I was partially incorrect in my comment on the consonant cluster <st>: the consonant clusters <sk>, <šk> and <st> are either split by the syllable boundary or appear entirely after it (e.g. either jienas-tit or jiena-stit). When they stand at the beginning of a component of a compound word, they always come after the boundary (e.g. čála-standárda). The clusters <bm>, <dn>, <dnj> and <gŋ> always appear after the syllable boundary when they follow another consonant (e.g. vuoi-gŋa, bár-dni, fier-bmi), per "Samisk grammatikk", pp. 33-34. This last group is unlikely to come up, as I don't think these clusters can ever appear between feet.
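The cluster rules above could be encoded roughly as follows. This is only a sketch in Python under my own assumptions: the function name `boundary_options` and the `starts_compound_part` flag are hypothetical, the caller is assumed to have already isolated the consonant cluster between two vowels, and any cluster not mentioned above is simply reported as not covered.

```python
# Clusters that may be split by the syllable boundary or follow it entirely.
OPTIONAL_SPLIT = ("sk", "šk", "st")
# Clusters that stay together after the boundary when a consonant precedes.
# "dnj" is listed first so that it is tested on its own before "dn".
KEPT_TOGETHER = ("dnj", "bm", "dn", "gŋ")


def boundary_options(cluster, starts_compound_part=False):
    """Return the possible (before boundary, after boundary) splits.

    An empty list means the rules described here say nothing about
    this cluster.
    """
    if cluster in OPTIONAL_SPLIT:
        whole_after = ("", cluster)
        if starts_compound_part:
            # At the start of a compound component the cluster is not split.
            return [whole_after]
        return [(cluster[:-1], cluster[-1:]), whole_after]

    for tail in KEPT_TOGETHER:
        # Only applies when at least one consonant precedes the cluster.
        if cluster.endswith(tail) and len(cluster) > len(tail):
            return [(cluster[:-len(tail)], tail)]

    return []


print(boundary_options("st"))        # as in jienas-tit or jiena-stit
print(boundary_options("st", True))  # as in čála-standárda
print(boundary_options("rdn"))       # as in bár-dni
print(boundary_options("rbm"))       # as in fier-bmi
```

Returning a list of options rather than a single answer keeps the optional split of <sk>, <šk> and <st> visible, so the choice between the two variants can be made elsewhere.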