
Module talk:tg-translit

From Wiktionary, the free dictionary

A piece of Tajik text: Zabon-i tojiki, ki dar Eron: forsi, va dar Afġoniston dari nomida mešavad, zabon-i davlati-yi kišvarho-yi Tojikiston, Eron va Afġoniston mebošad. Dar Üzbakiston, agarči zabon-i aqalliyat tojiki mahsub mešavad, vale dar Üzbakiston ziyoda az 15 million nafar ba tojiki guftugü mekunand. In zabon ba xonavoda-yi zabonho-yi hindu-avrupoi doxil mešavad. Dar majmüʾ: porsigüyon-i asil(forsi, tojiki, dari) ziyoda az 122 mln mardum mebošand. Ammo tamom-i porsigüyon-i jahon 222 mln hastand. Faqat ba güyiš-i tojiki 44 million nafar gap mezanand. Zabon-i točiki, yake az zabonho-yi bostontarin-i jahon ba šumor meravad. Davra-yi nav-i inkišof-i on dar asrho-yi 7-8 sar šudaast. Bo in zabon šoyironu navisandagon-i buzurg Rüdaki, Firdavci, Xayyom, Sino, Jomi,Mavlono, Hofiz, Doniš, Ayni, Lohuti, Tursunzoda va digaron asarho ejod kardaand.

Testing conversion of "е" after vowels and "ъ". AyeOyeUyeEyeYayeYoyeYuyeIyeIyeYEeʾyeayeoyeuyeeyeyayeyoyeyuyeiyeiyeyeeʾe --Anatoli (обсудить/вклад) 23:37, 17 April 2013 (UTC)

И, и


The letter И, и should be I, i, not Yi, yi. Which part of the code does this?

In the string Ин ин Ин. ин Ин, only the first character is transliterated correctly; see the test: In in In. in In --Anatoli (обсудить/вклад) 04:01, 18 April 2013 (UTC)

Fixed. --Z 04:09, 18 April 2013 (UTC)
Thank you! Does "%A" match the beginning of a line? Which documentation are you using? I'd like to read a bit more. --Anatoli (обсудить/вклад) 04:27, 18 April 2013 (UTC)
NP. %A is "all characters not in %a", and %a represents all ASCII letters, so %A includes whitespace as well, which we don't want. Here is comprehensive documentation. --Z 06:30, 18 April 2013 (UTC)
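To make the point about %A concrete, here is a rough sketch in Python rather than Lua. It assumes, as the comment above does, that %a means ASCII letters only (Scribunto's `mw.ustring` may treat it more broadly), and it uses the regex class `[^A-Za-z]` as a stand-in for %A. The helper name `preceded_by_non_letter` is made up for this illustration and is not part of the module.

```python
import re

# Stand-in for Lua's %A under the assumption that %a = ASCII letters only.
NON_LETTER = r"[^A-Za-z]"

def is_non_letter(ch: str) -> bool:
    """True if a single character belongs to the complement class."""
    return re.fullmatch(NON_LETTER, ch) is not None

def preceded_by_non_letter(text: str, target: str) -> bool:
    """True if `target` occurs somewhere directly after a non-letter character."""
    return re.search(NON_LETTER + re.escape(target), text) is not None

# Spaces, punctuation and Cyrillic letters all fall into the complement class.
print(is_non_letter(" "), is_non_letter("."), is_non_letter("и"), is_non_letter("a"))

# A character class must consume one character, so it cannot match
# the empty position at the very start of the string.
print(preceded_by_non_letter("ин", "и"))   # nothing precedes the first letter
print(preceded_by_non_letter(" ин", "и"))  # a space precedes it
```

In this sketch the complement class matches a space but never the start of the string itself, which would fit the symptom reported above, where only some occurrences of Ин were handled.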
Thank you for the link and explanation. Something happened, though. The test: AyeOyeUyeEyeYayeYoyeYuyeIyeIyeYEeʾyeayeoyeuyeeyeyayeyoyeyuyeiyeiyeyeeʾe doesn't produce what it should at the moment. "е" after vowels should be "ye". User:Dijan knows the exact details. --Anatoli (обсудить/вклад) 06:36, 18 April 2013 (UTC)
My mistake... fixed now. --Z 06:42, 18 April 2013 (UTC)
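The rule being tested here ("е" becomes "ye" after a vowel or "ъ", and "e" otherwise) can be sketched as follows. This is a minimal Python illustration, not the module's Lua code: the letter table `BASE` is a deliberately tiny, hypothetical subset, the set `TRIGGERS` is an assumption about which letters count, and word-initial "е" is not handled because the thread does not discuss it.

```python
# Hypothetical, deliberately small letter table (not the module's real data).
BASE = {
    "а": "a", "о": "o", "у": "u", "э": "e", "и": "i", "ӣ": "i",
    "ӯ": "ü", "я": "ya", "ё": "yo", "ю": "yu", "ъ": "ʾ",
    "д": "d", "к": "k", "м": "m", "н": "n", "р": "r", "т": "t",
}

# Letters after which "е" is written "ye" in this sketch: vowels plus "ъ".
TRIGGERS = set("аоуэиӣӯяёюеъ")

def translit(text: str) -> str:
    """Transliterate lowercase-folded text, choosing "ye" or "e" by context."""
    out = []
    prev = ""  # previous Cyrillic character; empty at the start
    for ch in text.lower():
        if ch == "е":
            out.append("ye" if prev in TRIGGERS else "e")
        else:
            # Characters missing from the table are passed through unchanged.
            out.append(BASE.get(ch, ch))
        prev = ch
    return "".join(out)

print(translit("ае"), translit("ъе"), translit("де"))
```

The check looks at the previous Cyrillic letter rather than at the Latin output, so digraphs such as "ya" do not need special treatment.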
It occurs to me that the sequence аа should be transliterated "aya" (for verb forms like кардаам, гирифтаанд, etc.) -- [Ric Laurent] 13:38, 18 April 2013 (UTC)
No. The rules of epenthesis are the same for both Farsi and Tajiki for their corresponding vowels. The only ones I'm mentioning here are the ones that aren't obvious in the written form. The verb forms that you cited correspond to Farsi کرده‌ام (karde'am) and گرفته‌اند (gerefte'and). In literary/formal Tajiki, as in literary/formal Farsi, they are pronounced with the glottal stop, while in colloquial/spoken they both omit the glottal stop and only one "a" is pronounced (the "e" being dropped in the case of Farsi and "a" being pronounced instead). --Dijan (talk) 15:50, 18 April 2013 (UTC)
I've seen it written quite specifically that in Tajik it's /aja/, but whatever, I don't know anything. -- [Ric Laurent] 00:41, 19 April 2013 (UTC)
Now I'm curious. As far as I remember, I don't think it is. If it were, it would be written ая. Tajiki compensates for "ya", "yo", and "yu" in spelling. But, if you can find where it was written as such, let me know. :) --Dijan (talk) 05:25, 19 April 2013 (UTC)
From Tajiki Reference Grammar for Beginners by Nasrullo Khojayori and Mikael Thompson (which, for the record, uses a wonderfully modest Cyrillic font):
"However, the pronunciation differs from the spelling (which is purely historical)."
"The present perfect is formed by adding the predicate endings to the past participle. Note that (1) the 3rd singular аст is written joined to the participle, and (2) although a й is sometimes added between the participle and the predicate endings, it is not indicated in writing with yoted letters (thus хондаам can be pronounced [хондаям])."
[Ric Laurent] 13:06, 19 April 2013 (UTC)
I'm not sure what that means exactly. He says "can be pronounced", but not that it actually is in the standard/literary language. It's possible that it is a dialectal feature. He points out various differences in pronunciation between the northern and southern dialects.
This is what Shinji Ido says in "Tajik" (2005) about pronominal clitics on page 26, "The 'buffer' sound /j/ ... is inserted between a vowel other than /a/ and a pronominal clitic that follows it."
An example of this would be хонаам (xona-am, "my house"), which would be pronounced with a glottal stop after the last syllable in хона in the literary language, and without the glottal stop and with only one а in colloquial speech.
John R. Perry in "Tajik Persian Reference Grammar" (2005) says "A euphonic -y- is inserted after a word ending in a vowel other than -a;". --Dijan (talk) 08:51, 20 April 2013 (UTC)
Thanks, ZxxZxxZ. @Dijan. It seems WT:TG TR needs some notes on how transliteration should work. --Anatoli (обсудить/вклад) 00:08, 19 April 2013 (UTC)

Ӯ ӯ


I was recently editing the Wikipedia article on the Tajik alphabet, and it seems that the usual transliteration of the letter ӯ (ü) is ū, but here on Wiktionary it's ü. (It was changed from ū to ü in this edit by @Dijan.) The pronunciation of the letter is /ɵː/ according to Wikipedia (though I doubt the vowel is actually distinctively long, since quality serves to distinguish it from the other vowels). In most languages (for instance, German, Turkish, Hungarian), ü represents /y/ or /ʏ/.

Ū suggests the value /uː/, which is rather far from the real Tajik pronunciation. (It's not even historically correct: I gather from the table in w:Persian phonology#Historical shifts that the vowel descends from Early New Persian /oː/. A historical transcription would be ō.) So it's a very misleading transliteration to use, even though it is a direct representation of the Cyrillic letter ӯ (ü), composed of у (u) plus a macron.

The typical pronunciation of ü in other languages is much closer to the Tajik pronunciation of ӯ (ü): it's front or near-front and rounded. But I think it would make far more phonetic sense to transcribe ӯ (ü) as ö, which has the vowel quality [ø] or [œ] in German, Turkish, Hungarian, and Finnish. These symbols are canonically defined as front, but rounded front vowels are often somewhat centralized: i.e., closer to [ɵ], the Tajik vowel. So, ö would be the best transliteration, if the transliteration is meant to suggest the phonetic value. Otherwise, it would be better to go back to the standard transliteration, ū. -- Eru·tuon 07:56, 22 December 2016 (UTC)

I should probably ask: @Dijan, why did you change the transliteration of ӯ (ü) from ū to ü? I think ü is probably more understandable phonetically, but ö would be even better, since the Tajik vowel is mid like ö, not close like ü. -- Eru·tuon 08:01, 22 December 2016 (UTC)

Unfortunately, I do not recall exactly why this was changed in 2011. In my opinion, ū is misleading as it suggests just a longer variant of u. I believe it was probably to differentiate from regular u but also keep it aesthetically similar to u for transliteration purposes. However, feel free to change it to whatever you think is more appropriate. Regarding "usual" or standard transliteration vs Wiktionary, as long as I have been here, Wiktionary has always opted for its own transliteration standards. Whether they are based on other systems or not is somewhat irrelevant on Wiktionary. Dijan (talk) 14:08, 22 December 2016 (UTC)
Personally, I prefer ū because of its graphical similarity to the Cyrillic; I don't think we can represent it well phonetically whatever we might do. But I don't speak Tajik. --Μετάknowledge (discuss/deeds) 17:40, 22 December 2016 (UTC)

ӣ and и


Hi @Babr. I don't know if it was a good idea to make both letters transliterate as "i" and insert hyphens before the final "и". It's not always ezâfe: босмачи (bosmač-i). Anatoli T. (обсудить/вклад) 04:53, 20 September 2024 (UTC)

@Atitarev ???, Vozhaju lists this word as босмачӣ, not босмачи. -- BABR (talk) 16:57, 20 September 2024 (UTC)
@Babr: I checked a few Tajik resources before posting (there are also Uzbek Cyrillic ones with the same spelling). It may not be 100% accurate, but it is more common. In any case, the merge to match Persian is not fully justified. Also, you never responded to the question regarding the Persian transliteration of ع; just reminding you, in case you missed it. Thanks for responding. Anatoli T. (обсудить/вклад) 03:08, 21 September 2024 (UTC)
@Atitarev It seems many of them are Uzbek, and news agencies from Tajikistan overwhelmingly use босмачӣ. Ezafe is an extremely basic inflection that is used on Russian loanwords too. If, for example, босмач was a word, босмачӣ and босмачи would mean different things. I don't think we should try to regularize Tajik based on Russian orthography, especially when it's clear news agencies have a preference for what they think is correct.
That change of the transliteration of ع was based on a conversation between @Fenakhay, @Saranamd and me. Feel free to discuss it with them, I honestly do not care what the transliteration is, tbh. -- BABR (talk) 03:48, 21 September 2024 (UTC)
@Babr: Hi, thanks for the reply and your efforts, the Persian transliterations are much better but I think you overdid the Tajik.
  1. Tajik босмачи is of Turkic origin, most likely borrowed from Uzbek where -chi/-чи is a suffix, modern Latin spelling bosmachi, the plural is w:tg:босмачиҳо. босмач (bosmač) with no ending is not a word in Tajik.
  2. The Russian singular term басма́ч (basmáč) is a back-formation from the same source, Uzbek bosmachi, which was perceived as plural by Russians. The plural is басмачи́ (basmačí).
I oppose the introduction of the hyphen "-" for Tajik ezâfe because the spelling/transliteration/pronunciation are very close in Tajik and hyphens are used in the orthography. The specific word aside, I mean why do we need to hyphenise the ezâfe духтари хушрӯ (duxtar-i xušrü, a beautiful girl) if Tajiks themselves don't? And why should letters ӣ (i) and и (i) produce the same result "i" when they are different letters and sounds?
I don't know many cases but in босмачи, the last syllable is stressed and it's part of the word, not a suffix. босмачигӣ (bosmačigi) and босмачигарӣ (bosmačigari) are derivations, both meaning "Bosmachi movement". I don't care about being close or not to the Russian orthography, it's just the way it is. The spelling босмачӣ (bosmači) is also found, also as an adjective.
I will ping you people about the Persian ع separately in a more appropriate page but pls let me know if there was a Wiktionary discussion I can read. Anatoli T. (обсудить/вклад) 01:43, 22 September 2024 (UTC)
@Atitarev, Again there is no reason why Tajik spelling should be analyzed based on other languages when news agencies in Tajikistan already make it clear which spellings they view as being more correct. Also, why do we put hyphens in Arabic and Persian translations when they themselves do not? Transliterations are not 1-1 to the language's orthography because their purpose is to make it easy to understand by the reader. The fact that a language does not do something in its native orthography has never been a reason not to make its transcription more clear. That is not a standard we currently hold other translations to.

"I don't know many cases but in босмачи the last syllable is stressed and it's part of the word, not a suffix"
-ӣ does NOT imply that the ending is a suffix (though it can be, its purpose is not to mark a suffix). Actually, it is -и that is always a suffix and has an unstressed pronunciation and is not considered part of the word. Whereas "босмачӣ" would be read as an independent grammatical object, "босмачи" would be possessive or a modified noun. Also final -ӣ always becomes -и when pluralized? In fact the article you yourself linked only uses the singular form "босмачӣ", the only times where "босмачи" is used is in the plural or inflected forms (e.g. Босмачиён, Босмачиҳо, notable since -иён is only used for words ending in -ӣ). I am well aware that "босмач" is not a word in Tajik, I quite literally said "for example" because what I am trying to explain is that "босмачи" implies either possession or description of "bosmač" by the following subject or adjective. -- BABR (talk) 06:35, 22 September 2024 (UTC)
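For readers following the disagreement, the behaviour under discussion (both ӣ and и rendered as "i", with a hyphen inserted before a word-final и) can be sketched roughly as below. This is a Python illustration under stated assumptions, not the module's actual Lua code: the letter table `LETTERS` covers only the examples quoted in this thread, and the function names are hypothetical.

```python
import re
import unicodedata

# Hypothetical letter table covering only the examples quoted in this thread.
LETTERS = {
    "б": "b", "о": "o", "с": "s", "м": "m", "а": "a", "ч": "č",
    "д": "d", "у": "u", "х": "x", "т": "t", "р": "r", "ш": "š",
    "и": "i", "ӣ": "i", "ӯ": "ü",
}

def translit_word(word: str) -> str:
    """Transliterate one word, treating a word-final и as a hyphenated "-i"."""
    hyphen = len(word) > 1 and word.endswith("и")
    stem = word[:-1] if hyphen else word
    latin = "".join(LETTERS.get(ch, ch) for ch in stem)
    return latin + "-i" if hyphen else latin

def translit(text: str) -> str:
    # NFC keeps ӣ and ӯ as single code points so they stay inside \w+ words.
    text = unicodedata.normalize("NFC", text).lower()
    return re.sub(r"\w+", lambda m: translit_word(m.group(0)), text)

print(translit("духтари хушрӯ"))
print(translit("босмачи"), translit("босмачӣ"))
```

Under this rule босмачи and босмачӣ differ only by the hyphen in the output, which is the point of contention above: the hyphen marks every final и as ezâfe whether or not it is one in a given word.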