Module talk:Phli-translit
Latest comment: 6 years ago by Victar in topic Automated transliteration from Pahlavi to Latin is not possible
Doesn't work
@Victar, I tried turning this on and it didn't work. Could you write some test cases? —*i̯óh₁n̥C[5] 13:44, 17 March 2018 (UTC)
- @JohnC5, fixed. I copy-pasted the characters from the Unicode PDF, where they are apparently not encoded as the right characters. --Victar (talk) 15:24, 17 March 2018 (UTC)
Automated transliteration from Pahlavi to Latin is not possible
The Inscriptional Pahlavi and Book Pahlavi forms of the words do not contain the same amount of information as their Latin transliterations, because several Pahlavi letters each correspond to multiple Latin letters; see WT:PAL TR. For example, 𐭣𐭯𐭩𐭥𐭥 should be transliterated as dpywr, but it is automatically transliterated as dpyʿʿ in this entry. --Z 09:37, 18 March 2018 (UTC)
- @ZxxZxxZ: @Victar and I are aware of this, but it is better than nothing. We are of the opinion that it is far better to provide something, knowing that a smart editor will correct it where there is ambiguity, than to allow the multitude of wildly incorrect entries that exist because people enter incorrect transliterations that a simple transliteration system could have prevented. Rest assured that we did a lot of research before even working on this. —*i̯óh₁n̥C[5] 10:33, 18 March 2018 (UTC)
- This transliteration system has misled me as a reader. --Z 12:01, 18 March 2018 (UTC)
- @ZxxZxxZ: Would it help if we made a debug category for potentially ambiguous entries so that we can check them? —*i̯óh₁n̥C[5] 12:09, 18 March 2018 (UTC)
- Yes, that would help. --Z 13:07, 18 March 2018 (UTC)
- @ZxxZxxZ: You can find the ambiguous ones at Category:Automatic Inscriptional Pahlavi transliterations containing ambiguous characters. —*i̯óh₁n̥C[5] 22:01, 18 March 2018 (UTC)
- @ZxxZxxZ, Victar: On a side note, should we transliterate 𐭨 (ṭ) and 𐭤 (h) as Ṭ and E, respectively? And if so, should we, if we detect those characters, automatically treat that word as arameographic? Do arameograms ever take Middle Persian phonetic complements or affixes? The other nice thing about this is that, unless something is an arameogram, the mem is theoretically unambiguous. —*i̯óh₁n̥C[5] 00:29, 19 March 2018 (UTC)
- Ayin and qoph also appear only in arameograms, but arameograms may take affixes, so it's more complicated than that. --Z 06:20, 19 March 2018 (UTC)
- @ZxxZxxZ: In that case, should the module assume waw instead of ayin? —*i̯óh₁n̥C[5] 07:10, 19 March 2018 (UTC)
- Ayin has a distinct letter for Inscriptional Parthian, but for other scripts, we cannot distinguish waw and ayin, as we are not sure whether a word is Iranian or an arameogram, at least in its entirety. I still believe it is misleading, because we have two different transliteration systems now. --Z 10:45, 19 March 2018 (UTC)
- @ZxxZxxZ: Despite the ambiguity of the characters, I'm with John, and find it a net positive in the end for preventing errors. --Victar (talk) 14:55, 19 March 2018 (UTC)
- That said, we could use more ambiguous stand-ins for these characters, like (w͑˞) or (ẉ̇) for waw-ayin-resh or, heck even simply (*) or (?). --Victar (talk) 16:06, 19 March 2018 (UTC)
- Unfortunately, even if it is a good thing, it is not implemented in the correct manner yet; see "gnzʿbʿ" in գանձաւոր for example, which was about to confuse me. I think it should be reverted until we reach an agreement on a good strategy to implement it. --Z 08:35, 5 July 2018 (UTC)
- @ZxxZxxZ: What in the example you are giving is not "in the correct manner"? This is what the Unicode is going to look like (it's in development), so we need to get used to it, be it now or later. Install Yazdgerd, and perhaps you might feel the benefits more. --Victar (talk) 15:18, 5 July 2018 (UTC)
- I think you misunderstood me: The transliteration "gnzʿbʿ" is incorrect and misleading, it should be "gnzwbr" instead. --Z 18:06, 5 July 2018 (UTC)
- I did, but yes, as noted above, due to the ambiguity of the language, we'll have to manually specify certain characters. --Victar (talk) 21:03, 5 July 2018 (UTC)
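As an illustration of the approach discussed in this thread (automatic transliteration, a flag that could feed a debug category for ambiguous entries, and a manual override), here is a minimal sketch in Python. It is not the module's actual code: the function name `transliterate`, the `manual` parameter, and the partial `TRANSLIT` table are assumptions made for the example, and the default of ʿ for waw-ayin-resh is taken only from the "dpyʿʿ" output quoted above.

```python
# Hypothetical, partial mapping for Inscriptional Pahlavi; only the letters
# needed for the example from this thread are included.
TRANSLIT = {
    "\U00010B63": "d",  # DALETH
    "\U00010B6F": "p",  # PE
    "\U00010B69": "y",  # YODH
    "\U00010B65": "ʿ",  # WAW-AYIN-RESH (default assumed from the quoted output)
    "\U00010B6C": "m",  # MEM-QOPH (default assumed for this sketch)
}

# Letters that stand for more than one Latin letter and cannot be resolved
# automatically.
AMBIGUOUS = {
    "\U00010B65": ("w", "ʿ", "r"),
    "\U00010B6C": ("m", "q"),
}


def transliterate(text, manual=None):
    """Return (transliteration, needs_review).

    If an editor supplies a manual transliteration, it is used as-is and the
    entry is not flagged. Otherwise each character is mapped through TRANSLIT
    (unknown characters pass through unchanged) and the entry is flagged when
    it contains any ambiguous letter.
    """
    if manual is not None:
        return manual, False
    result = "".join(TRANSLIT.get(ch, ch) for ch in text)
    needs_review = any(ch in AMBIGUOUS for ch in text)
    return result, needs_review
```

With this sketch, `transliterate("\U00010B63\U00010B6F\U00010B69\U00010B65\U00010B65")` returns `("dpyʿʿ", True)`, so the entry would be flagged for checking, while passing `manual="dpywr"` returns `("dpywr", False)`.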