Module:pi-translit/documentation

From Wiktionary, the free dictionary

Interfacing

This module will transliterate Pali language text. It is also used to transliterate Sanskrit. The module should preferably not be called directly from templates or other modules. To use it from a template, use {{xlit}}. Within a module, use Module:languages#Language:transliterate.

For testcases, see Module:pi-translit/testcases.

Functions

tr(text, lang, sc)
Transliterates a given piece of text written in the script specified by the code sc, and language specified by the code lang.
When the transliteration fails, returns nil.
trwo(text, lang, sc, options)
Transliterates a given piece of text written in the script specified by the code sc, language specified by the code lang, and writing system specified by the optional argument options. When the transliteration fails, returns nil.
The table options may contain the following fields:
Name: impl
Description: Whether the writing system uses implicit vowels. The valid values are 'yes' and 'no'. The value 'both' causes the field to be ignored. The transliteration will normally determine whether the writing system uses implicit vowels by examining the text; this field is only used if examination of the text is inconclusive.
Scripts relevant for: Thai, Laoo
Default: 'yes'
Name: y
Description: How <y> is written. The letter NYO (ຍ) is used if the value is 'ຍ' or 'yung', and the letter YO (ຢ) is used if the value is 'ຢ' or 'yaa'. If YO is used, then NYO is assumed to be used for the letter <ñ>.
Scripts relevant for: Laoo
Default: If the text contains 'ຢ' then the default is 'yaa'; otherwise it is 'yung'.
Other fields will be ignored.
The fields correspond to writing system parameters of the inflection templates {{pi-decl-noun}} and {{pi-conj-special}}.
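The defaulting rule for the y field can be illustrated with a minimal sketch. This is not the module's code (the module itself is written in Lua); the function name letter_for_y is hypothetical, and raising an error for an unrecognised value is an assumption, since the documentation does not say how invalid values are treated.

```python
NYO = "\u0e8d"  # LAO LETTER NYO (ຍ)
YO = "\u0ea2"   # LAO LETTER YO (ຢ)

def letter_for_y(text, options=None):
    """Return the Lao letter taken to write <y> (hypothetical helper)."""
    value = (options or {}).get("y")  # fields other than 'y' are not consulted here
    if value is None:
        # Documented default: 'yaa' if the text contains YO, otherwise 'yung'.
        value = "yaa" if YO in text else "yung"
    if value in (NYO, "yung"):
        return NYO
    if value in (YO, "yaa"):
        return YO
    # Assumption: the documentation does not specify this case.
    raise ValueError("unrecognised value for 'y': %r" % (value,))
```

An explicit value overrides what the text suggests, so letter_for_y(text, {"y": "yung"}) selects NYO even when the text contains ຢ.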

This module transliterates Pali text from the Brahmi, Bengali, Burmese, Devanagari, Khmer, Lanna, Lao, Sinhala and Thai scripts in accordance with the IAST convention, but with <ḷ> rather than <ḻ> for the retroflex lateral. It will also transliterate Sanskrit in so far as the Sanskrit writing system is known.

Method

The Wiktionary transliterations of Burmese, Khmer, Lao and Thai are completely incompatible with IAST, and when standardised, it is likely that the Northern Thai (and Lao, Tai Khuen and Tai Lue) convention will also be incompatible. The Sanskrit Devanagari transliteration is compatible, and Module:sa-translit is therefore used. The Brahmi and Sinhalese transliterations are different from IAST, but for Pali and Sanskrit their outputs are converted to IAST, so Module:Brah-translit and Module:si-translit are used for these scripts.

The core of the transliteration is the conversion of CV? sequences, that is, a C element followed by an optional V, where V is a vowel or a mark of its absence. Unlike in non-prishthamatra Devanagari, a vowel may consist of up to four characters, even in Normalization Form C (NFC). The V category includes dependent vowels, viramas, coengs and cancellation marks. The C category includes base consonants, subscript consonants, and the initial parts of multicharacter independent vowels. Each CV combination is first considered for special translation; this deals with several potentially awkward combinations.
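The CV? idea can be shown with a toy sketch for Thai-script Pali written with implicit vowels. It is only an illustration: the tables cover three consonants and three V marks, the names CONSONANTS, VOWELS and translit_cv are hypothetical, and none of the module's special cases (multicharacter vowels, subscripts, preposed vowels) are handled.

```python
import re

# Toy tables (assumption: a tiny subset chosen for illustration only).
CONSONANTS = {"\u0e01": "k", "\u0e15": "t", "\u0e21": "m"}  # ก ต ม
VOWELS = {
    "\u0e32": "ā",  # SARA AA
    "\u0e34": "i",  # SARA I
    "\u0e3a": "",   # PHINTHU: marks the absence of a vowel
}

# One C followed by an optional V.
CV = re.compile("([%s])([%s]?)" % ("".join(CONSONANTS), "".join(VOWELS)))

def translit_cv(text):
    def repl(match):
        c, v = match.group(1), match.group(2)
        # No V at all means the implicit vowel 'a' is supplied.
        vowel = VOWELS[v] if v else "a"
        return CONSONANTS[c] + vowel
    return CV.sub(repl, text)
```

With these tables, กมฺม becomes "kamma": the first consonant takes the implicit vowel, the second is followed by a phinthu and so takes none, and the third takes the implicit vowel again.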

There are also the complications of subscript consonants (in the Burmese and Lanna scripts), nuktas (for Bengali) and ZWNJ. Subscript consonants are handled by first Romanising the consonants that precede them, so that the implicit vowel will not be inserted after those consonants.

Preposed Thai and Lao vowels are handled by noting that, in abugidas, they may be moved forward past each consonant with a phinthu, and then swapped with one consonant without a phinthu. Octahedron80 advises that in the alphabetic systems, the vowel is written before the last consonant of a cluster.
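The reordering described for the abugida case can be sketched as follows for Thai. This is a simplified illustration, not the module's implementation: the name reorder_preposed is hypothetical, every character in U+0E01 to U+0E2E is treated as a consonant, and Lao and the alphabetic systems are not covered.

```python
PREPOSED = set("\u0e40\u0e41\u0e42\u0e43\u0e44")  # Thai preposed vowels เ แ โ ใ ไ
PHINTHU = "\u0e3a"

def is_thai_consonant(ch):
    # Simplification: treats the whole block U+0E01..U+0E2E as consonants.
    return "\u0e01" <= ch <= "\u0e2e"

def reorder_preposed(text):
    """Move each preposed vowel to after the consonant cluster it belongs to."""
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch in PREPOSED:
            j = i + 1
            # Move forward past each consonant that carries a phinthu.
            while j + 1 < n and is_thai_consonant(text[j]) and text[j + 1] == PHINTHU:
                j += 2
            # Then swap with one consonant without a phinthu.
            if j < n and is_thai_consonant(text[j]):
                out.append(text[i + 1:j + 1])
                out.append(ch)
                i = j + 1
                continue
        out.append(ch)  # not a preposed vowel, or no consonant to swap with
        i += 1
    return "".join(out)
```

For example, เตฺว (vowel, ต, phinthu, ว) is reordered to ตฺวเ, so that the vowel follows the cluster it is pronounced after and can then be converted like any other V.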