
Wiktionary:Khmer romanization

From Wiktionary, the free dictionary

These are the rules concerning transliteration in Khmer entries.

Khmer romanization


The Khmer language is written in the Khmer script, an Indic-based alphasyllabary. There are many ways to romanise the Khmer script; the most common schemes are the United Nations Group of Experts on Geographical Names (UNGEGN) scheme, the Geographic Department scheme (which is based on the UNGEGN scheme), the BGN/PCGN scheme and the ALA-LC scheme. All of these schemes mix transcription and transliteration principles in different proportions, and as a consequence it is difficult to generate these romanisations accurately by algorithm. Monolingual Khmer dictionaries, such as the renowned Chuon Nath Dictionary, traditionally use ‘respellings’ to indicate irregular pronunciations, in a fashion similar to Thai dictionaries, though their use of respellings is less consistent. The sections below introduce the intricacies of the Khmer script and its romanisations.

Consonants

Consonants Subscript form Class IPA (letter) IPA (before vowel) IPA (first in cluster) IPA (final) UNGEGN (letter) Wiktionary transliteration Wiktionary transcription
្ក 1 /kɑː/ /k/ /k/ /k/ k ka

្ខ 1 /kʰɑː/ /kʰ/ /k/ /k/ khâ kh kh
្គ 2 /kɔː/ /k/ /k/ /k/ g k
្ឃ 2 /kʰɔː/ /kʰ/ /k/ /k/ khô gh kh
្ង 2 /ŋɔː/ /ŋ/ /ŋ/ ngô ng ng
្ច 1 /cɑː/ /c/ /c/ /c/ châ c c
្ឆ 1 /cʰɑː/ /cʰ/ /c/ chhâ ch ch
្ជ 2 /cɔː/ /c/ /c/ /c/ chô j c
្ឈ 2 /cʰɔː/ /cʰ/ /c/ chhô jh ch
្ញ 2 /ɲɔː/ /ɲ/ /ɲ/ nhô ñ ñ
្ដ 1 /ɗɑː/ /ɗ/ /ɗ/ /t/ d
្ឋ 1 /tʰɑː/ /tʰ/ /t/ /t/ thâ ṭh th
្ឌ 2 /ɗɔː/ /ɗ/ /t/ t
្ឍ 2 /tʰɔː/ /tʰ/ /t/ thô ḍh th
្ណ 1 /nɑː/ /n/ /n/ /n/ n
្ត 1 /tɑː/ /t/ /t/ /t/ t t
្ថ 1 /tʰɑː/ /tʰ/ /t/ /t/ thâ th th
្ទ 2 /tɔː/ /t/ /t/ /t/ d t
្ធ 2 /tʰɔː/ /tʰ/ /t/ /t/ thô dh th
្ន 2 /nɔː/ /n/ /n/ n n
្ប 1 /ɓɑː/ /ɓ/ /p/ /p/ p b
្ផ 1 /pʰɑː/ /pʰ/ /p/ /p/ phâ ph ph
្ព 2 /pɔː/ /p/ /p/ /p/ b p
្ភ 2 /pʰɔː/ /pʰ/ /p/ /p/ phô bh ph
្ម 2 /mɔː/ /m/ /m/ /m/ m m
្យ 2 /jɔː/ /j/ /j/ y y
្រ 2 /rɔː/ /r/ /Ø/ r r
្ល 2 /lɔː/ /l/ /l/ /l/ l l
្វ 2 /ʋɔː/ /ʋ/ /w/ v v
្ឝ 1 shâ ś s
្ឞ 2 ssô s
្ស 1 /sɑː/ /s/ /s/ /h/ s s
្ហ 1 /hɑː/ /h/ /Ø/ h h
្ឡ 1 /lɑː/ /l/ l
្អ 1 /ʔɑː/ /ʔ/ /ʔ/ ʾ ʾ
Digraph consonants Subscript form Class IPA (letter) IPA (before vowel) IPA (first in cluster) IPA (final) UNGEGN (letter) Wiktionary transliteration Wiktionary transcription
ហ្គ 1 /ɡɑː/ /ɡ/ /ɡ/ /k/ h˳g g
ហ្គ៊ 2 /ɡɔː/ /ɡ/ /ɡ/ /k/ h˳g′ g
ហ្ន 1 /nɑː/ /n/ h˳n n
ប៉ 1 /pɑː/ /p/ /p/ /p/ p″ p
ប៊ 2 /ɓɔː/ /ɓ/ p′ b
ហ្ម 1 /mɑː/ /m/ h˳m m
ហ្ល 1 /lɑː/ /l/ h˳l l
ហ្វ 1 /fɑː/, /ʋɑː/ /f/, /ʋ/ /f/ /f/ fâ, vâ h˳v f, v
ហ្វ៊ 2 /fɔː/, /ʋɔː/ /f/, /ʋ/ /f/ /f/ fô, vô h˳v′ f, v
ហ្ស 1 /ʒɑː/, /zɑː/ /ʒ/, /z/ žâ, zâ h˳s ž, z
ហ្ស៊ 2 /ʒɔː/, /zɔː/ /ʒ/, /z/ žô, zô h˳s′ ž, z
Used in phonetic respellings
ញ៉ 1 /ɲɑː/ /ɲ/ nhâ ñ″ ñ
ម៉ 1 /mɑː/ /m/ m″ m
យ៉ ្យ៉ 1 /jɑː/ /j/ y″ y
រ៉ ្រ៉ 1 /rɑː/ /r/ r″ r
ល៉ ្ល៉ 1 /lɑː/ /l/ l″ l
វ៉ ្វ៉ 1 /ʋɑː/ /ʋ/ v″ v
ស៊ ្ស៊ 2 /sɔː/ /s/ /s/ /h/ s s
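
The difference between the two Wiktionary columns can be illustrated with a small lookup. The sketch below is not an official Wiktionary module: it copies a handful of rows from the consonant table into a hypothetical `CONSONANTS` dictionary (the names `Consonant`, `CONSONANTS` and `describe` are assumptions made for illustration). It shows that the transliteration keeps the Indic voicing distinctions, while the transcription follows the modern pronunciation.

```python
from typing import NamedTuple

COENG = "\u17D2"  # KHMER SIGN COENG, which introduces a subscript consonant


class Consonant(NamedTuple):
    series: int            # 1 = a-series, 2 = o-series (the "Class" column)
    transliteration: str   # Wiktionary transliteration column
    transcription: str     # Wiktionary transcription column


# A few rows copied from the consonant table; this is not the full inventory.
CONSONANTS = {
    "ខ": Consonant(1, "kh", "kh"),
    "គ": Consonant(2, "g", "k"),
    "ឃ": Consonant(2, "gh", "kh"),
    "ច": Consonant(1, "c", "c"),
    "ជ": Consonant(2, "j", "c"),
    "ត": Consonant(1, "t", "t"),
    "ទ": Consonant(2, "d", "t"),
    "ព": Consonant(2, "b", "p"),
}


def describe(letter: str) -> Consonant:
    """Look up a consonant given either its base form or its subscript form."""
    # A subscript form is the coeng sign followed by the base consonant.
    base = letter[len(COENG):] if letter.startswith(COENG) else letter
    return CONSONANTS[base]
```

For instance, `describe("ទ")` reports series 2 with transliteration `d` and transcription `t`, and the subscript form gives the same result as the base letter.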

‘Syllabic configurations’

  • a-series = 1st class; o-series = 2nd class.
  • Note that the combination of diacritics may not be displayed as desired; please consult the column of examples.
Diacritics Examples IPA UN romanization Wiktionary transliteration Wiktionary transcription
a-series o-series a-series o-series a-series o-series a-series o-series
(none) /ɑː/
/ɑ/ (when unstressed in some words)
/ɔː/
/ɔ/ (when unstressed in some words)
â ô a ɑɑ, ɑ ɔɔ, ɔ
កត់ ទប់
យល់
/ɑ/ /u/ (before labial finals)
/ŭə/ (elsewhere)
/ɔ/ (elsewhere, in codaless nonfinal syllables)
á ó á ɑ u, ŭə, ɔ
ស័ក ល័ខ
ទ័ព
/a/ /ĕə/ (before velar finals)
/ŏə/ (elsewhere)
ă
ă a ĕə, ŏə
័យ សម័យ ជ័យ /aj/ /ɨj/ ăy ay ɨy
័រ ជ័រ /ɔə/ ăr ɔə
តា ជា /aː/ /iə/ a éa ā aa
ា់ កាត់ ទាក់
គាត់
/a/ /ĕə/ (before velar finals)
/ŏə/ (elsewhere)
ă
ā́ a ĕə, ŏə
មតិ
កិរិយា
លទ្ធិ
និទាន
/eʔ/ (in stressed syllables)
/e/ (elsewhere)
/iʔ/ (in stressed syllables)
/i/ (elsewhere)
ĕ ĭ i eʾ, e iʾ, i

(with non-glottal coda)
ចិត្ត ជិត /ə/ /ɨ/ i ə ɨ
ិយ ចេតិយ ឥន្ទ្រិយ /əj/ /iː/ iy əy ii
ិះ តិះដៀល ជិះ /eh/ /ih/ iḥ eh ih
បី ពីរ /əj/ /iː/ ei i ī əy ii
ដឹក ទឹក /ə/ /ɨ/ œ̆ œ̆ ə ɨ
ឹះ ឆ្កឹះ គន្លឹះ /əh/ /ɨh/ ẏḥ əh ɨh
ដឺ គឺ /əɨ/ /ɨː/ œ œ ȳ əɨ ɨɨ
វត្ថុ
កុមារ
វិទ្យុ
គុលិកា
/oʔ/ (in stressed syllables)
/o/ (elsewhere)
/uʔ/ (in stressed syllables)
/u/ (elsewhere)
ŏ ŭ u oʾ, o uʾ, u

(with non-glottal coda)
កុន គុណ /o/ /u/ ŏ ŭ u o u
ុះ ចុះ ពុះ /oh/ /uh/ ŏh ŭh uḥ oh uh
កូរ គូ /ou/ /uː/ o u ū ou uu
ូវ ត្រូវ នូវ /əw/ /ɨw/ ūv əw ɨw
កួរ គួរ /uə/ /uə/ ua
បើ ឈើ /aə/ /əː/ aeu eu oe əə
ើះ ចង្កើះ /əh/ oeḥ əh
តឿ ជឿ /ɨə/ /ɨə/ œă œă ẏa ɨə ɨə
តៀប ទៀប /iə/ /iə/ īa
កិរ្តិ៍ គេ /eː/ /ei/ é é e ee ei
េច
(before palatals)
ម៉េច
ចេញ
ភ្លេច
ពេញ
/ə/ (before palatals) /ɨ/ (before palatals) e ə ɨ
េះ សេះ នេះ /eh/ /ih/ éh éh eḥ eh ih
កែ គែ /ae/ /ɛː/ ê ê ae ae ɛɛ
ែះ កែះ /eh/ aeḥ eh
ប្រៃ ព្រៃ /aj/ /ɨj/ ai ey ai ay ɨy
កោរ គោ /ao/ /oː/ o ao oo
ោះ កោះ គោះ /ɑh/ /ŭəh/ aôh ŏăh oḥ ɑh ŭəh
តៅ ទៅ /aw/ /ɨw/ au ŏu au aw ɨw
ុំ ដុំ ទុំ /om/ /um/ om ŭm uṃ om um
ចំ ទំ /ɑm/ /um/ âm um aṃ ɑm um
ាំ ចាំ ជាំ /am/ /ŏəm/ ăm ŏăm āṃ am ŏəm
ាំង តាំង ទាំង /aŋ/ /ĕəŋ/ ăng eăng āṃng ang ĕəng
តះ ទះ /ah/ /ĕəh/ ăh eăh aḥ ah ĕəh
វណ្ណៈ ជីវៈ /aʔ/ /ĕəʔ/ ă à ĕəʾ
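
How the consonant series selects the vowel value can be sketched in a few lines. The example below is an illustration only, assuming a bare consonant-plus-vowel syllable with no coda, cluster or diacritic; the names `CONSONANTS`, `VOWELS` and `transcribe_cv` are hypothetical, and the data are limited to four consonants and four dependent vowels taken from the tables on this page.

```python
# Consonant -> (series, Wiktionary transcription), from the consonant table.
# Series 1 = a-series, series 2 = o-series.
CONSONANTS = {
    "ប": (1, "b"),
    "គ": (2, "k"),
    "ត": (1, "t"),
    "ទ": (2, "t"),
}

# Dependent vowel -> (a-series, o-series) Wiktionary transcription.
VOWELS = {
    "\u17B8": ("əy", "ii"),  # KHMER VOWEL SIGN II, as in បី
    "\u17BC": ("ou", "uu"),  # KHMER VOWEL SIGN UU, as in គូ
    "\u17C2": ("ae", "ɛɛ"),  # KHMER VOWEL SIGN AE, as in គែ
    "\u17C5": ("aw", "ɨw"),  # KHMER VOWEL SIGN AU, as in តៅ and ទៅ
}


def transcribe_cv(syllable: str) -> str:
    """Transcribe a two-character consonant + dependent-vowel syllable."""
    if len(syllable) != 2:
        raise ValueError("this sketch only handles consonant + vowel")
    consonant, vowel = syllable
    series, onset = CONSONANTS[consonant]
    # Pick the a-series or o-series value according to the consonant's class.
    return onset + VOWELS[vowel][series - 1]
```

With these tables, `transcribe_cv("តៅ")` gives `taw` while `transcribe_cv("ទៅ")` gives `tɨw`: the vowel sign is identical and only the series of the consonant differs. Real entries need far more than this, since codas, stress and irregular readings all affect the result.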

Independent vowels

  • Note that words spelt with independent vowels should always have respellings in entries, for example ឩកា (ʼuukaa) should be respelt as អ៊ូកា.
  • Also note that the independent vowel ឣ (ʼâ) is different from the consonant sign អ (ʼɑɑ). On Wiktionary, only the latter should be used in entries.
Independent vowels UN romanization IPA
â /ʔɑʔ/
អា a /ʔa/
ĕ /ʔe/
ei /ʔəj/
ŏ /ʔ/
ŭ /ʔu/
ŏu /ʔɨw/
rœ̆ /ʔrɨ/
/ʔrɨː/
lœ̆ /ʔlɨ/
/ʔlɨː/
é /ʔeː/
ai /ʔaj/
, aô, aôy /ʔaːo/
âu /ʔaw/

Diacritics

Diacritics Name Notes
(◌ំ) nɨkkĕəʾhət (និគ្គហិត) niggahita; nasalizes the inherent vowels and some of the dependent vowels (see anusvara); sometimes used to represent [aɲ] in Sanskrit loanwords
(◌ះ) rĕəh muk (រះមុខ) "shining face"; adds final aspiration to dependent or inherent vowels; usually omitted; corresponds to the visarga diacritic; it may be included as a dependent vowel symbol
(◌ៈ) yukuəl pintuʾ, yukĕəʾlĕəʾ pintuʾ (យុគលពិន្ទុ) yugala bindu ("pair of dots"); adds a final glottal stop to dependent or inherent vowels; usually omitted
(◌៉) muusekaʾtŏən (មូសិកទន្ត) mūsikadanta ("mouse teeth"); used to convert some o-series consonants to the a-series
(◌៊) trəysap (ត្រីសព្ទ) trīsabda; used to convert some a-series consonants to the o-series
(◌ុ) kbiəh kraom (ក្បៀសក្រោម) also known as bok cəəng (បុកជើង); used in place of the diacritics trəysap and muusekaʾtŏən when they would interfere with superscript vowels
(◌់) bɑntɑk (បន្តក់) used to shorten some vowels
(◌៌) rɔbaat (របាទ), reiphaʾ (រេផៈ) rapāda, repha; behaves similarly to the tŏəndĕəʾkhiət; corresponds to the Devanagari diacritic repha, although it has lost its original function, which was to represent a vocalic "r"
(◌៍) tŏəndĕəʾkhiət (ទណ្ឌឃាដ) daṇḍaghāta; used to render some letters as unpronounced
(◌៎) kaak baat, kaakaʾ baat (កាកបាទ) kākapāda ("crow's foot"); more a punctuation mark than a diacritic; used in writing to indicate the rising intonation of an exclamation or interjection; often placed on grammatical particles such as /na/, /nɑː/, /nɛː/, /vəːj/, and the feminine response /cah/
(◌៏) ʾahstaa (អស្តា) denotes stressed intonation in some single-consonant words[1]
(◌័) sangyook saññaa (សំយោគសញ្ញា) represents a short inherent vowel in Sanskrit and Pali words; usually omitted
(◌៑) viriəm (វិរាម) a mostly obsolete diacritic; corresponds to the virāma
(◌្) cəəng (ជើង) also written coeng; a sign introduced by Unicode to input subscript consonants; its appearance varies among fonts
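
The effect of the two series-shifting diacritics can be sketched as follows. This is a minimal illustration, not the logic of any Wiktionary module: `BASE_SERIES` and `effective_series` are hypothetical names, the base classes are copied from the consonant table, and the sketch ignores the cases where kbiəh kraom stands in for either diacritic.

```python
MUUSIKATOAN = "\u17C9"  # KHMER SIGN MUUSIKATOAN: shifts to the a-series
TRIISAP = "\u17CA"      # KHMER SIGN TRIISAP: shifts to the o-series

# Inherent class of a few consonants (1 = a-series, 2 = o-series).
BASE_SERIES = {
    "ញ": 2, "ម": 2, "យ": 2, "រ": 2, "ល": 2, "វ": 2,
    "ប": 1, "ស": 1,
}


def effective_series(cluster: str) -> int:
    """Return the series of a consonant, taking a series shifter into account."""
    base, marks = cluster[0], cluster[1:]
    if MUUSIKATOAN in marks:
        return 1
    if TRIISAP in marks:
        return 2
    # No shifter present: fall back to the consonant's inherent class.
    return BASE_SERIES[base]
```

This agrees with the rows listed under the digraph consonants: ម alone is class 2 but ម៉ is class 1, and ស alone is class 1 but ស៊ is class 2.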

References