Module:km-etym
Todo: move most of this to a "Khmer derivation" page, and keep just important information for module usage.
Affixes
Overview
This part relies broadly on Jenner (1969) and Jenner & Pou (1980), but Jenner's theoretical position is untenable and is not followed here. He assumes that all basic roots are CVC and that all CCVC words are secondary developments, formed by prefixation or infixation of a base CVC root. He further posits that all CVN- prefixes (ban-, kan-, lum-, ...) derive semantically and morphologically first from the prefixation of C- onto the root, then from the infixation of a syllabic nasal -aN-. There are, however, many cases where a CVN+Root word shares a meaning with Root and none with C+Root (when the latter exists at all), which is in fact a separate, unrelated root of its own.
Notation
Following a widespread convention, the vowel of the syllabic affixes is represented by the metaphoneme "V", and the nasal coda by "N".
Jenner also uses the metaphoneme "L", corresponding to /r/ or /l/ depending on the phonological context, but the alleged allophonic distribution between the two phonemes is not convincing (for example, both /r(ɔ)-/ and /l-/ can appear in front of /b-/). Consequently, Jenner is not followed on this point.
Note that the initial of some prefixes could also have been notated with a metaphoneme (for example /bɑn-/ ~ /pʊən/, theoretically BVN-). Since this alternation is seen as a direct consequence of the choice of vowel series, that notation is generally not adopted.
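As an illustration of the notation, the sketch below infers the kind of affix from where the hyphens sit in its label ("kVN-" for a prefix, "-VN-" for an infix, "R-" for reduplication). The helper name `affix_kind` is hypothetical and is not part of the module, which uses an explicit lookup table (`affixes`) rather than pattern matching.

```lua
-- Hypothetical helper, not part of Module:km-etym.
-- Infers the affix kind from the hyphenation of its notation.
local function affix_kind(notation)
    if notation == "R-" then
        return "reduplication"
    elseif notation:match("^%-.+%-$") then
        -- hyphen on both sides: the affix goes inside the root
        return "infix"
    elseif notation:match("^[^%-].*%-$") then
        -- trailing hyphen only: the affix goes in front of the root
        return "prefix"
    end
    return nil -- unrecognized shape
end

print(affix_kind("kVN-"))
print(affix_kind("-VN-"))
print(affix_kind("R-"))
```

An explicit table, as used in the module, has the advantage of rejecting labels that have the right shape but are not attested affixes.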
Missing root
It often happens that the root of a derived word does not occur in the modern language, having been lost over time. In this situation it is difficult to determine the meaning of the root and the function of the affix, if it ever had one (for instance, in kaŋhaəp "frog" = kVN "?" + haəp "?"). Sometimes, though, more than one derived form has survived, which makes it possible to reconstruct a hypothetical root (for instance, "srɑlaɲ", "sɑmlaɲ" -> *laañ "to love or be loved"). A star should precede such roots.
Functions
Overview
This relies very broadly on the nomenclature of Jenner (1969), although it is often difficult to agree with how he categorizes many derived words and with how specific some of his functional categories are. For these reasons, this module sticks to the following guidelines:
- It keeps only a few broad functional categories. For example, there is no need to differentiate factitive from causative, or habitual from frequentative: Khmer affixes are not clear-cut enough in function to call for that much detail.
- Since Khmer words can very easily slide from noun to adjective to verb to adverb without any overt morphological change, functions should be named based on how an affix affects the *meaning* of the base root, not on how it changes its part of speech (so no function should be called "nominalization", "verbalization", etc.).
- Functions themselves can easily slide from one meaning to the next. For example, a processive can easily take on a resultative meaning, but the opposite is rarely if ever true; such a word should be categorized as processive. Also, the fact that a verb is telic (has an inherent endpoint) does not make its nominalization resultative: it is still processive. Jenner often makes this mistake. For example, "information" and "announcement" are processive, not resultative.
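The guidelines above translate into a small, closed set of function codes. The following is a minimal sketch of how such a code can be validated and rendered, using a subset of the codes defined in the module's `functions` table; the helper name `function_text` is hypothetical.

```lua
-- Subset of the codes and labels found in the module's `functions` table.
local functions = {
    caus = "causative",
    proc = "processive",
    res  = "resultative",
}

-- Hypothetical helper: returns " (label)" for a known code,
-- "" when no code is given, and raises an error for an unknown code.
local function function_text(code)
    if code == nil then
        return ""
    end
    local label = functions[code]
    if label == nil then
        error("Function could not be recognized")
    end
    return " (" .. label .. ")"
end

print(function_text("proc"))
print(pcall(function_text, "nominalization"))
```

Rejecting unknown codes outright is what keeps part-of-speech labels such as "nominalization" from creeping in.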
Reduplications
Overview
This follows the classification of Ehrman (1972). There are four types of reduplication:
- rhyming reduplications (2nd and 3rd segments are kept): CVC ?VC
- alliterative reduplications (1st segment is kept): CVC C??
- ablauting reduplications (1st and 3rd segments are kept): CVC C?C
- total reduplications (all segments are kept).
Cases where 1st and 2nd segments are kept are rare and will be classified as alliterative.
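The classification can be sketched as a comparison of the three segments of the base syllable with those of the echo syllable. This is only an illustration: the function name `classify_redup` and the representation of a syllable as a three-element table are assumptions, and the segments used below are invented placeholders, not real Khmer words. The module itself does not compute the type; editors supply it as a parameter.

```lua
-- Hypothetical helper, not part of Module:km-etym.
-- Each syllable is given as { initial, vowel, final }.
local function classify_redup(base, echo)
    local same1 = base[1] == echo[1] -- 1st segment kept
    local same2 = base[2] == echo[2] -- 2nd segment kept
    local same3 = base[3] == echo[3] -- 3rd segment kept
    if same1 and same2 and same3 then
        return "total"
    elseif same2 and same3 then
        return "rhyming"
    elseif same1 and same3 then
        return "ablauting"
    elseif same1 then
        -- also covers the rare case where the 1st and 2nd segments are kept
        return "alliterative"
    end
    return nil -- not a reduplication under this scheme
end

-- Invented placeholder segments, for illustration only.
print(classify_redup({"k", "a", "t"}, {"m", "a", "t"}))
print(classify_redup({"k", "a", "t"}, {"k", "o", "t"}))
```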
Non-dummy echo syllable
The reduplicated syllable (the "echo") can occupy the first or the second position, and is usually a dummy syllable void of meaning and not found as a standalone word elsewhere in the language. But sometimes the echo syllable does exist as a standalone word, in which case it can be one of two things:
- the identification with the standalone word is fortuitous and the echo syllable does not carry over its meaning (it is still a dummy syllable).
- the echo syllable is the standalone word and forms with the other syllable an "echo-tautological" compound (Renner 2005), called here a reduplicative synonymous compound, where the two words are close both semantically and phonetically (like English tiny-weeny, moan and groan).
In the first case, the template should be linked to the coincidental word using parameters l2 and t2. In the second case, the template should come after a synonymous compound template {{compound|km|type=syn}}.
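The two cases can be sketched as follows. This is a simplified, standalone approximation of what the module's `redup` function outputs: `link` is a stand-in for the {{m|km|...}} template expansion, `redup_note` is a hypothetical name, capitalization and categories are left out, and "BASE" and "ECHO" are placeholders rather than real terms.

```lua
-- Stand-in for the link template; the real module calls frame:expandTemplate.
local function link(term, gloss)
    if gloss ~= nil then
        return term .. ' ("' .. gloss .. '")'
    end
    return term
end

-- Hypothetical helper approximating the text assembled by the module.
local function redup_note(mode_text, lemma, l2, t2)
    if lemma == nil or lemma == "" then
        if l2 ~= nil then
            error("Argument l2 can't be used if argument 1 is not also provided")
        end
        -- Second case: qualifies a preceding synonymous compound template.
        return mode_text
    end
    local text = mode_text .. " of " .. link(lemma) .. "."
    if l2 ~= nil then
        -- First case: the echo syllable only happens to match a real word.
        text = text .. "<br>The identification of the echo syllable with "
            .. link(l2, t2)
            .. " is coincidental, and no meaning is carried over from it."
    end
    return text
end

print(redup_note("rhyming reduplication", "BASE"))
print(redup_note("rhyming reduplication", "BASE", "ECHO", "gloss"))
print(redup_note("rhyming reduplication", nil))
```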
--[[
Read the documentation before any change.
]]
local export = {}
local gsub = mw.ustring.gsub
local sub = mw.ustring.sub
local match = mw.ustring.match
local namespace = mw.title.getCurrentTitle().nsText
-- 0 = redup, 1 = prefix, 2 = infix
local affixes = { ["R-"] = 0,
["p-"] = 1, ["t-"] = 1, ["c-"] = 1, ["k-"] = 1,
["s-"] = 1, ["r-"] = 1, ["l-"] = 1, ["m-"] = 1,
--["RrV-"] = 1,
["prV-"] = 1, ["trV-"] = 1, ["crV-"] = 1,
["krV-"] = 1, ["srV-"] = 1,
--["RVN-"] = 1,
["bVN-"] = 1, ["dVN-"] = 1, ["cVN-"] = 1,
["kVN-"] = 1, ["sVN-"] = 1, ["rVN-"] = 1,
["lVN-"] = 1, ["aN-"] = 1,
["-b-"] = 2, ["-m-"] = 2, ["-n-"] = 2,
["-r-"] = 2, ["-l-"] = 2, ["-h-"] = 2,
-- The following two are allomorphs, should be merged?
["-VN-"] = 2, ["-Vmn-"] = 2,
-- The following two are allomorphs of -n- and -m-, keeping them separate for now
["-rVn-"] = 2, ["-rVm-"] = 2}
-- Not used
local affixes_km = { ["R-"] = 0,
["ប-"] = 1, ["ត-"] = 1, ["ច-"] = 1, ["ក-"] = 1,
["ស-"] = 1, ["រ-"] = 1, ["ល-"] = 1, ["ម-"] = 1,
["ប្រ-"] = 1, ["ត្រ-"] = 1, ["ច្រ-"] = 1,
["ក្រ-"] = 1, ["ស្រ-"] = 1, ["បំ-"] = 1,
["ដំ-"] = 1, ["ចំ-"] = 1, ["កំ-"] = 1,
["សំ-"] = 1, ["រំ-"] = 1, ["លំ-"] = 1,
["-ប-"] = 2, ["-ម-"] = 2, ["-ន-"] = 2,
["-រ-"] = 2, ["-ល-"] = 2, ["-◌ំ-"] = 2,
["-◌ំន-"] = 2, ["-រន-"] = 2}
--
--[[ Important:
# Since Khmer words can very easily slide from noun to adj to verb to adverb without any overt morphological change, functions should be named based on how affixes affects the *meaning* of the base root, not on how it makes it change parts of speech.
# As a consequence, do NOT create functions called "nominalization", "verbalizations", etc...
]]
local functions ={
-- Valency modifiers, verbalizations
["caus"] = "causative", -- factitive, transitivizing
["recip"] = "reciprocal",
["attrib"]= "attributive", -- qualitative, adjectival
-- Semantic shifts
["spec"] = "specializing", -- similative, figurative, directional,
-- sometimes no particular meaning added
-- more of a catch-all category
["intens"]= "intensive", -- emphatic, directional, augmentative
["rep"] = "repetitive", -- frequentative, habitual, distributive
-- Nominalizations
["agent"] = "agentive",
["instr"] = "instrumental",
["loc"] = "locative",
["proc"] = "processive", -- action noun, gerund, sometimes resultative too
["res"] = "resultative",
-- Numbers
["sing"] = "singularizing",
["coll"] = "collective"
}
-- redup
local redup_modes = {
["rhy"] = "rhyming", ["rhyme"] = "rhyming",
["allit"] = "alliterative",
["ablaut"] = "ablauting"
--[nil]="", --used for total reduplication
}
-- Returns the display text for a reduplication mode code.
-- A missing or empty mode stands for total reduplication.
local function get_redup_mode(mode)
if mode == nil or mode == "" then
return "reduplication"
end
local mode_text = redup_modes[mode]
if mode_text == nil then
error("Reduplication mode could not be recognized")
end
return mode_text .. " reduplication"
end
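-- Returns a note when the base is written with a leading hyphen,
-- i.e. when it is only attested in derived forms.
local function get_base_only_text(root)
local text = ""
if root ~= nil and root ~= "" and root ~= "-" and sub(root, 1, 1) == "-" then
text = " This base is only found in derived forms."
end
return text
end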
function export.affix(frame)
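local args = frame:getParent().args
-- Declared local so that values do not leak into the global environment.
local root = args[1]
local gloss = args[2]
local affix = args[3] or args["af"]
local func = args[4]
local ultim = args["ultim"]
local compound = args["compound"]
local nocap = args["nocap"]
local nopunct = args["nopunct"]
local tr = args["tr"]
local affix2 = args["af2"]
local root_text = ""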
if root == "-" then
root_text = "form"
elseif root ~= nil and root ~= "" then
root_text = "from " .. frame:expandTemplate{ title = "m", args = {"km", root, nil, gloss = gloss, sc = "Khmr", tr = tr}}
end
affix_text = ""
affix_cat_text = ""
affix_type = affixes[affix] or error("Affix could not be recognized")
if affix_type == 0 then
affix_text = "reduplicated initial"
affix_cat_text = affix_text
else
if affix_type == 1 then
affix_text = "prefixed"
elseif affix_type == 2 then
affix_text = "infixed"
end
affix_cat_text = affix_text .. " ".. affix
affix_text = affix_text.. " ''[[".. affix.. "]]''"
end
-- make a function...
affix2_text = ""
affix2_cat_text = ""
if affix2 ~= nil then
affix2_type = affixes[affix2] or error("Affix2 could not be recognized")
if affix2_type == 0 then
affix2_text = "reduplicated initial"
affix2_cat_text = affix2_text
else
if affix2_type == 1 then
affix2_text = "prefixed"
elseif affix2_type == 2 then
affix2_text = "infixed"
end
affix2_cat_text = affix2_text .. " ".. affix2
affix2_text = affix2_text.. " ''[[".. affix2.. "]]''"
end
affix2_text = " and ".. affix2_text
end
func_string = ""
func_text = ""
if func ~= nil then
func_string = functions[func] or error("Function could not be recognized")
end
if func_string ~= "" then
func_text = " (" .. func_string .. ")"
end
compound_text = ""
redup_text =""
if compound ~= nil then
compound_text = ", applied to each term of the"
if compound=="1" then
compound_text = compound_text.. " compound"
else
redup_text= get_redup_mode(compound)
compound_text = compound_text.. " ".. redup_text
end
end
ultim_text = ""
if ultim ~= nil then
ultim_text = ", ultimately from ".. frame:expandTemplate{ title = "m", args = {"km", ultim, nil, gloss = nil, sc = "Khmr"}}
end
if root_text ~= "" then
text = root_text
text = text .. " with "
text = text .. affix_text.. func_text
text = text .. affix2_text
text = text .. compound_text
text = text .. ultim_text
if nopunct == nil then
text = text .. "."
end
text = text.. get_base_only_text(root)
if nocap == nil then
text = mw.getContentLanguage():ucfirst(text)
end
else
text="''[[".. affix.. "]]''"
end
-- Categories
if namespace == "" then
-- Will activate affix categories later
-- text = text.."[[Category:Khmer terms with ".. affix_cat_text.."]]"
if affix2 ~= nil then
-- text = text.."[[Category:Khmer terms with ".. affix2_cat_text.."]]"
end
-- will activate function categories later
if func_string ~= "" then
-- text = text.."[[Category:Khmer terms with ".. func_string.." affix]]"
end
if compound ~= nil then
if compound == "1" then
text = text.."[[Category:Khmer compound terms]]"
elseif redup_text ~= nil then
text = text.. "[[Category:Khmer " .. redup_text.. "s]]"
end
--text = text.."[[Category:Khmer compounds with duplicated affix]]"
end
end
return text
end
function export.redup(frame)
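local args = frame:getParent().args
-- Declared local so that values do not leak into the global environment.
local lemma = args[1]
local gloss = args[2]
local mode = args[3] or args["type"]
local lemma2 = args["l2"]
local gloss2 = args["t2"]
local tr = args["tr"]
local nocap = args["nocap"]
local mode_text = get_redup_mode(mode)
-- Note: article is computed here but not used in the output below.
local article = "a"
if sub(mode_text, 1, 1) == "a" then
article = "an"
end
local text = ""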
-- when lemma isn't provided, it is assumed it comes after a {{compound|km|type=syn}} template, that this module further qualifies.
if lemma == nil or lemma == "" then
text = mode_text
nocap = 1
else
text = mode_text.. " of ".. frame:expandTemplate{ title = "m", args = {"km", lemma, nil, gloss = gloss, sc = "Khmr", tr = tr}} .. "."
end
--
text = text.. get_base_only_text(lemma)
-- case of coincidental echo word
if lemma2 ~= nil then
if lemma == nil or lemma == "" then
error("Argument l2 can't be used if argument 1 is not also provided")
end
text = text.."<br>The identification of the echo syllable with "..frame:expandTemplate{ title = "m", args = {"km", lemma2, nil, gloss = gloss2, sc = "Khmr"}}.. " is coincidental, and no meaning is carried over from it."
end
if nocap == nil then
text = mw.getContentLanguage():ucfirst(text)
end
-- Categories
if namespace == "" then
text = text.."[[Category:Khmer " .. mode_text.. "s]]"
if lemma == nil or lemma == "" then
-- Note: not always found in synonymous compounds, eg. ធីងធៅង from onomatopoeic source, or ពីងពាង from mkh-pro
--text = text.."[[Category:Khmer reduplicative synonymous compounds]]"
end
end
return text
end
return export