Module:km-etym

From Wiktionary, the free dictionary

Todo: move most of this to a "Khmer derivation" page, and keep just important information for module usage.

Affixes

Overview

This part relies broadly on Jenner (1969) and Jenner & Pou (1980). However, Jenner's theoretical position is untenable and is not followed here. He assumes that all basic roots are CVC and that all CCVC words are secondary developments, formed by prefixation or infixation of a base CVC root. Moreover, he posits that all CVN- prefixes (ban-, kan-, lum-, ...) derive, both semantically and morphologically, first from the prefixation of C- onto the root and then from the infixation of a syllabic nasal -aN-. There are many cases, however, where a CVN+Root word shares a meaning with Root and none with C+Root (when the latter exists at all), C+Root being in fact a separate, unrelated root of its own.

Notation

Following a widespread convention, the vowel of the syllabic affixes is represented by the metaphoneme "V", and the nasal coda by "N".

Jenner also uses the metaphoneme "L", corresponding to /r/ or /l/ depending on the phonological context, but the alleged allophonic distribution of the two phonemes is not convincing (for example, both /r(ɔ)-/ and /l-/ can appear before /b-/). Consequently, Jenner is not followed on this point.

Note that the initial of some prefixes should also have been written with a metaphoneme (for example /bɑn-/ ~ /pʊən/, theoretically BVN-), but since this alternation is seen as a direct consequence of the choice of vowel series, this notation is generally not adopted.
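
The affix notations listed in the module (for example kVN-, -VN-, R-) encode the kind of affix in the position of their hyphens. The following is only a minimal sketch of that convention: `affix_kind` is a hypothetical helper that is not part of the module, which uses an explicit lookup table instead.

```lua
-- Hypothetical helper, not part of Module:km-etym: infer the kind of affix
-- from the position of the hyphens in its notation.
local function affix_kind(notation)
    if notation == "R-" then
        return "reduplication"              -- reduplicated initial
    elseif notation:match("^%-.+%-$") then
        return "infix"                      -- hyphen on both sides, e.g. -VN-
    elseif notation:match("^[^%-].*%-$") then
        return "prefix"                     -- trailing hyphen only, e.g. kVN-
    end
    error("Unrecognized affix notation: " .. tostring(notation))
end

print(affix_kind("kVN-"))  -- prefix
print(affix_kind("-VN-"))  -- infix
print(affix_kind("R-"))    -- reduplication
```

An explicit table, as used by the module, has the advantage of rejecting notations that are well-formed but not attested.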

Missing root

It often happens that the root of a derived word does not occur in the modern language and was lost to time. In this situation it is difficult to determine the meaning of the root and the function of the affix, if it ever had one (for instance, in kaŋhaəp "frog" = kVN "?" + haəp "?"). Sometimes, though, more than one derived form has survived, which allows a hypothetical root to be reconstructed (for instance, "srɑlaɲ", "sɑmlaɲ" -> *laañ "to love or be loved"). A star should precede such roots.

Functions

Overview

This relies very broadly on the nomenclature of Jenner (1969), although it is often difficult to agree with how he categorized many derived words, and with how specific some of his functional categories are. For these reasons, this module sticks to the following guidelines:

- it keeps only a few broad functional categories (for example, there is no need to differentiate factitive from causative, or habitual from frequentative: Khmer affixes are not clear-cut enough function-wise to call for that much detail).

- since Khmer words can very easily slide from noun to adjective to verb to adverb without any overt morphological change, functions should be named after how the affix changes the *meaning* of the base root, not after how it changes its part of speech (so no function should be called "nominalization", "verbalization", etc.).

- even functions can easily slide from one meaning to the next. For example, a processive can easily take on a resultative meaning, but the opposite is rarely true, if at all; such a word should be categorized as processive. Also, the fact that a verb is telic (= has an inherent end point) does not make its nominalization resultative: it is still processive. Jenner often makes this mistake. For example, "information" and "announcement" are processive, not resultative.
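
The first guideline can be pictured as folding fine-grained labels into a small set of broad function codes. The sketch below is illustrative only: the `functions` entries are a subset of those defined in the module, whereas the `aliases` table and the `function_label` helper are hypothetical and do not exist in the module.

```lua
-- Subset of the function codes defined in the module.
local functions = {
    caus = "causative",
    rep  = "repetitive",
    proc = "processive",
    res  = "resultative",
}

-- Hypothetical alias table (not in the module): finer-grained labels are
-- folded into the broad category that covers them.
local aliases = {
    factitive      = "caus",
    transitivizing = "caus",
    frequentative  = "rep",
    habitual       = "rep",
}

-- Hypothetical helper: resolve a code or alias to its display label.
local function function_label(code)
    code = aliases[code] or code
    return functions[code] or error("Function could not be recognized")
end

print(function_label("habitual"))  -- repetitive
print(function_label("caus"))      -- causative
```

A part-of-speech label such as "nominalization" is deliberately absent from both tables, so it raises an error.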


Reduplications

Overview

This follows the classification of Ehrman (1972). There are four types of reduplication:

- rhyming reduplications (2nd and 3rd segments are kept): CVC ?VC

- alliterative reduplications (1st segment is kept): CVC C??

- ablauting reduplications (1st and 3rd segments are kept): CVC C?C

- total reduplications (all segments are kept).


Cases where 1st and 2nd segments are kept are rare and will be classified as alliterative.
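
The classification above can be sketched as a comparison of the three segments of the base and echo syllables. This is only an illustration under simplifying assumptions: `redup_type` is a hypothetical helper that is not part of the module, each syllable is assumed to be already split into exactly three segments, and the segment strings used below are placeholders rather than Khmer words.

```lua
-- Hypothetical sketch, not part of Module:km-etym.
-- Each syllable is a table of three segments: { onset, vowel, coda }.
local function redup_type(base, echo)
    local c1 = base[1] == echo[1]   -- 1st segment kept
    local v  = base[2] == echo[2]   -- 2nd segment kept
    local c2 = base[3] == echo[3]   -- 3rd segment kept

    if c1 and v and c2 then
        return "total"
    elseif v and c2 then
        return "rhyming"            -- CVC ?VC
    elseif c1 and c2 then
        return "ablauting"          -- CVC C?C
    elseif c1 then
        return "alliterative"       -- CVC C??, also the rare CVC CV? case
    end
    return nil                      -- no recognized pattern
end

print(redup_type({"k", "a", "p"}, {"l", "a", "p"}))  -- rhyming
print(redup_type({"k", "a", "p"}, {"k", "i", "t"}))  -- alliterative
print(redup_type({"k", "a", "p"}, {"k", "i", "p"}))  -- ablauting
print(redup_type({"k", "a", "p"}, {"k", "a", "p"}))  -- total
```

The order of the tests matters: the check for the 1st segment alone comes last, so that the rare case where the 1st and 2nd segments are kept falls into the alliterative class.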


Non-dummy echo syllable

The reduplicated syllable (the "echo") can occupy the first or the second position, and is usually a dummy syllable void of meaning and not found as a standalone word elsewhere in the language. But sometimes the echo syllable does exist as a standalone word, in which case it can be one of two things:

- the identification with the standalone word is fortuitous and the echo syllable does not carry over its meaning (it is still a dummy syllable).

- the echo syllable is the standalone word and forms with the other syllable an "echo-tautological" compound (Renner 2005), called here a reduplicative synonymous compound, where the two words are close both semantically and phonetically (like English tiny-weeny, moan and groan).

In the first case, the template should be linked to the coincidental word using parameters l2 and t2. In the second case, the template should come after a synonymous compound template {{compound|km|type=syn}}.
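
The two cases can be sketched as follows. This is a simplified, standalone illustration rather than the module's code: `link` is a hypothetical stand-in for the expansion of the {{m}} template, `echo_note` is a hypothetical helper, the line break that the module emits before the note is replaced by a space, and BASE and ECHO are placeholders rather than real entries.

```lua
-- Hypothetical stand-in for the {{m}} template expansion used by the module.
local function link(term, gloss)
    if gloss then
        return term .. ' ("' .. gloss .. '")'
    end
    return term
end

-- Hypothetical helper: builds the note shown for a reduplication.
local function echo_note(mode_text, lemma, lemma2, gloss2)
    if lemma == nil or lemma == "" then
        -- Second case: the text only qualifies a preceding compound template.
        if lemma2 ~= nil then
            error("Argument l2 can't be used if argument 1 is not also provided")
        end
        return mode_text
    end

    local text = mode_text .. " of " .. link(lemma) .. "."
    if lemma2 ~= nil then
        -- First case: the echo coincides with an unrelated standalone word.
        text = text .. " The identification of the echo syllable with "
            .. link(lemma2, gloss2)
            .. " is coincidental, and no meaning is carried over from it."
    end
    return text
end

print(echo_note("rhyming reduplication", "BASE"))
print(echo_note("rhyming reduplication", "BASE", "ECHO", "gloss"))
print(echo_note("rhyming reduplication", nil))
```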


--[[
Read the documentation before any change.
]]

local export = {}
local gsub = mw.ustring.gsub
local sub = mw.ustring.sub
local match = mw.ustring.match
local namespace = mw.title.getCurrentTitle().nsText
	
-- 0 = redup, 1 = prefix, 2 = infix
local affixes = { ["R-"] = 0,
["p-"] = 1, ["t-"] = 1, ["c-"] = 1, ["k-"] = 1, 
["s-"] = 1, ["r-"] = 1, ["l-"] = 1, ["m-"] = 1,
--["RrV-"] = 1,
["prV-"] = 1, ["trV-"] = 1, ["crV-"] = 1,
["krV-"] = 1, ["srV-"] = 1, 
--["RVN-"] = 1,
["bVN-"] = 1, ["dVN-"] = 1, ["cVN-"] = 1,
["kVN-"] = 1, ["sVN-"] = 1, ["rVN-"] = 1,
["lVN-"] = 1, ["aN-"] = 1,
["-b-"] = 2, ["-m-"] = 2, ["-n-"] = 2,
["-r-"] = 2, ["-l-"] = 2, ["-h-"] = 2,
-- The following two are allomorphs, should be merged?
["-VN-"] = 2, ["-Vmn-"] = 2, 
-- The following two are allomorphs of -n- and -m-, keeping them separate for now
["-rVn-"] = 2, ["-rVm-"] = 2} 

-- Not used
local affixes_km = { ["R-"] = 0,
["ប-"] = 1, ["ត-"] = 1, ["ច-"] = 1, ["ក-"] = 1, 
["ស-"] = 1, ["រ-"] = 1, ["ល-"] = 1, ["ម-"] = 1,
["ប្រ-"] = 1, ["ត្រ-"] = 1, ["ច្រ-"] = 1,
["ក្រ-"] = 1, ["ស្រ-"] = 1, ["បំ-"] = 1,
["ដំ-"] = 1, ["ចំ-"] = 1, ["កំ-"] = 1,
["សំ-"] = 1, ["រំ-"] = 1, ["លំ-"] = 1, 
["-ប-"] = 2, ["-ម-"] = 2, ["-ន-"] = 2,
["-រ-"] = 2, ["-ល-"] = 2, ["-◌ំ-"] = 2,
["-◌ំន-"] = 2, ["-រន-"] = 2} 
--

--[[ Important:
# Since Khmer words can very easily slide from noun to adjective to verb to adverb without any overt morphological change, functions should be named after how the affix changes the *meaning* of the base root, not after how it changes its part of speech.
# As a consequence, do NOT create functions called "nominalization", "verbalization", etc.
]]
local functions ={
-- Valency modifiers, verbalizations
   ["caus"]  = "causative",     -- factitive, transitivizing
   ["recip"] = "reciprocal", 
   ["attrib"]= "attributive",   -- qualitative, adjectival
-- Semantic shifts
   ["spec"]  = "specializing",  -- similative, figurative, directional,
                                -- sometimes no particular meaning added
                                -- more of a catch-all category
   ["intens"]= "intensive",     -- emphatic, directional, augmentative
   ["rep"]   = "repetitive",    -- frequentative, habitual, distributive
-- Nominalizations
   ["agent"] = "agentive", 
   ["instr"] = "instrumental", 
   ["loc"]   = "locative", 
   ["proc"]  = "processive",    -- action noun, gerund, sometimes resultative too
   ["res"]   = "resultative",
-- Numbers
   ["sing"]  = "singularizing", 
   ["coll"]  = "collective" 
} 

-- redup
local redup_modes = {
["rhy"] = "rhyming", ["rhyme"] = "rhyming",
["allit"] = "alliterative",
["ablaut"] = "ablauting"
--[nil]="", --used for total reduplication
} 

-- Returns the display name of a reduplication mode.
-- A nil or empty mode stands for total reduplication.
local function get_redup_mode(mode)
    if mode == nil or mode == "" then
        return "reduplication"
    end

    local mode_text = redup_modes[mode]
    if mode_text == nil then
        error("Reduplication mode could not be recognized")
    end

    return mode_text .. " reduplication"
end

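-- Returns an extra sentence when the base is written with a leading hyphen,
-- i.e. when it is only attested in derived forms.
local function get_base_only_text(root)
    -- Kept local so that the callers' own "text" is never overwritten.
    local text = ""
    if root ~= nil and root ~= "" and root ~= "-" and sub(root, 1, 1) == "-" then
        text = " This base is only found in derived forms."
    end
    return text
end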


function export.affix(frame)
    local args = frame:getParent().args

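    -- Template arguments, kept local to avoid leaking globals.
    local root = args[1]
    local gloss = args[2]
    local affix = args[3] or args["af"]
    local func = args[4]
    local ultim = args["ultim"]
    local compound = args["compound"]
    local nocap = args["nocap"]
    local nopunct = args["nopunct"]
    local tr = args["tr"]
    local affix2 = args["af2"]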

    -- "-" as root means that no base is given; otherwise link to the base.
    local root_text = ""
    if root == "-" then
        root_text = "form"
    elseif root ~= nil and root ~= "" then
        root_text = "from " .. frame:expandTemplate{ title = "m", args = {"km", root, nil, gloss = gloss, sc = "Khmr", tr = tr}}
    end

    -- Builds the display text and the category text for one affix.
    local function describe_affix(af, err_msg)
        local af_type = affixes[af] or error(err_msg)
        if af_type == 0 then
            return "reduplicated initial", "reduplicated initial"
        end
        local kind = af_type == 1 and "prefixed" or "infixed"
        return kind .. " ''[[" .. af .. "]]''", kind .. " " .. af
    end

    local affix_text, affix_cat_text = describe_affix(affix, "Affix could not be recognized")

    -- Optional second affix, appended after "and".
    local affix2_text = ""
    local affix2_cat_text = ""
    if affix2 ~= nil then
        affix2_text, affix2_cat_text = describe_affix(affix2, "Affix2 could not be recognized")
        affix2_text = " and " .. affix2_text
    end

    -- Optional function label, shown in parentheses after the affix.
    local func_string = ""
    local func_text = ""
    if func ~= nil then
        func_string = functions[func] or error("Function could not be recognized")
    end
    if func_string ~= "" then
        func_text = " (" .. func_string .. ")"
    end

    -- compound=1 marks a plain compound; any other value is a reduplication mode.
    local compound_text = ""
    local redup_text = ""
    if compound ~= nil then
        compound_text = ", applied to each term of the"
        if compound == "1" then
            compound_text = compound_text .. " compound"
        else
            redup_text = get_redup_mode(compound)
            compound_text = compound_text .. " " .. redup_text
        end
    end

    local ultim_text = ""
    if ultim ~= nil then
        ultim_text = ", ultimately from " .. frame:expandTemplate{ title = "m", args = {"km", ultim, nil, gloss = nil, sc = "Khmr"}}
    end

    -- Without a root, only the linked affix is returned.
    local text
    if root_text ~= "" then
        text = root_text .. " with " .. affix_text .. func_text
            .. affix2_text .. compound_text .. ultim_text

        if nopunct == nil then
            text = text .. "."
        end

        text = text .. get_base_only_text(root)

        if nocap == nil then
            text = mw.getContentLanguage():ucfirst(text)
        end
    else
        text = "''[[" .. affix .. "]]''"
    end


-- Categories
  if namespace == "" then 
    -- Will activate affix categories later
    -- text = text.."[[Category:Khmer terms with ".. affix_cat_text.."]]" 

    if affix2 ~= nil then
        -- text = text.."[[Category:Khmer terms with ".. affix_cat_text.."]]" 
    end
    
    -- will activate function categories later
    if func_string ~= "" then
        -- text = text.."[[Category:Khmer terms with ".. func_string.." affix]]" 
    end

    if compound ~= nil then
    	if compound == "1" then
    	   text = text.."[[Category:Khmer compound terms]]" 
    	elseif redup_text ~= nil then
    	   text = text.. "[[Category:Khmer " ..  redup_text.. "s]]"
    	end
        --text = text.."[[Category:Khmer compounds with duplicated affix]]" 
    end
  end
  return text
end


function export.redup(frame)
    local args = frame:getParent().args

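    -- Template arguments, kept local to avoid leaking globals.
    local lemma = args[1]
    local gloss = args[2]
    local mode = args[3] or args["type"]
    local lemma2 = args["l2"]
    local gloss2 = args["t2"]
    local tr = args["tr"]
    local nocap = args["nocap"]

    local mode_text = get_redup_mode(mode)

    -- Indefinite article matching the mode name (currently unused).
    local article = "a"
    if sub(mode_text, 1, 1) == "a" then
        article = "an"
    end

    local text = ""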

    -- when lemma isn't provided, it is assumed it comes after a {{compound|km|type=syn}} template, that this module further qualifies.

	if lemma == nil or lemma == "" then
       text = mode_text
       nocap = 1
    else
	    text = mode_text.. " of ".. frame:expandTemplate{ title = "m", args = {"km", lemma, nil, gloss = gloss, sc = "Khmr", tr = tr}} .. "."
    end
    
    --
    text = text.. get_base_only_text(lemma)

    -- case of coincidental echo word
    if lemma2 ~= nil then 
        if lemma == nil or lemma == "" then
            error("Argument l2 can't be used if argument 1 is not also provided")
        end

        text = text.."<br>The identification of the echo syllable with "..frame:expandTemplate{ title = "m", args = {"km", lemma2, nil, gloss = gloss2, sc = "Khmr"}}.. " is coincidental, and no meaning is carried over from it."
    end        
   
    if nocap == nil then
        text = mw.getContentLanguage():ucfirst(text)
    end

-- Categories
    if namespace == "" then
       text = text.."[[Category:Khmer " .. mode_text.. "s]]" 
     
       if lemma == nil or lemma == "" then
          -- Note: not always found in synonymous compounds, eg. ធីងធៅង from onomatopoeic source, or ពីងពាង from mkh-pro
          --text = text.."[[Category:Khmer reduplicative synonymous compounds]]" 
       end
    end
    return text
end

return export