User:This, that and the other/EL headers

From Wiktionary, the free dictionary

Header statistics from October 2024 dump

Excludes misspelled headers ("Adective", "Nonu", ...), headers that only appear once, and certain explicitly disallowed headers where the disposition is obvious (e.g. uses of the "Noun form" POS header are clearly errors for "Noun").
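
The dump was not necessarily processed this way, but the counting and the "appears only once" exclusion can be sketched as follows. This assumes each page's wikitext is already available as a string; the function names `tally_headers` and `drop_singletons` are made up for illustration. It counts every header below a level-2 language header, so non-POS headers such as Etymology would still need to be filtered out separately.

```python
import re
from collections import Counter, defaultdict

# Matches a wikitext heading line such as "==English==" or "===Noun===".
HEADING_RE = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")


def tally_headers(pages):
    """Count sub-language headers, overall and per language.

    `pages` is an iterable of wikitext strings (hypothetical input format).
    """
    totals = Counter()
    by_language = defaultdict(Counter)
    for text in pages:
        language = None  # current level-2 (language) header, if any
        for line in text.splitlines():
            match = HEADING_RE.match(line)
            if not match:
                continue
            level, name = len(match.group(1)), match.group(2)
            if level == 2:
                language = name
            elif language is not None:
                totals[name] += 1
                by_language[name][language] += 1
    return totals, by_language


def drop_singletons(totals):
    """Discard headers that occur only once, as the table above does."""
    return {name: count for name, count in totals.items() if count > 1}
```

Misspelled headers that occur more than once ("Adective", "Nonu") would survive `drop_singletons` and would still have to be removed by hand or against a list of known header names.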

Name Total occurrences EL status Top 10 languages
Noun 3639301 OK English (751453), Russian (209515), Swedish (201726), Finnish (165376), Spanish (122932), German (121521), Italian (118358), French (86821), Chinese (83269), Latin (78241), Dutch (69851), and more
Verb 3245280 OK Spanish (544871), Latin (477975), Italian (358216), Portuguese (259609), French (216506), English (188298), Galician (177530), Russian (146223), Catalan (132822), German (73155), Arabic (47292), and more
Adjective 1124404 OK English (172024), German (124723), Latin (98415), Italian (74346), Spanish (72332), French (45596), Esperanto (39614), Latvian (38397), Portuguese (36229), Swedish (32450), Finnish (23892), and more
Participle 514167 OK Latin (196269), Russian (50525), Latvian (43877), Italian (41324), Spanish (30200), French (28711), Bulgarian (17907), Portuguese (17187), German (12283), Catalan (11010), Swedish (8615), and more
Proper noun 502299 OK English (163048), Polish (28237), Translingual (22369), Romanian (21037), Chinese (18006), Japanese (16609), Italian (15102), Latin (13940), Dutch (12100), Finnish (10341), Portuguese (9841), and more
Adverb 137744 OK English (25774), Esperanto (8717), Finnish (7165), Italian (5146), Spanish (4682), Serbo-Croatian (4604), French (4477), Polish (3921), Macedonian (3562), Chinese (3088), Latvian (2937), and more
Romanization 113588 OK Mandarin (67761), Japanese (26945), Gothic (11266), Cantonese (2115), Javanese (1751), Sumerian (1150), Egyptian (610), Balinese (584), Sundanese (502), Manchu (336), Vietnamese (203), and more
Han character 44832 OK Translingual (37162), Vietnamese (7542), Tày (128)
Pronoun 30684 OK Hungarian (1717), Finnish (994), English (862), Middle English (824), Russian (665), Latin (656), Chinese (568), Ancient Greek (437), Irish (435), Polish (435), Pali (428), and more
Numeral 28778 OK Latin (1101), Finnish (729), Russian (705), Polish (669), German (421), Hungarian (415), English (374), Malay (348), Cebuano (288), Greek (267), Tagalog (250), and more
Definitions 28732 Unapproved Chinese (27309), Tangut (1036), Japanese (352), Korean (15), Khitan (6), Vietnamese (4), Hokkien (3), Proto-Sino-Tibetan (2), Amharic (1), Central Dusun (1), Faliscan (1), and more
Interjection 20343 OK English (4251), French (693), Portuguese (669), Polish (659), Japanese (613), Finnish (607), Swedish (572), Chinese (553), Spanish (496), Russian (494), Dutch (438), and more
Suffix 20082 OK Latin (1378), English (1270), Hungarian (877), Spanish (657), Japanese (585), Finnish (574), Italian (560), Middle English (549), Korean (508), Old English (469), Polish (417), and more
Letter 19425 OK Translingual (4136), Osage (198), Telugu (176), Carrier (175), English (171), Abkhaz (164), Tagalog (155), Vietnamese (145), Juǀ'hoan (131), Tlingit (130), Serbo-Croatian (124), and more
Phrase 15671 OK English (4441), Finnish (798), Chinese (766), Spanish (647), French (553), German (458), Japanese (425), Portuguese (343), Russian (331), Dutch (310), Polish (262), and more
Kanji 14355 OK Japanese (13937), Okinawan (173), Miyako (51), Kunigami (45), Yonaguni (45), Yaeyama (38), Yoron (15), Okinoerabu (13), Kikai (11), Northern Amami Ōshima (11), Southern Amami Ōshima (9), and more
Prefix 11351 OK English (2159), Italian (551), French (341), Irish (327), Spanish (285), Greek (282), Polish (281), Finnish (267), German (246), Japanese (241), Portuguese (230), and more
Preposition 10301 OK English (814), French (307), Polish (278), Spanish (234), Middle English (217), Serbo-Croatian (202), Norwegian Nynorsk (199), German (180), Irish (174), Italian (171), Hebrew (169), and more
Hanja 9806 OK Korean (9806)
Idiom 8915 OK Chinese (8376), Japanese (230), Vietnamese (102), Russian (29), Korean (23), Swedish (16), Serbo-Croatian (12), Icelandic (11), Turkish (11), Dutch (10), Malayalam (10), and more
Symbol 8657 OK Translingual (7362), English (424), Egyptian (187), Japanese (76), Chinese (67), Swedish (53), Slovene (50), Undetermined (47), German (36), Arabic (33), Khmer (32), and more
Conjunction 8085 OK Chinese (362), English (349), Italian (229), Polish (207), French (152), Japanese (146), Middle English (138), Spanish (127), Serbo-Croatian (121), Old Polish (117), Latin (109), and more
Determiner 6866 OK Middle English (403), English (304), Russian (222), Gothic (182), Ancient Greek (142), Spanish (131), Ukrainian (130), Korean (128), Norwegian Nynorsk (128), Bulgarian (124), Romanian (123), and more
Proverb 6463 OK English (1423), Chinese (743), Finnish (605), French (293), Polish (266), Portuguese (266), German (234), Spanish (226), Japanese (204), Telugu (170), Czech (165), and more
Particle 5070 OK Polish (652), Japanese (193), Pali (152), Burmese (138), Korean (132), Irish (113), Vietnamese (102), Russian (82), English (80), Chinese (74), Old Polish (73), and more
Hanzi 4652 OK Mandarin (2631), Cantonese (2020), Chinese (1)
Prepositional phrase 4598 OK English (2776), Italian (597), Dutch (227), French (215), German (116), Norwegian Nynorsk (90), Welsh (85), Swedish (74), Irish (38), Greek (32), Romanian (30), and more
Root 4278 OK Arabic (1399), Proto-Indo-European (775), Sanskrit (751), Hebrew (281), Navajo (208), Korean (163), Assyrian Neo-Aramaic (120), Proto-Kartvelian (75), Murui Huitoto (59), South Levantine Arabic (56), Proto-Georgian-Zan (42), and more
Contraction 2652 OK English (847), Romanian (230), Portuguese (191), Italian (94), Yola (93), German (92), French (86), Galician (74), Irish (68), Middle Dutch (62), Asturian (54), and more
Postposition 2167 OK Finnish (260), Hungarian (119), Navajo (97), Bengali (89), Ye'kwana (80), Ingrian (79), Azerbaijani (73), Hindi (72), Armenian (48), Estonian (48), Votic (44), and more
Syllable 1417 OK Japanese (787), Korean (367), Vai (260), Ainu (2), Japanese Sign Language (1)
Article 1084 OK Old Irish (35), German (34), Greek (32), Ancient Greek (29), English (25), Volapük (24), Maltese (20), Romanian (19), Italian (18), Catalan (17), Dutch (17), and more
Classifier 966 OK Chinese (157), Thai (109), Zhuang (109), Vietnamese (88), Assamese (66), Khmer (57), Burmese (52), Murui Huitoto (32), Garo (27), S'gaw Karen (25), Indonesian (24), and more
Cuneiform sign 798 Unapproved Translingual (798)
Affix 746 OK Japanese (595), Garo (54), Telugu (29), Vietnamese (22), Blackfoot (14), Fula (7), Ido (7), Korean (6), Dhivehi (2), Sanskrit (2), English (1), and more
Ideophone 732 OK Korean (459), Hausa (136), Yoruba (77), Ye'kwana (16), Basque (10), Xhosa (7), Jeju (5), Middle Korean (4), Nupe (3), Zulu (3), Chichewa (2), and more
Ligature 677 OK Translingual (352), Telugu (235), Malayalam (61), Dhivehi (10), Tagalog (4), Gujarati (3), Hindi (3), Nepali (3), Kapampangan (2), Marathi (2), Sanskrit (1), and more
Diacritical mark 555 OK Translingual (253), Old Norse (18), Kannada (17), Khmer (16), English (15), Ancient Greek (14), Hebrew (12), Latin (12), Punjabi (12), Lepcha (10), Malayalam (10), and more
Punctuation mark 522 OK Translingual (214), English (44), Japanese (36), Chinese (24), Korean (20), Arabic (11), French (11), Armenian (9), Mongolian (8), N'Ko (8), Sylheti (7), and more
Verbal noun 459 Arguably disallowed Georgian (439), Old Georgian (7), Mingrelian (6), Aghwan (3), Laz (3), Persian (1)
Logogram 444 OK Akkadian (437), Hittite (6), Luwian (1)
Number 431 OK Translingual (85), English (59), Minoan (45), Mycenaean Greek (45), Romansch (35), Japanese (16), Thai (12), Kaurna (11), Tillamook (11), Gujarati (10), Kabuverdianu (10), and more
Predicative 374 Unapproved Russian (168), Ukrainian (93), Belarusian (20), Azerbaijani (18), Bashkir (17), Kazakh (6), Khalaj (6), Bulgarian (4), Lower Sorbian (4), Yakut (4), Macanese (3), and more
Relative 320 Unapproved Xhosa (186), Zulu (43), Swazi (34), Southern Ndebele (28), Phuthi (16), Northern Ndebele (6), Sotho (5), Lala (South Africa) (1), Proto-Nguni (1)
Counter 303 OK Japanese (195), Korean (80), Okinawan (9), Jeju (5), Uyghur (4), Nuosu (3), Middle Korean (2), Nepali (2), Yucatec Maya (2), Classical Nahuatl (1)
Reconstruction 200 Unapproved Proto-Uralic (48), Proto-Indo-European (40), Proto-Mongolic (20), Proto-Japonic (18), Proto-Finnic (10), Tocharian B (10), Proto-Turkic (8), Proto-Georgian-Zan (7), Proto-Kartvelian (6), Proto-Ryukyuan (6), Proto-Slavic (6), and more
Interfix 191 OK English (35), Finnish (15), German (10), Norwegian Nynorsk (8), Dutch (6), French (6), Hungarian (6), Polish (5), Swedish (5), Navajo (4), Portuguese (4), and more
Combining form 159 OK Russian (83), Japanese (32), Ojibwe (12), Ancient Greek (11), Ukrainian (8), Vietnamese (6), Miyako (3), English (2), Estonian (2)
Infix 156 OK Swahili (48), English (47), Chichewa (8), Malay (5), Indonesian (4), French (3), Chinese (2), Hawaiian (2), Old Javanese (2), Portuguese (2), Serbo-Croatian (2), and more
Preverb 129 Unapproved Ojibwe (50), Georgian (23), Laz (11), Munsee (11), Unami (11), Ottawa (7), Aghwan (4), Wiyot (3), Chickasaw (2), Warlpiri (2), Eastern Ojibwa (1), and more
Circumfix 124 OK Tagalog (30), Indonesian (22), Georgian (13), Dutch (9), Ye'kwana (8), German (5), Malay (5), English (4), Guaraní (4), Kangean (3), Laz (3), and more
Transliteration 123 Unapproved Hittite (123)
Adnominal 108 Unapproved Japanese (90), Old Japanese (17), Okinawan (1)
Final 101 Unapproved Ojibwe (87), Ottawa (11), Blackfoot (2), Munsee (1)
Stem 94 Unapproved Navajo (90), Tuvan (3), Indonesian (1)
Phoneme 79 Unapproved Northern Qiandong Miao (40), Chinese (39)
Initial 78 Unapproved Ojibwe (68), Ottawa (10)
Chữ Hán 61 Unapproved - should be "Han character" Vietnamese (61)
Dependent noun 48 Arguably disallowed Korean (40), Malecite-Passamaquoddy (3), Middle Korean (2), Burmese (1), Early Modern Korean (1), Jeju (1)
Circumposition 36 OK Northern Kurdish (16), Dutch (11), Chinese (3), German (2), Pashto (2), Central Kurdish (1), Danish (1)
Component 32 Unapproved Tangut (32)
Determinative 27 OK Akkadian (18), Sumerian (9)
Verb Root 25 Unapproved Unami (15), Cebuano (10)
Iteration mark 24 Unapproved Japanese (11), Chinese (3), Egyptian (1), Indonesian (1), Khmer (1), Lao (1), Malay (1), Nuosu (1), Tagalog (1), Tangut (1), Thai (1), and more
Medial 24 Unapproved Ojibwe (23), Blackfoot (1)
Clitic 23 Explicitly disallowed Afar (7), Mongolian (7), Proto-Dravidian (2), Amharic (1), Czech (1), Finnish (1), Mohawk (1), Onondaga (1), Swedish (1), Tzotzil (1)
Phonogram 22 Unapproved Old Japanese (11), Old Korean (11)
Compound part 16 Unapproved Vietnamese (16)
Prenoun 16 Unapproved Munsee (9), Ottawa (4), Menominee (1), Ojibwe (1), Unami (1)
Enclitic 12 Arguably disallowed - "Clitic" Greenlandic (8), Makasar (3), Marshallese (1)
Demonstrative 10 Unapproved Mokilese (9), Pa'o Karen (1)
Adjective form 9 Explicitly disallowed - "(POS) form" Korean (5), Pali (3), Japanese (1)
Composition 8 Unapproved Translingual (8)
Multiple parts of speech 8 Unapproved English (8)
Past participle 7 Explicitly disallowed - "(attribute) (POS)" Fala (5), Assyrian Neo-Aramaic (2)
Ambiposition 6 OK Northern Sami (4), Kildin Sami (1), Sanskrit (1)
Noun Root 6 Unapproved Unami (6)
Word 6 Unapproved Lithuanian (5), Celtiberian (1)
Arabicization 3 Unapproved Javanese (3)
Misspelling 3 Unapproved Finnish (1), German (1), Portuguese (1)
Abbreviation 2 Explicitly disallowed Dutch (1), Ottoman Turkish (1)
Adverbial phrase 2 Explicitly disallowed - "(attribute) (POS)" Galician (1), Yakut (1)
Gender classifier 2 Explicitly disallowed - "(attribute) (POS)" Hiw (2)
Interrogative pronoun 2 Explicitly disallowed - "(attribute) (POS)" Ahom (1), Shan (1)
Modifier 2 Unapproved Maori (2)
Nominal 2 Unapproved Nhanda (2)
Onomatopoeia 2 Unapproved Korean (1), Yakut (1)
Simulfix 2 Unapproved Old Irish (1), Translingual (1)

Issues needing sorting out

  • Definitions header is not approved. Clearly it should be. It has long been used in Chinese entries.
  • Is Combining form required? All of these look as if they could be classified as prefixes or suffixes.
    • Most combining forms are Russian, where they really look like verb-forming suffixes. Need to find out what term the literature uses for these.
    • See Cat:Combining forms by language (which is curiously missing the Russian ones)
  • Cuneiform sign - Letter and Symbol don't fit, so I think we should explicitly allow it, just like we allow Han character.
    • Or is this supposed to be under Logogram? I'm not sure that I understand the difference. Note that we don't use Ideogram for Han characters.
  • Verbal noun should be disallowed. It is currently used in Kartvelian languages. However, Noun is sufficient, since these entries are already marked as verbal nouns by the form-of template {{verbal noun of}}. Compare Cat:Arabic verbal nouns, whose entries use the Noun POS.
  • Predicative and Relative; Stem; Phoneme - to check
  • Transliteration - well, these are all Romanizations currently. But then we have Arabicization appearing for a handful of Javanese entries...
  • A lot of languages - Algonquian and Kartvelian and probably others - seem to want Preverb. It may need to be added.
  • Special POS used by certain languages:

Draft

Part of speech

The part of speech (POS) is a descriptor like “Noun” or “Adjective” that defines the class of every term, phrase, symbol, morpheme and other lexical unit. [Because Wiktionary covers all the world's languages, it is necessary to construe the term "part of speech" somewhat more broadly than the traditional interpretation.]

Each entry has one or more POS sections. In each, there is a headword line, followed by the definitions themselves.[1]

[Only POS headers approved by the community may be used.]

The following POS headers are approved for use in any entry (although not all of these headers are applicable to every language):

  • Parts of speech: Adjective, Adverb, Ambiposition, Article, Circumposition, Classifier, Conjunction, Contraction, Counter, Determiner, Ideophone, Interjection, Noun, Numeral, Participle, Particle, Postposition, Preposition, Pronoun, Proper noun, Verb
  • Morphemes: Circumfix, Combining form, Infix, Interfix, Prefix, Root, Suffix
  • Symbols and characters: Diacritical mark, Letter, Ligature, Number, Punctuation mark, Syllable, Symbol
  • Phrases: Phrase, Prepositional phrase[2], Proverb
  • Han characters: Hanzi (Sinitic languages), Kanji (Japanese), Hanja (Korean), Han character (all others including Translingual)
  • Romanization
  • Cuneiform-specific: Logogram, Determinative

The Definitions POS header is allowed in Chinese and Tangut entries. Terms in these languages are often highly polysemic and there is no inflection information to provide, so a strict distinction between parts of speech is counterproductive. The Definitions header may also be used in exceptional circumstances where the POS is unknown (for instance, terms whose meaning is highly uncertain).

Other headers can be proposed as new additions to the list. The use of nonstandard POS headers may cause an entry to be categorized in a cleanup category for further inspection.

The following POS headers are explicitly disallowed:

  • Abbreviation, Acronym, Initialism – use the POS header(s) corresponding to the function of the term (for instance, English ATM uses the "Noun" POS header)
  • Clitic
  • Gerund
  • Idiom
  • “(POS) form”: Verb form, Noun form, etc. – these POSs are allowed and routinely used as a parameter to the {{head}} template, but are not allowed as headers
  • “(POS) phrase”: Noun phrase, Verb phrase, etc. (with the exception of Prepositional phrase)
  • “(attribute) (POS)”: Transitive verb, Personal pronoun, Verbal noun, etc. (with the exception of Proper noun)
  • “(POS) (number)”: Noun 1, Noun 2, etc.
  • Cardinal number, Ordinal number, Cardinal numeral, Ordinal numeral
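
As a rough illustration of how the draft rules above could be applied mechanically (for instance, to populate a cleanup category), here is a minimal sketch. The function name `classify_header`, the three return labels and the pattern heuristics are assumptions made for this example; they are one reading of the draft, not an existing tool or an agreed procedure.

```python
import re

# Headers the draft approves for any entry (copied from the lists above).
APPROVED = {
    "Adjective", "Adverb", "Ambiposition", "Article", "Circumposition",
    "Classifier", "Conjunction", "Contraction", "Counter", "Determiner",
    "Ideophone", "Interjection", "Noun", "Numeral", "Participle", "Particle",
    "Postposition", "Preposition", "Pronoun", "Proper noun", "Verb",
    "Circumfix", "Combining form", "Infix", "Interfix", "Prefix", "Root",
    "Suffix", "Diacritical mark", "Letter", "Ligature", "Number",
    "Punctuation mark", "Syllable", "Symbol", "Phrase",
    "Prepositional phrase", "Proverb", "Hanzi", "Kanji", "Hanja",
    "Han character", "Romanization", "Logogram", "Determinative",
}
# Headers the draft disallows by name.
DISALLOWED = {
    "Abbreviation", "Acronym", "Initialism", "Clitic", "Gerund", "Idiom",
    "Cardinal number", "Ordinal number", "Cardinal numeral", "Ordinal numeral",
}
# Languages in which the draft allows the Definitions header.
DEFINITIONS_LANGUAGES = {"Chinese", "Tangut"}


def classify_header(header, language):
    """Return "approved", "disallowed" or "unapproved" for a POS header."""
    if header in APPROVED:
        return "approved"
    if header == "Definitions":
        # Exceptional use in other languages cannot be checked mechanically.
        return "approved" if language in DEFINITIONS_LANGUAGES else "unapproved"
    if header in DISALLOWED:
        return "disallowed"
    # "(POS) (number)", e.g. "Noun 2".
    numbered = re.fullmatch(r"(.+) \d+", header)
    if numbered and numbered.group(1) in APPROVED:
        return "disallowed"
    # Multi-word headers: "(POS) form", "(POS) phrase", "(attribute) (POS)".
    first, _, last = header.rpartition(" ")
    if first:
        if last in ("form", "phrase") and first in APPROVED:
            return "disallowed"
        if last.capitalize() in APPROVED:
            return "disallowed"
    return "unapproved"
```

The approved set is checked first so that the named exceptions (Proper noun, Prepositional phrase, Combining form) are never caught by the pattern rules. The heuristics are cruder than the judgement applied in the statistics table: for example, this sketch reports "Verb Root" as disallowed because it ends in an approved header name, whereas the table lists it as merely unapproved.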