User:This, that and the other/EL headers
Header statistics from October 2024 dump
Excludes misspelled headers ("Adective", "Nonu", ...), headers that only appear once, and certain explicitly disallowed headers where the disposition is obvious (e.g. uses of the "Noun form" POS header are clearly errors for "Noun").
name | Total occurrences | EL status | Top 10 languages |
---|---|---|---|
Noun | 3639301 | OK | English (751453), Russian (209515), Swedish (201726), Finnish (165376), Spanish (122932), German (121521), Italian (118358), French (86821), Chinese (83269), Latin (78241), Dutch (69851), and more |
Verb | 3245280 | OK | Spanish (544871), Latin (477975), Italian (358216), Portuguese (259609), French (216506), English (188298), Galician (177530), Russian (146223), Catalan (132822), German (73155), Arabic (47292), and more |
Adjective | 1124404 | OK | English (172024), German (124723), Latin (98415), Italian (74346), Spanish (72332), French (45596), Esperanto (39614), Latvian (38397), Portuguese (36229), Swedish (32450), Finnish (23892), and more |
Participle | 514167 | OK | Latin (196269), Russian (50525), Latvian (43877), Italian (41324), Spanish (30200), French (28711), Bulgarian (17907), Portuguese (17187), German (12283), Catalan (11010), Swedish (8615), and more |
Proper noun | 502299 | OK | English (163048), Polish (28237), Translingual (22369), Romanian (21037), Chinese (18006), Japanese (16609), Italian (15102), Latin (13940), Dutch (12100), Finnish (10341), Portuguese (9841), and more |
Adverb | 137744 | OK | English (25774), Esperanto (8717), Finnish (7165), Italian (5146), Spanish (4682), Serbo-Croatian (4604), French (4477), Polish (3921), Macedonian (3562), Chinese (3088), Latvian (2937), and more |
Romanization | 113588 | OK | Mandarin (67761), Japanese (26945), Gothic (11266), Cantonese (2115), Javanese (1751), Sumerian (1150), Egyptian (610), Balinese (584), Sundanese (502), Manchu (336), Vietnamese (203), and more |
Han character | 44832 | OK | Translingual (37162), Vietnamese (7542), Tày (128) |
Pronoun | 30684 | OK | Hungarian (1717), Finnish (994), English (862), Middle English (824), Russian (665), Latin (656), Chinese (568), Ancient Greek (437), Irish (435), Polish (435), Pali (428), and more |
Numeral | 28778 | OK | Latin (1101), Finnish (729), Russian (705), Polish (669), German (421), Hungarian (415), English (374), Malay (348), Cebuano (288), Greek (267), Tagalog (250), and more |
Definitions | 28732 | Unapproved | Chinese (27309), Tangut (1036), Japanese (352), Korean (15), Khitan (6), Vietnamese (4), Hokkien (3), Proto-Sino-Tibetan (2), Amharic (1), Central Dusun (1), Faliscan (1), and more |
Interjection | 20343 | OK | English (4251), French (693), Portuguese (669), Polish (659), Japanese (613), Finnish (607), Swedish (572), Chinese (553), Spanish (496), Russian (494), Dutch (438), and more |
Suffix | 20082 | OK | Latin (1378), English (1270), Hungarian (877), Spanish (657), Japanese (585), Finnish (574), Italian (560), Middle English (549), Korean (508), Old English (469), Polish (417), and more |
Letter | 19425 | OK | Translingual (4136), Osage (198), Telugu (176), Carrier (175), English (171), Abkhaz (164), Tagalog (155), Vietnamese (145), Juǀ'hoan (131), Tlingit (130), Serbo-Croatian (124), and more |
Phrase | 15671 | OK | English (4441), Finnish (798), Chinese (766), Spanish (647), French (553), German (458), Japanese (425), Portuguese (343), Russian (331), Dutch (310), Polish (262), and more |
Kanji | 14355 | OK | Japanese (13937), Okinawan (173), Miyako (51), Kunigami (45), Yonaguni (45), Yaeyama (38), Yoron (15), Okinoerabu (13), Kikai (11), Northern Amami Ōshima (11), Southern Amami Ōshima (9), and more |
Prefix | 11351 | OK | English (2159), Italian (551), French (341), Irish (327), Spanish (285), Greek (282), Polish (281), Finnish (267), German (246), Japanese (241), Portuguese (230), and more |
Preposition | 10301 | OK | English (814), French (307), Polish (278), Spanish (234), Middle English (217), Serbo-Croatian (202), Norwegian Nynorsk (199), German (180), Irish (174), Italian (171), Hebrew (169), and more |
Hanja | 9806 | OK | Korean (9806) |
Idiom | 8915 | OK | Chinese (8376), Japanese (230), Vietnamese (102), Russian (29), Korean (23), Swedish (16), Serbo-Croatian (12), Icelandic (11), Turkish (11), Dutch (10), Malayalam (10), and more |
Symbol | 8657 | OK | Translingual (7362), English (424), Egyptian (187), Japanese (76), Chinese (67), Swedish (53), Slovene (50), Undetermined (47), German (36), Arabic (33), Khmer (32), and more |
Conjunction | 8085 | OK | Chinese (362), English (349), Italian (229), Polish (207), French (152), Japanese (146), Middle English (138), Spanish (127), Serbo-Croatian (121), Old Polish (117), Latin (109), and more |
Determiner | 6866 | OK | Middle English (403), English (304), Russian (222), Gothic (182), Ancient Greek (142), Spanish (131), Ukrainian (130), Korean (128), Norwegian Nynorsk (128), Bulgarian (124), Romanian (123), and more |
Proverb | 6463 | OK | English (1423), Chinese (743), Finnish (605), French (293), Polish (266), Portuguese (266), German (234), Spanish (226), Japanese (204), Telugu (170), Czech (165), and more |
Particle | 5070 | OK | Polish (652), Japanese (193), Pali (152), Burmese (138), Korean (132), Irish (113), Vietnamese (102), Russian (82), English (80), Chinese (74), Old Polish (73), and more |
Hanzi | 4652 | OK | Mandarin (2631), Cantonese (2020), Chinese (1) |
Prepositional phrase | 4598 | OK | English (2776), Italian (597), Dutch (227), French (215), German (116), Norwegian Nynorsk (90), Welsh (85), Swedish (74), Irish (38), Greek (32), Romanian (30), and more |
Root | 4278 | OK | Arabic (1399), Proto-Indo-European (775), Sanskrit (751), Hebrew (281), Navajo (208), Korean (163), Assyrian Neo-Aramaic (120), Proto-Kartvelian (75), Murui Huitoto (59), South Levantine Arabic (56), Proto-Georgian-Zan (42), and more |
Contraction | 2652 | OK | English (847), Romanian (230), Portuguese (191), Italian (94), Yola (93), German (92), French (86), Galician (74), Irish (68), Middle Dutch (62), Asturian (54), and more |
Postposition | 2167 | OK | Finnish (260), Hungarian (119), Navajo (97), Bengali (89), Ye'kwana (80), Ingrian (79), Azerbaijani (73), Hindi (72), Armenian (48), Estonian (48), Votic (44), and more |
Syllable | 1417 | OK | Japanese (787), Korean (367), Vai (260), Ainu (2), Japanese Sign Language (1) |
Article | 1084 | OK | Old Irish (35), German (34), Greek (32), Ancient Greek (29), English (25), Volapük (24), Maltese (20), Romanian (19), Italian (18), Catalan (17), Dutch (17), and more |
Classifier | 966 | OK | Chinese (157), Thai (109), Zhuang (109), Vietnamese (88), Assamese (66), Khmer (57), Burmese (52), Murui Huitoto (32), Garo (27), S'gaw Karen (25), Indonesian (24), and more |
Cuneiform sign | 798 | Unapproved | Translingual (798) |
Affix | 746 | OK | Japanese (595), Garo (54), Telugu (29), Vietnamese (22), Blackfoot (14), Fula (7), Ido (7), Korean (6), Dhivehi (2), Sanskrit (2), English (1), and more |
Ideophone | 732 | OK | Korean (459), Hausa (136), Yoruba (77), Ye'kwana (16), Basque (10), Xhosa (7), Jeju (5), Middle Korean (4), Nupe (3), Zulu (3), Chichewa (2), and more |
Ligature | 677 | OK | Translingual (352), Telugu (235), Malayalam (61), Dhivehi (10), Tagalog (4), Gujarati (3), Hindi (3), Nepali (3), Kapampangan (2), Marathi (2), Sanskrit (1), and more |
Diacritical mark | 555 | OK | Translingual (253), Old Norse (18), Kannada (17), Khmer (16), English (15), Ancient Greek (14), Hebrew (12), Latin (12), Punjabi (12), Lepcha (10), Malayalam (10), and more |
Punctuation mark | 522 | OK | Translingual (214), English (44), Japanese (36), Chinese (24), Korean (20), Arabic (11), French (11), Armenian (9), Mongolian (8), N'Ko (8), Sylheti (7), and more |
Verbal noun | 459 | Arguably disallowed | Georgian (439), Old Georgian (7), Mingrelian (6), Aghwan (3), Laz (3), Persian (1) |
Logogram | 444 | OK | Akkadian (437), Hittite (6), Luwian (1) |
Number | 431 | OK | Translingual (85), English (59), Minoan (45), Mycenaean Greek (45), Romansch (35), Japanese (16), Thai (12), Kaurna (11), Tillamook (11), Gujarati (10), Kabuverdianu (10), and more |
Predicative | 374 | Unapproved | Russian (168), Ukrainian (93), Belarusian (20), Azerbaijani (18), Bashkir (17), Kazakh (6), Khalaj (6), Bulgarian (4), Lower Sorbian (4), Yakut (4), Macanese (3), and more |
Relative | 320 | Unapproved | Xhosa (186), Zulu (43), Swazi (34), Southern Ndebele (28), Phuthi (16), Northern Ndebele (6), Sotho (5), Lala (South Africa) (1), Proto-Nguni (1) |
Counter | 303 | OK | Japanese (195), Korean (80), Okinawan (9), Jeju (5), Uyghur (4), Nuosu (3), Middle Korean (2), Nepali (2), Yucatec Maya (2), Classical Nahuatl (1) |
Reconstruction | 200 | Unapproved | Proto-Uralic (48), Proto-Indo-European (40), Proto-Mongolic (20), Proto-Japonic (18), Proto-Finnic (10), Tocharian B (10), Proto-Turkic (8), Proto-Georgian-Zan (7), Proto-Kartvelian (6), Proto-Ryukyuan (6), Proto-Slavic (6), and more |
Interfix | 191 | OK | English (35), Finnish (15), German (10), Norwegian Nynorsk (8), Dutch (6), French (6), Hungarian (6), Polish (5), Swedish (5), Navajo (4), Portuguese (4), and more |
Combining form | 159 | OK | Russian (83), Japanese (32), Ojibwe (12), Ancient Greek (11), Ukrainian (8), Vietnamese (6), Miyako (3), English (2), Estonian (2) |
Infix | 156 | OK | Swahili (48), English (47), Chichewa (8), Malay (5), Indonesian (4), French (3), Chinese (2), Hawaiian (2), Old Javanese (2), Portuguese (2), Serbo-Croatian (2), and more |
Preverb | 129 | Unapproved | Ojibwe (50), Georgian (23), Laz (11), Munsee (11), Unami (11), Ottawa (7), Aghwan (4), Wiyot (3), Chickasaw (2), Warlpiri (2), Eastern Ojibwa (1), and more |
Circumfix | 124 | OK | Tagalog (30), Indonesian (22), Georgian (13), Dutch (9), Ye'kwana (8), German (5), Malay (5), English (4), Guaraní (4), Kangean (3), Laz (3), and more |
Transliteration | 123 | Unapproved | Hittite (123) |
Adnominal | 108 | Unapproved | Japanese (90), Old Japanese (17), Okinawan (1) |
Final | 101 | Unapproved | Ojibwe (87), Ottawa (11), Blackfoot (2), Munsee (1) |
Stem | 94 | Unapproved | Navajo (90), Tuvan (3), Indonesian (1) |
Phoneme | 79 | Unapproved | Northern Qiandong Miao (40), Chinese (39) |
Initial | 78 | Unapproved | Ojibwe (68), Ottawa (10) |
Chữ Hán | 61 | Unapproved - should be "Han character" | Vietnamese (61) |
Dependent noun | 48 | Arguably disallowed | Korean (40), Malecite-Passamaquoddy (3), Middle Korean (2), Burmese (1), Early Modern Korean (1), Jeju (1) |
Circumposition | 36 | OK | Northern Kurdish (16), Dutch (11), Chinese (3), German (2), Pashto (2), Central Kurdish (1), Danish (1) |
Component | 32 | Unapproved | Tangut (32) |
Determinative | 27 | OK | Akkadian (18), Sumerian (9) |
Verb Root | 25 | Unapproved | Unami (15), Cebuano (10) |
Iteration mark | 24 | Unapproved | Japanese (11), Chinese (3), Egyptian (1), Indonesian (1), Khmer (1), Lao (1), Malay (1), Nuosu (1), Tagalog (1), Tangut (1), Thai (1), and more |
Medial | 24 | Unapproved | Ojibwe (23), Blackfoot (1) |
Clitic | 23 | Explicitly disallowed | Afar (7), Mongolian (7), Proto-Dravidian (2), Amharic (1), Czech (1), Finnish (1), Mohawk (1), Onondaga (1), Swedish (1), Tzotzil (1) |
Phonogram | 22 | Unapproved | Old Japanese (11), Old Korean (11) |
Compound part | 16 | Unapproved | Vietnamese (16) |
Prenoun | 16 | Unapproved | Munsee (9), Ottawa (4), Menominee (1), Ojibwe (1), Unami (1) |
Enclitic | 12 | Arguably disallowed - "Clitic" | Greenlandic (8), Makasar (3), Marshallese (1) |
Demonstrative | 10 | Unapproved | Mokilese (9), Pa'o Karen (1) |
Adjective form | 9 | Explicitly disallowed - "(POS) form" | Korean (5), Pali (3), Japanese (1) |
Composition | 8 | Unapproved | Translingual (8) |
Multiple parts of speech | 8 | Unapproved | English (8) |
Past participle | 7 | Explicitly disallowed - "(attribute) (POS)" | Fala (5), Assyrian Neo-Aramaic (2) |
Ambiposition | 6 | OK | Northern Sami (4), Kildin Sami (1), Sanskrit (1) |
Noun Root | 6 | Unapproved | Unami (6) |
Word | 6 | Unapproved | Lithuanian (5), Celtiberian (1) |
Arabicization | 3 | Unapproved | Javanese (3) |
Misspelling | 3 | Unapproved | Finnish (1), German (1), Portuguese (1) |
Abbreviation | 2 | Explicitly disallowed | Dutch (1), Ottoman Turkish (1) |
Adverbial phrase | 2 | Explicitly disallowed - "(attribute) (POS)" | Galician (1), Yakut (1) |
Gender classifier | 2 | Explicitly disallowed - "(attribute) (POS)" | Hiw (2) |
Interrogative pronoun | 2 | Explicitly disallowed - "(attribute) (POS)" | Ahom (1), Shan (1) |
Modifier | 2 | Unapproved | Maori (2) |
Nominal | 2 | Unapproved | Nhanda (2) |
Onomatopoeia | 2 | Unapproved | Korean (1), Yakut (1) |
Simulfix | 2 | Unapproved | Old Irish (1), Translingual (1) |
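
The per-language counts above could be reproduced roughly as follows. This is a minimal sketch, not the script actually used for the table: it assumes each page is available as a wikitext string, that language sections are level-2 headings and everything below them is a candidate header, and that a small hand-written `NON_POS` set (hypothetical and incomplete) filters out non-POS headers. Weeding out misspelled headers is left to manual review.

```python
import re
from collections import Counter, defaultdict

# Matches a wikitext heading line such as "==English==" or "===Noun===".
HEADING_RE = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")

# Illustrative, incomplete set of non-POS headers to ignore (an assumption).
NON_POS = frozenset({"Etymology", "Pronunciation", "References", "See also"})


def count_headers(pages, skip=NON_POS):
    """Tally headers below each language section.

    `pages` is an iterable of wikitext strings. Returns a dict mapping
    header name -> Counter of language names. Headers that occur only
    once overall are dropped, as in the table's exclusion rule.
    """
    counts = defaultdict(Counter)
    for text in pages:
        language = None
        for line in text.splitlines():
            match = HEADING_RE.match(line.strip())
            if not match:
                continue
            level, name = len(match.group(1)), match.group(2)
            if level == 2:
                language = name  # start of a new language section
            elif language is not None and name not in skip:
                counts[name][language] += 1
    return {h: c for h, c in counts.items() if sum(c.values()) > 1}


# Tiny made-up input: "Verb" occurs once and is therefore dropped.
pages = [
    "==English==\n===Etymology===\n===Noun===\n# ...\n===Verb===\n# ...",
    "==French==\n===Noun===\n# ...",
]
stats = count_headers(pages)
```

Calling `stats["Noun"].most_common(11)` would then give the language breakdown for one row.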
Issues needing sorting out
- Definitions header is not approved. Clearly it should be. It has long been used in Chinese entries.
- Is Combining form required? These all look able to be classified as prefixes or suffixes.
  - Most combining forms are Russian, where they really look like verb-forming suffixes. Need to find out what term is used in the literature for these.
  - See Cat:Combining forms by language (which is curiously missing the Russian ones)
- Cuneiform sign - Letter and Symbol don't fit, so I think we should explicitly allow it, just like we allow Han character.
  - Or is this supposed to be under Logogram? I'm not sure that I understand the difference. Note that we don't use Ideogram for Han characters.
- Verbal noun should be disallowed. This is currently used in Kartvelian languages. However, Noun is sufficient, considering that these entries are already shown to be verbal nouns via the form-of template {{verbal noun of}}. Compare Cat:Arabic verbal nouns, which use the Noun POS.
- Predicative and Relative; Stem; Phoneme - to check
- Transliteration - well, these are all Romanizations currently. But then we have Arabicization appearing for a handful of Javanese entries...
- A lot of languages - Algonquian and Kartvelian and probably others - seem to want Preverb. Maybe needs to be added.
- Special POS used by certain languages:
  - Cat:Algonquian languages: Initial, Final, Medial, Prenoun
  - Cat:Japonic languages: Adnominal
Draft
Part of speech
The part of speech (POS) is a descriptor like “Noun” or “Adjective” that defines the class of every term, phrase, symbol, morpheme and other lexical unit. [Because Wiktionary covers all the world's languages, it is necessary to construe the term "part of speech" somewhat more broadly than the traditional interpretation.]
Each entry has one or more POS sections. In each, there is a headword line, followed by the definitions themselves.[1]
[Only POS headers approved by the community may be used.]
The following POS headers are approved for use in any entry (although not all of these headers are applicable to every language):
- Parts of speech: Adjective, Adverb, Ambiposition, Article, Circumposition, Classifier, Conjunction, Contraction, Counter, Determiner, Ideophone, Interjection, Noun, Numeral, Participle, Particle, Postposition, Preposition, Pronoun, Proper noun, Verb
- Morphemes: Circumfix, Combining form, Infix, Interfix, Prefix, Root, Suffix
- Symbols and characters: Diacritical mark, Letter, Ligature, Number, Punctuation mark, Syllable, Symbol
- Phrases: Phrase, Prepositional phrase[2], Proverb
- Han characters: Hanzi (Sinitic languages), Kanji (Japanese), Hanja (Korean), Han character (all others including Translingual)
- Romanization
- Cuneiform-specific: Logogram, Determinative
The Definitions POS header is allowed in Chinese and Tangut entries. Terms in these languages are often highly polysemic and there is no inflection information to provide, so a strict distinction between parts of speech is counterproductive. The Definitions header may also be used in exceptional circumstances where the POS is unknown (for instance, terms whose meaning is highly uncertain).
Other headers can be proposed as new additions to the list. The use of nonstandard POS headers may cause an entry to be categorized in a cleanup category for further inspection.
The following POS headers are explicitly disallowed:
- Abbreviation, Acronym, Initialism – use the POS header(s) corresponding to the function of the term (for instance, English ATM uses the "Noun" POS header)
- Clitic
- Gerund
- Idiom
- “(POS) form”: Verb form, Noun form, etc. – these POSs are allowed and routinely used as a parameter to the {{head}} template, but are not allowed as headers
- “(POS) phrase”: Noun phrase, Verb phrase, etc. (with the exception of Prepositional phrase)
- “(attribute) (POS)”: Transitive verb, Personal pronoun, Verbal noun, etc. (with the exception of Proper noun)
- “(POS) (number)”: Noun 1, Noun 2, etc.
- Cardinal number, Ordinal number, Cardinal numeral, Ordinal numeral
  - Ordinal numbers like first are classified as adjectives and are placed in Category:English ordinal numbers by a template. Fractions like seven eighths are nouns, and are added to Category:English fractional numbers by a template.
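
The approved and disallowed lists in this draft could be checked mechanically along these lines. This is a rough sketch under stated assumptions, not an existing Wiktionary tool: the function name `classify`, the status strings (modelled on the "EL status" column above), and the pattern heuristics are all illustrative. In particular, "(attribute) (POS)" is approximated as "last word is a lowercase approved header", which will not always agree with a human reading, and any use of Definitions outside Chinese and Tangut is reported as unapproved because the "exceptional circumstances" case needs human judgement.

```python
# Headers approved in the draft list above.
APPROVED = {
    # Parts of speech
    "Adjective", "Adverb", "Ambiposition", "Article", "Circumposition",
    "Classifier", "Conjunction", "Contraction", "Counter", "Determiner",
    "Ideophone", "Interjection", "Noun", "Numeral", "Participle", "Particle",
    "Postposition", "Preposition", "Pronoun", "Proper noun", "Verb",
    # Morphemes
    "Circumfix", "Combining form", "Infix", "Interfix", "Prefix", "Root",
    "Suffix",
    # Symbols and characters
    "Diacritical mark", "Letter", "Ligature", "Number", "Punctuation mark",
    "Syllable", "Symbol",
    # Phrases
    "Phrase", "Prepositional phrase", "Proverb",
    # Han characters
    "Hanzi", "Kanji", "Hanja", "Han character",
    # Romanization and cuneiform-specific headers
    "Romanization", "Logogram", "Determinative",
}

DEFINITIONS_LANGUAGES = {"Chinese", "Tangut"}

DISALLOWED_EXACT = {
    "Abbreviation", "Acronym", "Initialism", "Clitic", "Gerund", "Idiom",
    "Cardinal number", "Ordinal number", "Cardinal numeral", "Ordinal numeral",
}


def classify(header, language=None):
    """Return a status string for a POS header under the draft rules."""
    if header in APPROVED:  # also covers the named exceptions
        return "OK"
    if header == "Definitions":
        return "OK" if language in DEFINITIONS_LANGUAGES else "Unapproved"
    if header in DISALLOWED_EXACT:
        return "Explicitly disallowed"
    head, sep, last = header.rpartition(" ")
    if sep:
        if head in APPROVED and last == "form":
            return 'Explicitly disallowed - "(POS) form"'
        if head in APPROVED and last == "phrase":
            return 'Explicitly disallowed - "(POS) phrase"'
        if head in APPROVED and last.isdigit():
            return 'Explicitly disallowed - "(POS) (number)"'
        # Heuristic: a lowercase final word that names an approved header.
        if last.islower() and last.capitalize() in APPROVED:
            return 'Explicitly disallowed - "(attribute) (POS)"'
    return "Unapproved"
```

For example, `classify("Adjective form")` falls under the "(POS) form" rule, while `classify("Preverb")` is merely unapproved and would be a candidate for the cleanup category mentioned above.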