Wiktionary:About Chinese/tasks
Below are some of the tasks pertinent to Chinese-language entries on Wiktionary.
Regular patrolling
- Recent changes to Chinese entries (can be customised – hide registered users, hide patrolled edits, etc.):
- {{zh-forms}} or {{zh-see}} linking to nonexistent pages:
Entry maintenance
- Add the template {{CJKV}} where applicable. (Example)
- Entries using {{zh-see}} to link to an entry without a Chinese section:
- Entries using {{zh-see}} to link to another variant character not containing {{zh-pron}}:
- Entries using {{zh-see}} to link to entries without reciprocal mentioning in their {{zh-forms}}:
- Entries using {{zh-forms}} with gloss-less components:
- Special:WhatLinksHere/Template:tracking/zh-forms/no gloss found for Chinese character (character)
- Special:WhatLinksHere/Template:tracking/zh-forms/no gloss found but entry exists (blue-linked multisyllabic)
- Special:WhatLinksHere/Template:tracking/zh-forms/no gloss found with a nonexistent entry (red-linked multisyllabic)
- Expansion and cleanup needed for character pages:
- Details at User:Wyang/char-summary and User:Justinrleung/char-summary
- Entries with missing senses:
- Entries with Pinyin different from CC-CEDICT and/or Dictionary by the Ministry of Education of Taiwan:
- Entries with missing Cantonese readings, sorted by importance:
- Compounds with Mandarin or Cantonese character readings not found in the individual character pages:
- Cantonese: User:Wyang/check-yue-pron, search
- Compounds (ABCD) not mentioned in derived terms on the component pages (AB and CD)
- Special:WhatLinksHere/Template:tracking/zh-forms/compounds not mentioned in derived terms on the component pages (for component with more than one character)
- {{zh-pron}} usage missing POS
- Entries with multiple/problematic Hokkien readings lacking location labels
- Translations into Chinese to be checked (split by topolect) - Mandarin, Cantonese, etc.
- List the routinely consulted dictionaries under Wiktionary:About Chinese/references.
- Clearing Han character categories for individual lects.
- Characters with an mc/oc parameter but no data to show (e.g. 砼; such characters are incorrectly categorized into Category:Middle Chinese lemmas and Category:Old Chinese lemmas)
- Invalid syllables in Module:zh/data/yue-word
- Characters with translingual definitions
- Entries lacking a {{zh-dial}} template
- Characters with specific part of speech headers
- Entries using {{zh-forms}} with missing forms:
- Compare Zhengzhang's reconstructions in the second edition of his book (2013) with those in the first edition (2003) and update the data modules (very minor differences from a quick look)
Done but need to be regularly checked
- Done (24 Jun 2018) Entries with zh-pron but no definition section (e.g. 蒦).
- Done (24 Jun 2018) "Derived terms" L3 heading fix, on entries without a second PoS
- Done (24 Jun 2018) Entries with {{wikipedia}} instead of {{zh-wp}}
Entry creation
- Entries from Modern Standard Chinese Dictionary (现代汉语规范词典) missing on Wiktionary:
- CEDICT entries missing on Wiktionary:
- Chinese entries to add, by Tooironic:
- Missing chengyu and suyu:
- Missing entries with multiple topolect readings:
- Requested entries on Wiktionary (note many of these are rare and low-priority):
- Red-linked Chinese terms in existing translations:
- Appendix:Mandarin Frequency lists (the appendices require some attention - duplicates, definitions and pinyin among others):
- Wu Chinese requiring transliterations (this is a request list; it is important to add those to actual Chinese entries if the Wu readings are valid and applicable):
Technical
- Fix the disappearance of spaces in Pinyin after erhua or toneless-ification in {{zh-pron}}.
- {{zh-pron}} argument and layout redesign - e.g. |m=xiǎoháiR, |m=lìdù_, |m=Ōu-^亞, |mn=qz:ge̍rh{白}/qz:goa̍t{文}, Quanzhou: ge̍rh /ɡəʔ²⁴/ {白}, goa̍t /ɡuat̚²⁴/ {文}.
- Add phonetic bopomofo, e.g. "ㄑㄧㄥˊ ㄅㄨˋ ㄗˋ ㄐㄧㄣ [Phonetic: ㄑㄧㄥˊ ㄅㄨˊ ㄗˋ ㄐㄧㄣ]" for 情不自禁.
- Redesign modules for dialectal pronunciation table for single-character entries, e.g. more flexible entry of pronunciations, no display of location if there is no pronunciation for that location, incorporation of Module:zh/data/dial.
- Entry header level check, best if automatic whenever there is a ==Chinese== section. Headers should have defined relative levels.
- List Cantonese characters with only one pronunciation, and generate a list of compound pronunciations for entries making use of these characters which are currently pronunciation-less. Batch upload after review. Make {{zh-new}} automatically add the Cantonese pronunciation if it is entirely predictable.
- See points #9, 10 in #Entry maintenance above.
- Rewrite Wiktionary:About Chinese/references in Lua, allowing one to locate the (highlighted) specific reference after the click.
- Error check functions in Module:zh-pron and its submodules, detecting errors such as: (1) incorrect diacritic placement in Pinyin; (2) invalid Jyutping syllables; (3) space before and after the parameter names (e.g. | m = xxx); etc.
- General display improvement, more dynamic features (e.g. hover), more JavaScript usage, etc.
- Think about a general JSON-like structure of term data storage and incorporate IDs for senses, instead of the current ad hoc extraction of glosses by {{zh-forms}}.
- Centralise all usage examples and quotations in a backend database and assign each word in the example to a sense in the entry. Make the database queryable. Automatically display examples and quotes on the entry.
- Synonyms, Antonyms, See-also terms and Dial-syn display format redesign, with terms placed directly underneath senses with an improved layout, perhaps with drop-downs similar to quotations ▼.
- IPA for the sub-dialects of Teochew: Bangkok, Cambodia, Chaoyang (潮陽), Chaozhou (潮州), Chenghai (澄海), Huilai (惠來), Jieyang (揭陽), Kalimantan, Nan'ao County (南澳), Puning (普寧), Raoping (饒平), Shantou (汕頭).
- Support for more sub-dialects of the Southern group of Eastern Min: Fuqing dialect, Gutian dialect.
- Support for more Northern Wu lects (with use of Wugniu): Suzhounese, Ningbonese, Hangzhounese?
- Support for more lects, including:
- Dialects with known transliteration systems
- Southern Wu (Wugniu or otherwise -- Wenzhounese - Wiktionary:About_Chinese/Wenzhounese; maybe "w-o" or "w-w" for the parameter?)
- Northeastern Mandarin (Harbin; maybe "m-h" or "m-ne" as a parameter)
- Hainanese Min (Hainan Romanized or Hainanese Pinyin; "mh" for Wenchang and "mh-h" for Haikou, for the parameter?)
- Pu-Xian Min (cpx or mp; see Hinghwa Romanized)
- Taiwanese Hakka (maybe "h-t"; Hailu, Dabu, Raoping, Zhao'an) - Taiwanese Hakka Romanization System
- Shaozhou Tuhua ("sz" or "st"; See this document from Unicode)
- Lower Yangtze Mandarin (Yangzhou; see w:zh:扬州话拉丁化字母表; "m-y"?; Nanjing; see w:zh:南京話拉丁化方案; "m-l" or "m-n"?)
- Other Christian-Romanized lects: Ningbo Wu, Hangzhou Wu, Taizhou Wu, Jinhua Wu, etc. see w:zh:教會羅馬字
- Dialects without known transliteration systems
- Hailufeng Min (maybe list it, separately, as "mn-h", with "sw" for Shanwei, "hf" for Haifeng, "lf" for Lufeng)
- Huizhou (or simply Hui, with either "hz" or "hu" for the parameter respectively)
- Mai Chinese ("ma"?)
- Danzhou Chinese ("dz"?)
- Datian Min (maybe "mdt/m-dt"?)
- Shao-Jiang Min (maybe "sm"?)
- Central Min (most certainly "mz" as a parameter; an in-house transliteration system for Central Min could be similar to the one for Northern Min)
- Zhenan Min
- Zhongshan Min
- Pinghua ("p"?)
- Waxiang Chinese (wxa)
- Linking {{zh-see}} to relevant etymology/pronunciation sections instead of default Chinese section (when required). See Talk:攰 for more.
- Automatic pinyin transliteration of compounds with different pronunciations in mainland China and Taiwan based on Module:zh/data/cmn-tag. See Talk:擁 for more.
- Inclusion of compound words from 《兩岸萌典》 when {{subst:zh-new/der}} is invoked. The current output from 《萌典》 contains a significant number of classical Chinese compounds and lacks certain terms commonly used in vernacular speech. Compare 崗(兩岸萌典) and 崗(萌典)
- See this revision for an example.
- Done Adding Tongyong Pinyin to zh-pron (First attempt [1])
- Support for Old National Pronunciation.
- Existing simplified forms that are not using newly-encoded characters (example: 牛犅萤→牛𰠫萤).
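
The three error checks named in the Module:zh-pron item above (Pinyin diacritic placement, Jyutping syllable validity, spaces around parameter names) could be prototyped offline before being ported to Lua. The Python sketch below is only an illustration under stated assumptions. All function names are hypothetical. The Jyutping finals are a hand-typed inventory, not the data used by Module:zh-pron. The Pinyin check handles one syllable at a time and applies the usual placement rule (a or e first, then the o of "ou", otherwise the last vowel).

```python
import re
import unicodedata

# Combining macron, acute, caron and grave: the four Pinyin tone marks.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}
DIAERESIS = "\u0308"
VOWELS = "aeiouü"


def expected_tone_index(base):
    """Index of the letter that should carry the tone mark (None if no vowel)."""
    for vowel in "ae":
        if vowel in base:
            return base.index(vowel)
    if "ou" in base:
        return base.index("o")
    positions = [i for i, ch in enumerate(base) if ch in VOWELS]
    return positions[-1] if positions else None


def pinyin_tone_misplaced(syllable):
    """True if a single Pinyin syllable has its tone mark on the wrong letter."""
    base, marked = [], None
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            marked = len(base) - 1      # the mark sits on the previous letter
        elif ch == DIAERESIS and base:
            base[-1] = "ü"              # fold u + diaeresis back into ü
        elif not unicodedata.combining(ch):
            base.append(ch)
    expected = expected_tone_index("".join(base))
    if marked is None or expected is None:
        return False                    # toneless or vowel-less: not checked
    return marked != expected


# Assumption: illustrative Jyutping inventory typed by hand for this sketch.
JYUTPING_INITIALS = ("gw", "kw", "ng", "b", "p", "m", "f", "d", "t", "n", "l",
                     "g", "k", "h", "w", "z", "c", "s", "j", "")
JYUTPING_FINALS = set(
    "aa aai aau aam aan aang aap aat aak a ai au am an ang ap at ak "
    "e ei eu em eng ep ek i iu im in ing ip it ik o oi ou on ong ot ok "
    "u ui un ung ut uk eoi eon eot oe oeng oet oek yu yun yut".split()
)
JYUTPING_SYLLABIC = {"m", "ng", "hm", "hng"}


def jyutping_valid(syllable):
    """True if the syllable is initial + final + tone digit 1-6."""
    match = re.fullmatch(r"([a-z]+)([1-6])", syllable)
    if not match:
        return False
    body = match.group(1)
    if body in JYUTPING_SYLLABIC:
        return True
    return any(body.startswith(initial) and body[len(initial):] in JYUTPING_FINALS
               for initial in JYUTPING_INITIALS)


PARAM_SPACING = re.compile(r"\|([ \t]*)([\w-]+)([ \t]*)=")


def spaced_params(wikitext):
    """Names of template parameters written with spaces around the name."""
    return [m.group(2) for m in PARAM_SPACING.finditer(wikitext)
            if m.group(1) or m.group(3)]


if __name__ == "__main__":
    print(pinyin_tone_misplaced("gùi"))
    print(jyutping_valid("gwong2"))
    print(spaced_params("{{zh-pron| m = xiǎohái|c=siu2 haai4}}"))
```

Each check returns plain data (a boolean or a list of parameter names) rather than raising an error, so the results could feed a tracking category or a maintenance list in the same way as the existing tracking templates. Splitting multisyllabic Pinyin into syllables is left out of this sketch.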