Appendix:Strange CJKV characters

From Wiktionary, the free dictionary

This appendix lists "strange" characters as defined in the Unihan database and this document. The list reflects version 2, in use since 2023. The characters are split into 12 categories.

Category A: asymmetric

In this category are characters that would be expected to be symmetric, or to have a symmetric part, but are not.

Block Code point Ideograph Symmetric equivalent of the character / part (unofficial)
B U+2074E 𠝎
U+21CFF 𡳿
U+24C03 𤰃
U+26A03 𦨃 𦨄
U+26B69 𦭩
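The Block column used in the tables of this appendix follows from the code point alone. As a minimal sketch, assuming the labels URO and A to H stand for the CJK Unified Ideographs block and its Extensions A to H, the lookup could be written as follows. The names `CJK_BLOCKS`, `parse_code_point` and `block_of` are illustrative and not part of any library.

```python
from typing import Optional

# Block labels as used in this appendix, with inclusive code point ranges.
# Assumption: the labels map onto the Unicode blocks "CJK Unified Ideographs"
# (URO) and "CJK Unified Ideographs Extension A" to "Extension H".
CJK_BLOCKS = [
    ("URO", 0x4E00, 0x9FFF),
    ("A", 0x3400, 0x4DBF),
    ("B", 0x20000, 0x2A6DF),
    ("C", 0x2A700, 0x2B73F),
    ("D", 0x2B740, 0x2B81F),
    ("E", 0x2B820, 0x2CEAF),
    ("F", 0x2CEB0, 0x2EBEF),
    ("G", 0x30000, 0x3134F),
    ("H", 0x31350, 0x323AF),
]


def parse_code_point(label: str) -> int:
    """Turn a 'U+2074E' style label into an integer code point."""
    if not label.upper().startswith("U+"):
        raise ValueError(f"not a code point label: {label!r}")
    return int(label[2:], 16)


def block_of(label: str) -> Optional[str]:
    """Return the block label for a code point, or None if it is outside the list."""
    cp = parse_code_point(label)
    for name, first, last in CJK_BLOCKS:
        if first <= cp <= last:
            return name
    return None


if __name__ == "__main__":
    # A few code points taken from the tables in this appendix.
    for label in ("U+4E02", "U+3405", "U+2074E", "U+3018A"):
        print(label, chr(parse_code_point(label)), block_of(label))
```

Because the ranges do not overlap, the order of the entries does not matter; a code point outside all of them simply yields `None`.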

Category B: Bopomofo

In this category are characters which look similar to Bopomofo / Zhuyin characters.

Block Code point Ideograph Bopomofo equivalent Phonetic equivalent
URO U+4E02 U+310E IPA(key):
U+4E05 U+3112 IPA(key): ɕ
U+4E29 U+3110 IPA(key):
U+4E2B U+311A IPA(key): a
U+4E40 U+311F IPA(key): ei
U+4E5C U+31A4 IPA(key): e
U+5DDC U+310D IPA(key): k
U+5E00 U+312D IPA(key): ɻ̩~ʐ̩, ɹ̩~z̩
A U+3405 U+3128 IPA(key): u
U+37A2 U+3113 IPA(key): ʈʂ
B U+20000 𠀀 U+311B IPA(key): o
U+20005 𠀅 U+311E IPA(key): ai
U+200CA 𠃊 U+31B9
U+200CB 𠃋 U+3125 IPA(key): əŋ
U+200D2 𠃒 U+311D IPA(key): e
U+2010E 𠄎 U+310B IPA(key): n
U+206A3 𠚣 U+3109 IPA(key): t
U+20AD3 𠫓 U+310A IPA(key):
U+20B9A 𠮚 U+3116 IPA(key): ɻ~ʐ
U+21C23 𡰣 U+3115 IPA(key): ʂ
U+21FE8 𡿨 U+3111 IPA(key): tɕʰ
G U+3018A 𰆊 U+3117 IPA(key): ts

Category C: Cursive

In this category are characters with cursive components, i.e. they include a stroke that is not one of the standard ones.

Block Code point Ideograph Cursive part
URO U+4E44 The ⿾㇂ stroke.
B U+201AD 𠆭 The bottom stroke.
U+201C7 𠇇 Right vertical self-intersecting stroke.
U+2034B 𠍋 The bottom self-intersecting stroke of the middle component.
U+20AB3 𠪳 The bottom self-intersecting stroke.
U+211A2 𡆢 The enclosed self-intersecting stroke.
U+219B9 𡦹 The inner twice self-intersecting stroke.
U+219D1 𡧑 The middle self-intersecting stroke.
U+22013 𢀓 The curly stroke intersecting the vertical line.
U+26E57 𦹗 The left bottom self-intersecting stroke.
F U+2CEF7 𬻷 The ㇔ stroke with a small hook.
U+2CEFF 𬻿 The bottom stroke.
U+2CF02 𬼂 The bottom tilde-like stroke.
U+2D047 𭁇 The intersecting stroke.
U+2D143 𭅃 Two small strokes in the top left with small hooks at the ends.
U+2D37B 𭍻 The right crescent-like stroke.
U+2D44A 𭑊 The ㇔ stroke with a small hook.
U+2D6A5 𭚥 The bottom stroke is an erroneous form of the stroke ㇡, which may be "de-cursivized" in some fonts (e.g. Babelstone).
U+2D92A 𭤪 According to the official chart, the whole character is cursive, but the right part is often straightened to , while the left (originally derived from ) is cursive.
U+2D95F 𭥟 While the left component could be realized as ㇌, the right component is always cursive.
U+2E4E0 𮓠 The ㇔ stroke with a small hook.
U+2E979 𮥹 The two ㇔ strokes with a small hook.

Category F: Fully reflective

In this category are characters that have components that are mirrored. The standard includes a reference, where the repeated component is not mirrored.

Block Code point Ideograph Reference Reflected component (unofficial)
URO U+56CD
U+71DB
U+81E6 𦣦
U+81E9
B U+21155 𡅕 𠶮
U+221D6 𢇖
U+22374 𢍴 𦣞
U+223FD 𢏽 𢎥
U+23960 𣥠 𣥖 𱤻,
U+244EB 𤓫 𨸏
U+24570 𤕰 ,
U+249A1 𤦡
U+268E9 𦣩 𦣦
U+286DC 𨛜 𨙨,
U+287A0 𨞠 𨙨,
U+28944 𨥄 ,
U+28CC8 𨳈 𠃛, 𠁣
U+28E85 𨺅 𨸏
U+28F31 𨼱 𨸏
U+28F44 𨽄 𨸏
U+28F5D 𨽝 𨸏
U+28F61 𨽡 𨸏
U+28F69 𨽩 𨸏
U+28F74 𨽴 𨸏
U+28F75 𨽵 𨸏
E U+2B935 𫤵 ,
U+2BA23 𫨣 ㇯髙口
U+2BC92 𫲒 ㇯髙口
U+2BE2A 𫸪
U+2C1BB 𬆻 ㇯髙口
U+2C30B 𬌋 ,
U+2C6FF 𬛿 ,
U+2CD18 𬴘 ㇯髙口
U+2CD1C 𬴜 ㇯髙口
U+2CD20 𬴠 ㇯髙口
U+2CD21 𬴡 ㇯髙口
U+2CD22 𬴢 ㇯髙口
U+2CD23 𬴣 ㇯髙口
U+2CD24 𬴤 ㇯髙口
U+2CD25 𬴥 ㇯髙口
U+2CD26 𬴦 ㇯髙口
F U+2D5B2 𭖲 〾口
G U+31044 𱁄 𨸏
H U+3156D 𱕭
U+31879 𱡹
U+31B31 𱬱 𠂤
U+321A1 𲆡 〾邑

Category H: Hangul component

In this category are characters that have components derived from Hangul.

Block Code point Ideograph Hangul component
URO U+5DEA
A U+3514
U+3516
U+3AB2
U+3AB3
U+3AC7
U+3AC8
U+439E
B U+200CD 𠃍
C U+2A8B3 𪢳
U+2A941 𪥁
F U+2D03B 𭀻
U+2D1BE 𭆾
U+2D81A 𭠚
U+2D939 𭤹
U+2D94B 𭥋
U+2DA58 𭩘
U+2E78C 𪥁
G U+301C8 𰇈
U+30255 𰉕
U+30481 𰒁
U+30912 𰤒
U+30BEE 𰯮
U+30C2F 𰰯
U+30D97 𰶗
U+30F18 𰼘

Category I: Incomplete

In this category are characters which look like they are missing some strokes.

Block Code point Ideograph Full character (reference)
URO U+4E04
U+4E05
U+4E52
U+4E53
U+4FAA
U+5187
U+5242
U+54DC
U+56EC
U+5B52
U+5DDC
U+6324
U+66F1
U+6D4E
U+7534
U+8002
U+8080
U+8110 𦜝
U+8360
U+86F4
U+8DFB 𨂋
U+9701 𬰁
U+9F50
U+9FB0
A U+382A
U+39B0
U+3C50
U+4AA3
B U+20016 𠀖
U+20017 𠀗
U+2002A 𠀪
U+2002B 𠀫
U+20035 𠀵
U+2003D 𠀽
U+20063 𠁣
U+20064 𠁤 西
U+20080 𠂀
U+20092 𠂒
U+20099 𠂙
U+2009A 𠂚
U+2009B 𠂛
U+200B3 𠂳 𥾜
U+200CF 𠃏
U+200D2 𠃒
U+200DB 𠃛
U+20118 𠄘
U+20119 𠄙
U+20149 𠅉
U+2015B 𠅛
U+2017E 𠅾
U+201B1 𠆱 𬽫
U+2053E 𠔾
U+20546 𠕆
U+20936 𠤶
U+209B1 𠦱
U+20A64 𠩤
U+20AB1 𠪱 𠪾
U+20B0A 𠬊
U+20B35 𠬵 𥸩
U+20B6B 𠭫
U+20EDA 𠻚
U+20F28 𠼨
U+2115B 𡅛
U+21245 𡉅
U+21246 𡉆
U+213CE 𡏎
U+21428 𡐨
U+2151C 𡔜
U+21556 𡕖
U+216F7 𡛷 𪥱
U+219AD 𡦭
U+219D8 𡧘
U+21C23 𡰣
U+22053 𢁓
U+22064 𢁤
U+220FB 𢃻
U+2218D 𢆍
U+221AF 𢆯
U+221BD 𢆽
U+221CA 𢇊
U+221CB 𢇋
U+22324 𢌤
U+2239E 𢎞
U+223B1 𢎱 𢎨
U+223BA 𢎺 𢏛
U+223C0 𢏀
U+224B4 𢒴
U+2267B 𢙻
U+22779 𢝹 𢞖
U+22868 𢡨
U+22877 𢡷
U+22994 𢦔
U+22AC2 𢫂
U+22BBD 𢮽
U+23D11 𣴑
U+24993 𤦓 𤨎
U+225A9 𢖩
U+22606 𢘆
U+22634 𢘴
U+226C0 𢛀
U+2298F 𢦏
U+22998 𢦘
U+22A6F 𢩯 𡥅
U+22AA2 𢪢
U+22ACE 𢫎 𢬘
U+22B61 𢭡
U+22F0B 𢼋
U+2314A 𣅊
U+23150 𣅐
U+23169 𣅩
U+231B9 𣆹
U+231D3 𣇓
U+232B3 𣊳
U+23332 𣌲
U+233C1 𣏁
U+233CA 𣏊
U+23408 𣐈
U+2347D 𣑽
U+23652 𣙒 𪔇
U+236D0 𣛐
U+23943 𣥃
U+239E8 𣧨 𣨁
U+23C16 𣰖 𣰚
U+23C71 𣱱
U+23C96 𣲖
U+23D2B 𣴫
U+23D49 𣵉
U+23DD2 𣷒
U+2404E 𤁎
U+24121 𤄡
U+241D7 𤇗 𤈖
U+2435E 𤍞
U+2437B 𤍻
U+2447B 𤑻
U+244F0 𤓰
U+24642 𤙂
U+248E5 𤣥
U+248E6 𤣦
U+24ADD 𤫝
U+24E48 𤹈
U+24F3D 𤼽 ,
U+24F6A 𤽪
U+2506B 𥁫
U+25100 𥄀
U+25186 𥆆 𥆨
U+2549B 𥒛
U+256C4 𥛄
U+2574C 𥝌
U+25844 𥡄
U+2584C 𥡌
U+25952 𥥒
U+25A5A 𥩚
U+25A88 𥪈
U+25B7B 𥭻
U+25CC1 𥳁 𥲤
U+25D6F 𥵯
U+25F26 𥼦
U+25F94 𥾔
U+25FAC 𥾬
U+25FAD 𥾭
U+26165 𦅥
U+2626A 𦉪
U+2626B 𦉫
U+26285 𦊅
U+26316 𦌖
U+26356 𦍖
U+26419 𦐙
U+264D0 𦓐
U+26541 𦕁
U+265C9 𦗉 𬚦
U+26612 𦘒
U+2664D 𦙍
U+26738 𦜸
U+268FA 𦣺
U+26965 𦥥
U+26A0A 𦨊 𦨎
U+26AF7 𦫷
U+26B34 𦬴
U+26B44 𦭄
U+26B71 𦭱
U+26B81 𦮁 𦰙
U+26B9F 𦮟
U+26BFA 𦯺
U+26BFE 𦯾
U+26C18 𦰘
U+26C66 𦱦
U+26DC3 𦷃
U+27268 𧉨
U+27475 𧑵
U+27538 𧔸
U+275EE 𧗮
U+27607 𧘇
U+2761B 𧘛
U+27825 𧠥 𧠵
U+278D8 𧣘
U+2795B 𧥛
U+2795C 𧥜
U+27968 𧥨
U+2796A 𧥪
U+279B6 𧦶
U+279DF 𧧟
U+27A25 𧨥
U+27C1E 𧰞
U+27C27 𧰧
U+27C28 𧰨
U+27E90 𧺐 𧺙
U+27FCB 𧿋
U+28013 𨀓
U+28029 𨀩
U+28210 𨈐
U+28211 𨈑
U+2844E 𨑎 𨑾
U+28498 𨒘 𨒪
U+284B5 𨒵
U+28538 𨔸
U+28828 𨠨
U+28925 𨤥
U+2895D 𨥝
U+28973 𨥳 𨦉
U+28979 𨥹
U+28C06 𨰆
U+28CC7 𨳇
U+28D8D 𨶍
U+28E8A 𨺊
U+28EDB 𨻛
U+2907E 𩁾
U+2909A 𩂚
U+2928B 𩊋
U+2944E 𩑎
U+2947F 𩑿 𩒕
U+2948B 𩒋 𩒕
U+296D6 𩛖
U+29C0A 𩰊
U+29C0B 𩰋
U+29C1C 𩰜 𩰟
U+29C86 𩲆
U+29C87 𩲇
U+29D2B 𩴫 𫙎
U+29D30 𩴰
U+2A544 𪕄
C U+2AA72 𪩲
U+2AC8E 𪲎
U+2B145 𫅅
D U+2B740 𫝀
U+2B744 𫝄
U+2B7A6 𫞦
E U+2B820 𫠠 ,
U+2B829 𫠩
U+2B84F 𫡏
U+2B851 𫡑 𠂹
U+2BA51 𫩑 ,
U+2BBDB 𫯛 ,
U+2BE8A 𫺊 𭝚
U+2BFED 𫿭
U+2C09B 𬂛
U+2C0C7 𬃇
U+2C52D 𬔭
U+2C889 𬢉
U+2CBC0 𬯀 𫕅
U+2CE3E 𬸾
F U+2CEB0 𬺰
U+2CEB1 𬺱
U+2CEB7 𬺷
U+2CEBB 𬺻
U+2CECC 𬻌
U+2CECD 𬻍
U+2CF2B 𬼫
U+2CF3D 𬼽
U+2D0B8 𭂸
U+2D110 𭄐
U+2D1B1 𭆱
U+2D1C1 𭇁
U+2D57F 𭕿
U+2D6A0 𭚠
U+2D80D 𭠍
U+2D928 𭤨
U+2D95D 𭥝
U+2DA72 𭩲
U+2DA97 𭪗
U+2DC3E 𭰾
U+2DEBD 𭺽
U+2DF9B 𭾛
U+2E39B 𮎛
G U+30006 𰀆
U+3002A 𰀪
U+300E6 𰃦
U+30333 𰌳
U+30367 𰍧
U+30368 𰍨
U+304A8 𰒨
U+306C4 𰛄
U+306C5 𰛅
U+308B5 𰢵
U+308EC 𰣬
U+30A26 𰨦
U+30B67 𰭧
U+31318 𱌘
H U+31378 𱍸
U+313FF 𱏿 , 𠗃
U+31651 𱙑
U+318D3 𱣓
U+31972 𱥲
U+31993 𱦓
U+31AC5 𱫅
U+31AE3 𱫣
U+31B35 𱬵
U+31C75 𱱵
U+31E41 𱹁
U+320D4 𲃔 𧿇
U+3238A 𲎊

Category K: Katakana component

In this category are characters that include katakana components.

Block Code point Ideograph Katakana letter
B U+211A5 𡆥 U+30C8
U+22016 𢀖 U+30B9
U+282A3 𨊣 U+30C8
C U+2A708 𪜈 U+30E2
D U+2B742 𫝂 U+30C3
E U+2B9A4 𫦤 U+30AB
U+2B9AB 𫦫 U+30AB , U+30CA
U+2BCCD 𫳍 U+30A6 , U+30C3 , U+30DB
U+2C711 𬜑 U+30AB
F U+2CEC0 𬻀 U+30B5
U+2CECB 𬻋 U+30B5
U+2CF00 𬼀 U+30B7 , U+30C6
U+2CF61 𬽡 U+30B9
U+2D580 𭖀 U+30B9
U+2D6DD 𭛝 U+30F1
U+2DF86 𭾆 U+30B1
U+2E307 𮌇 U+30B9
U+2E695 𮚕 U+30B1
H U+31FDD 𱿝 U+30B1
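The Katakana letter column gives only code points. As a small sketch, the standard `unicodedata` module can turn them into character names, which makes the rows easier to read. The dictionary `KATAKANA_ROWS` holds three rows copied from the table above; the helper name `katakana_names` is illustrative.

```python
import unicodedata

# Three rows copied from the table above: ideograph code point -> katakana code points.
KATAKANA_ROWS = {
    "U+211A5": ["U+30C8"],
    "U+2B9AB": ["U+30AB", "U+30CA"],
    "U+2BCCD": ["U+30A6", "U+30C3", "U+30DB"],
}


def katakana_names(labels):
    """Map 'U+30C8' style labels to their Unicode character names."""
    names = []
    for label in labels:
        ch = chr(int(label[2:], 16))  # drop the 'U+' prefix, parse as hex
        names.append(unicodedata.name(ch))
    return names


if __name__ == "__main__":
    for ideograph, letters in KATAKANA_ROWS.items():
        print(ideograph, katakana_names(letters))
```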

Category M: Mirrored

In this category are characters that are mirrored or include mirrored components. The only exception is U+26B62 (𦭢), which is included because its construction gives such an impression.

Block Code point Ideograph Unmirrored version (reference)
URO U+4EFA
U+5350
B U+2009C 𠂜 𠂛
U+20141 𠅁
U+2091C 𠤜 𠤗
U+22044 𢁄
U+23944 𣥄
U+23957 𣥗 𣥕
U+2456A 𤕪
U+26B62 𦭢
U+28668 𨙨
U+2907F 𩁿
G U+30002 𰀂
U+30004 𰀄
H U+3193B 𱤻
U+31C1C 𱰜

Category O: Odd component

In this category are characters that have unusual components, which are not in line with standard writing practices or are derived from another such symbol.

Block Code point Ideograph Reference
A U+3403
B U+200E0 𠃠
U+20137 𠄷
U+205F1 𠗱
U+20696 𠚖
U+2069C 𠚜
U+206A1 𠚡
U+20953 𠥓
U+20967 𠥧
U+20969 𠥩
U+2096A 𠥪
U+2096B 𠥫
U+2096C 𠥬
U+21261 𡉡
U+242C5 𤋅
U+24548 𤕈
U+26B99 𦮙
U+2700D 𧀍
U+291E7 𩇧
U+2A6D7 𪛗
U+2A6D8 𪛘
U+2A6D9 𪛙
U+2A6DA 𪛚
U+2A6DB 𪛛
U+2A6DC 𪛜
U+2A6DD 𪛝
F U+2CF01 𬼁 ʒ
U+2CF04 𬼄
U+2D1AC 𭆬
U+2DF86 𭾆
U+2DF8B 𭾋
U+2DF8F 𭾏
U+2E34C 𮍌
U+2E34D 𮍍
U+2E350 𮍐
U+2E5D8 𮗘
U+2EA08 𮨈
H U+3136B 𱍫
U+31C44 𱱄

Category R: Rotated

In this category are characters that are rotated or have rotated components.

Block Code point Ideograph Reference Rotated component (unofficial)
B U+2010F 𠄏
U+20114 𠄔
U+20432 𠐲
U+20544 𠕄
U+221B4 𢆴 𢆳
U+22A0B 𢨋 𰒲
U+23028 𣀨
U+23952 𣥒 𣥕
U+24173 𤅳
U+24489 𤒉 𤒜
U+24493 𤒓
U+27951 𧥑
U+27E42 𧹂
E U+2BA66 𫩦 𠱃
U+2C886 𬢆
F U+2E5D9 𮗙
G U+304A5 𰒥
U+30A07 𰨇
U+30C9E 𰲞
H U+31E6B 𱹫 𥾇
U+32053 𲁓

Category S: Stroke-heavy

In this category are characters that have 40 or more total strokes.

Block Code point Ideograph Total strokes Strokes without the radical
URO U+9F98 48 32
A U+4A3B 52 44
U+4C9C 44 33
B U+2053B 𠔻 64 58
U+269C4 𦧄 42 36
U+269C5 𦧅 48 42
U+27198 𧆘 43 39
U+278B1 𧢱 44 37
U+291D3 𩇓 40 32
U+291D4 𩇔 48 40
U+29663 𩙣 46 37
U+29664 𩙤 48 39
U+2A4CA 𪓊 41 29
U+2A68D 𪚍 40 25
U+2A68E 𪚎 40 25
U+2A6A5 𪚥 64 48
E U+2C6A9 𬚩 53 47
F U+2E8F1 𮣱 41 33
G U+30EDD 𰻝 43 39
U+30EDE 𰻞 58 54
U+30F54 𰽔 76 68
U+3106C 𱁬 84 76
H U+317DB 𱟛 64 60
U+31C46 𱱆 41 36
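The two numeric columns can be related by simple arithmetic. The sketch below assumes that "Strokes without the radical" is the residual stroke count, so that the difference between the two columns is the stroke count of the radical; that reading is an assumption, not something stated in this appendix. The three rows are copied from the table above, and the names `STROKE_ROWS`, `is_stroke_heavy` and `implied_radical_strokes` are illustrative.

```python
# Rows copied from the table above: (code point, total strokes, strokes without radical).
STROKE_ROWS = [
    ("U+9F98", 48, 32),
    ("U+2A6A5", 64, 48),
    ("U+3106C", 84, 76),
]

# Threshold stated in the category description.
STROKE_HEAVY_THRESHOLD = 40


def is_stroke_heavy(total: int) -> bool:
    """True if the total stroke count meets the category threshold."""
    return total >= STROKE_HEAVY_THRESHOLD


def implied_radical_strokes(total: int, residual: int) -> int:
    """Stroke count of the radical, assuming total = radical + residual."""
    if residual > total:
        raise ValueError("residual strokes cannot exceed total strokes")
    return total - residual


if __name__ == "__main__":
    for label, total, residual in STROKE_ROWS:
        print(label, is_stroke_heavy(total), implied_radical_strokes(total, residual))
```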
The Unicode database is released by Unicode Inc. under the following terms:

Copyright © 1991-2022 Unicode, Inc. All rights reserved. Distributed under the Terms of Use in https://www.unicode.org/copyright.html.

Permission is hereby granted, free of charge, to any person obtaining a copy of the Unicode data files and any associated documentation (the "Data Files") or Unicode software and any associated documentation (the "Software") to deal in the Data Files or Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, and/or sell copies of the Data Files or Software, and to permit persons to whom the Data Files or Software are furnished to do so, provided that either (a) this copyright and permission notice appear with all copies of the Data Files or Software, or (b) this copyright and permission notice appear in associated Documentation.

THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THE DATA FILES OR SOFTWARE.

Except as contained in this notice, the name of a copyright holder shall not be used in advertising or otherwise to promote the sale, use or other dealings in these Data Files or Software without prior written authorization of the copyright holder.