User:Robert Ullmann/Pronunciation statistics
pronunciation section statistics
- from XML dump as of 13 June 2008
- total of 81775 pronunciation sections in 873169 entries
- total of 119785 pronunciation lines, average 1.465 per section
Pronunciation lines by type of line:
- "accent" means the line has an {{a}} template, followed by (enPR), IPA and (SAMPA). These lines are not counted under enPR, IPA, etc.; the IPA, SAMPA, etc. rows count other lines that have those templates or that text on the line.
- "qualifier" means a line, usually starting with *, that is like "* RP:" or uses {{qualifier}} for the same thing.
- Classification of lines is not exact.
- SAMPA includes X-SAMPA; IPA includes lines with {{IPAchar}}.
- blanks, comments, sister links, etc. are not included in totals
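The rules above can be sketched roughly in Python. This is not the script that produced these numbers; `classify_line`, the order of the checks and the regular expressions are all assumptions made for illustration, and only some of the line types in the table are covered.

```python
import re

def classify_line(line: str) -> str:
    """Hypothetical, simplified classifier for one pronunciation line."""
    text = line.strip()
    if not text:
        return "blank"
    if text.startswith("<!--"):
        return "comment"
    # "accent" is checked first, so these lines are not counted again under IPA etc.
    if re.search(r"\{\{a[|}]", text):
        return "accent"
    has_ipa = "IPA" in text      # template or plain text; also matches {{IPAchar}}
    has_sampa = "SAMPA" in text  # also matches X-SAMPA
    has_enpr = "enPR" in text
    if has_enpr and has_ipa and has_sampa:
        return "enPR/IPA/SAMPA"
    if has_ipa and has_sampa:
        return "IPA/SAMPA"
    if has_ipa:
        return "IPA"
    if has_sampa:
        return "SAMPA"
    if has_enpr:
        return "enPR"
    # a bare label such as "* RP:" or an explicit {{qualifier}} template
    if "{{qualifier" in text or re.fullmatch(r"\*\s*[\w .-]+:", text):
        return "qualifier"
    return "other"

print(classify_line("* {{a|UK}} {{IPA|/kat/}}"))  # accent
print(classify_line("* RP:"))                      # qualifier
```

As with the real counts, a classification like this is not exact: a line that mixes several templates lands in whichever check comes first.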
line type | occurs |
---|---|
accent | 3490 |
AHD | 234 |
IPA | 46977 |
IPA/SAMPA | 2463 |
SAMPA | 4247 |
enPR | 650 |
enPR/IPA/SAMPA | 1357 |
ad hoc | 7811 |
audio | 24624 |
homophones | 532 |
hyphenation | 9165 |
rhymes | 14223 |
comment | 167 |
image | 37 |
qualifier | 1953 |
rfc/rfp/rfap | 399 |
sisterlink | 134 |
table syntax | 900 |
other | 760 |
number of lines in sections, counting all types except blank lines, comments, sister links and images
lines | occurs | pages (if 10+ lines in a section) |
---|---|---|
0 | 146 | |
1 | 56248 | |
2 | 17054 | |
3 | 6126 | |
4 | 1313 | |
5 | 399 | |
6 | 134 | |
7 | 107 | |
8 | 69 | |
9 | 32 | |
10 | 22 | been father you're entrance advocate poor novem manganese marathon 0 transport nasty whore marriage data foray plaque articulate transpose animate Mars Robin |
11 | 25 | alphabetical acerbity maroon read though drawer marten abeille manzana magnesium polytonic use abo hooter record contract associate hendecasyllabic duplicate monosyllabic polysyllabic pentasyllabic octosyllabic trisyllabic pitää varansa |
12 | 10 | pneumonoultramicroscopicsilicovolcanoconiosis project copper chalk complex março excuse clerk gnocchi Martin |
13 | 1 | thorn |
14 | 2 | our ت |
15 | 1 | قابلة |
16 | 1 | solder |
28 | 2 | hello Celtic |
29 | 1 | atomic |
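A table like the one above could be tallied as in the following sketch. The input shape (pairs of page title and a list of line types for one section), the `EXCLUDED` set and the name `tally_sections` are assumptions for illustration, not the code actually used for this report.

```python
from collections import Counter, defaultdict

# line types left out of the per-section count (assumed labels)
EXCLUDED = {"blank", "comment", "sisterlink", "image"}

def tally_sections(sections):
    """sections: iterable of (page_title, [line_type, ...]), one pair per section."""
    histogram = Counter()            # number of lines -> number of sections
    long_pages = defaultdict(list)   # number of lines -> page titles, for 10+ lines
    for title, line_types in sections:
        n = sum(1 for t in line_types if t not in EXCLUDED)
        histogram[n] += 1
        if n >= 10:
            long_pages[n].append(title)
    return histogram, dict(long_pages)

sample = [
    ("cat", ["IPA", "audio", "comment"]),
    ("dog", ["IPA"]),
    ("hello", ["IPA"] * 10 + ["image"]),
]
histogram, long_pages = tally_sections(sample)
for n in sorted(histogram):
    print(f"{n} | {histogram[n]} | {' '.join(long_pages.get(n, []))} |")
```

A section holding only excluded lines ends up in the 0 row, which is one way a section can have zero lines.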
number of lines in sections by language, for languages with 10 or more pronunciation sections
language | sections | lines | average |
---|---|---|---|
Albanian | 11 | 11 | 1.000 |
Ancient Greek | 3433 | 3609 | 1.051 |
Arabic | 34 | 75 | 2.206 |
Aramaic | 517 | 517 | 1.000 |
Armenian | 11 | 18 | 1.636 |
Aromanian | 19 | 19 | 1.000 |
Asturian | 20 | 20 | 1.000 |
Basque | 13 | 14 | 1.077 |
Bengali | 27 | 27 | 1.000 |
Breton | 32 | 35 | 1.094 |
Bulgarian | 717 | 729 | 1.017 |
Catalan | 45 | 54 | 1.200 |
Classical Nahuatl | 441 | 463 | 1.050 |
Croatian | 22 | 22 | 1.000 |
Czech | 1025 | 1064 | 1.038 |
Danish | 63 | 83 | 1.317 |
Dutch | 2773 | 3091 | 1.115 |
Egyptian | 50 | 56 | 1.120 |
English | 27869 | 47354 | 1.699 |
Esperanto | 22 | 31 | 1.409 |
Estonian | 17 | 18 | 1.059 |
Ewe | 120 | 210 | 1.750 |
Faroese | 1620 | 1639 | 1.012 |
Fijian Hindi | 87 | 87 | 1.000 |
Filipino | 40 | 41 | 1.025 |
Finnish | 6411 | 12674 | 1.977 |
French | 6348 | 11117 | 1.751 |
Ga | 18 | 35 | 1.944 |
Gamilaraay | 70 | 70 | 1.000 |
German | 1471 | 1748 | 1.188 |
Greek | 344 | 354 | 1.029 |
Guugu Yimidhirr | 25 | 25 | 1.000 |
Hebrew | 400 | 420 | 1.050 |
Hungarian | 2634 | 5151 | 1.956 |
Icelandic | 269 | 388 | 1.442 |
Indonesian | 15 | 28 | 1.867 |
Interlingua | 10 | 13 | 1.300 |
Irish | 777 | 891 | 1.147 |
Isthmus Zapotec | 10 | 10 | 1.000 |
Istro-Romanian | 21 | 21 | 1.000 |
Italian | 911 | 980 | 1.076 |
Japanese | 128 | 143 | 1.117 |
Jingpho | 34 | 34 | 1.000 |
Kabyle | 11 | 2 | 0.182 |
Kashubian | 25 | 25 | 1.000 |
Korean | 2450 | 2589 | 1.057 |
Lao | 244 | 244 | 1.000 |
Latin | 290 | 470 | 1.621 |
Lithuanian | 99 | 99 | 1.000 |
Lojban | 96 | 99 | 1.031 |
Macedonian | 28 | 28 | 1.000 |
Mandarin | 4911 | 5995 | 1.221 |
Martuthunira | 28 | 28 | 1.000 |
Megleno-Romanian | 11 | 11 | 1.000 |
Min Nan | 1598 | 2098 | 1.313 |
Moldavian | 15 | 15 | 1.000 |
Navajo | 25 | 25 | 1.000 |
Norwegian | 74 | 76 | 1.027 |
Occitan | 11 | 12 | 1.091 |
Old English | 1662 | 1689 | 1.016 |
Old Prussian | 105 | 204 | 1.943 |
Persian | 146 | 155 | 1.062 |
Polish | 2009 | 2473 | 1.231 |
Portuguese | 202 | 284 | 1.406 |
Romanian | 3970 | 4027 | 1.014 |
Russian | 1391 | 1615 | 1.161 |
Scots | 234 | 247 | 1.056 |
Scottish Gaelic | 173 | 178 | 1.029 |
Serbian | 63 | 64 | 1.016 |
Seri | 14 | 16 | 1.143 |
Slovene | 18 | 18 | 1.000 |
Spanish | 911 | 1052 | 1.155 |
Swedish | 1336 | 1422 | 1.064 |
Tagalog | 26 | 26 | 1.000 |
Tok Pisin | 10 | 10 | 1.000 |
Translingual | 79 | 97 | 1.228 |
Turkish | 42 | 327 | 7.786 |
Vietnamese | 56 | 88 | 1.571 |
Volapük | 15 | 40 | 2.667 |
Warlpiri | 11 | 11 | 1.000 |
Welsh | 138 | 146 | 1.058 |
Western Apache | 14 | 15 | 1.071 |
Xhosa | 13 | 13 | 1.000 |
Yiddish | 78 | 79 | 1.013 |
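The average column is lines divided by sections, shown to three decimals. A small check in Python, using a few rows copied from the table above (the `rows` list and the `average` helper are only illustrative names):

```python
# (language, sections, lines) copied from a few rows of the table
rows = [
    ("Arabic", 34, 75),
    ("English", 27869, 47354),
    ("Kabyle", 11, 2),
    ("Turkish", 42, 327),
]

def average(sections: int, lines: int) -> str:
    """Lines per section, formatted to three decimals like the table."""
    return f"{lines / sections:.3f}"

for language, sections, lines in rows:
    print(f"{language} | {sections} | {lines} | {average(sections, lines)} |")
```

The same division applied to the overall totals (119785 lines in 81775 sections) gives 1.465. Unusual rows such as Kabyle (0.182) and Turkish (7.786) come out exactly as printed, so anything odd there would be in the counts rather than in the division.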