Pinyin Letter Frequency 拼音字母頻率
The text used is a Chinese translation of “The Masque of the Red Death” by Edgar Allan Poe.
[see The Masque of the Red Death]
[see 紅死病的面具 The Masque of the Red Death By Edgar Allan Poe]
Here's the first paragraph, in Chinese characters and in pinyin.
hua shuo “ hong si ” zai guo nei si nue yi jiu ， xiang zhe ban zhi ming ， zhe ban ke pa de wen yi wei shi wei ceng you guo 。 zhe bing de ju ti biao xian he te zheng jiu shi chu xie —— yi pian yin hong ， ling ren fa zhi 。 huan zhe chu shi gan dao ju tong ， tu ran yi zhen tou hun yan hua ， yu shi quan shen mao kong da liang chu xie sang ming 。 zhi yao huan zhe de shen shang ， te bie shi lian shang yi chu xian xing hong se ban dian jiu shi ran shang zhe wen yi de yu zhao ， zhe shi zhu qin hao you shui ye bu gan jin shen qu jiu hu ta he wei wen ta 。 huan zhe cong de bing dao fa bing ， yi zhi dao song ming ， huan bu xiao ban xiao shi gong fu 。
Full text: masque_of_red_death_chinese_pinyin.txt
The conversion from Chinese characters to pinyin is done with xpinyin: https://github.com/lxneng/xpinyin
Pinyin and Keyboard Layout
Here we try to find out which keyboard layout is best for typing Chinese with a pinyin input method.
[see Dvorak Keyboard Layout]
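One crude way to compare layouts is to measure what share of the typed letters fall on the home row. The sketch below is only an illustration of that idea, not the method used for this page. The function name `home_row_share` and the choice of home-row share as the metric are assumptions; a fuller comparison would also weigh finger travel, hand alternation, and so on.

```python
# Letter keys on the home row of each layout (punctuation keys left out).
HOME_ROWS = {
    "qwerty": set("asdfghjkl"),
    "dvorak": set("aoeuidhtns"),
}


def home_row_share(text: str, layout: str) -> float:
    """Return the fraction of a-z letters in `text` typed on the home row.

    `layout` must be a key of HOME_ROWS. Hypothetical helper for illustration.
    """
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    if not letters:
        return 0.0
    home = HOME_ROWS[layout]
    return sum(ch in home for ch in letters) / len(letters)


if __name__ == "__main__":
    sample = "hua shuo hong si zai guo nei si nue yi jiu"
    for name in HOME_ROWS:
        print(name, round(home_row_share(sample, name), 3))
```

Running it on the full pinyin text instead of the short sample gives a rough first impression of how each layout fits pinyin.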
Pinyin Letter Frequency Problem, the Removal of V
There is an interesting issue with v and ü in Chinese pinyin. Pinyin does not use the letter v, but it does use ü. Pinyin input methods, however, have a hack of typing v for ü, because ü is otherwise hard to type.
On Microsoft Windows's pinyin input, typing u also produces ü. But not on macOS.
So there is an interesting question when you compile statistics of pinyin letter frequency. Given a piece of Chinese text, you can convert it into pinyin, then compute the letter frequency. Done this way, you'll see zero use of v. However, this is not a proper statistic for the purpose of keyboard layout, because people do type v, while the statistic shows no use of the v key.
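The naive count described above can be sketched in Python roughly as follows. This is a minimal illustration, assuming the pinyin text is plain ASCII letters separated by spaces and punctuation, as in the sample paragraph. The function name `letter_frequency` is made up for this example.

```python
from collections import Counter
import string


def letter_frequency(text: str) -> list[tuple[str, float]]:
    """Return (letter, share) pairs for a-z, most frequent first.

    Everything that is not an ASCII letter (spaces, Chinese punctuation,
    digits) is ignored.
    """
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    total = len(letters)
    if total == 0:
        return []
    counts = Counter(letters)
    return [(letter, count / total) for letter, count in counts.most_common()]


if __name__ == "__main__":
    sample = "hua shuo hong si zai guo nei si nue yi jiu"
    for letter, share in letter_frequency(sample):
        print(f"{letter} {share:.1%}")
```

With input like the sample paragraph, the letter v never appears in the result, which is the gap discussed here.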
To fix it, one needs to convert ü to v, then compute the statistics. But this may not be readily done, because the software that converts Chinese into pinyin needs to include tones in its output to produce ü.
But this “error” isn't too bad, because ü does not occur frequently in pinyin. I think it's mostly used for characters such as 女 (nǚ) and 綠 (lǜ).
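If the converter does output tone-marked pinyin with ü, the ü-to-v step can be sketched like this. It is a minimal example using only Python's standard `unicodedata` module. The function name `pinyin_to_keystrokes` is hypothetical, and it assumes the input method accepts v for ü and ignores tones.

```python
import unicodedata


def pinyin_to_keystrokes(text: str) -> str:
    """Map tone-marked pinyin to the letters typed on the keyboard.

    ü (with or without a tone mark) becomes v; other tone marks are dropped.
    """
    # NFD splits e.g. "ǚ" into "u" + combining diaeresis + combining caron.
    decomposed = unicodedata.normalize("NFD", text.lower())
    # "u" followed by a combining diaeresis (U+0308) is ü: type it as v.
    decomposed = decomposed.replace("u\u0308", "v")
    # Remove the remaining combining marks, which are the tone marks.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    print(pinyin_to_keystrokes("nǚ"))     # nv
    print(pinyin_to_keystrokes("lǜ sè"))  # lv se
```

The letter frequency can then be computed on the converted text, so that the v key gets counted.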