Mandarin Pronunciation Differences between Mainland and Taiwan 普通话与国语,语音差异词库

普通话 国语 两岸语音的不同
17万词 依检定等级查询

About the pronunciation differences between “Pǔtōnghuà (Common Tongue)” and “Guóyǔ (National Language)”
什么是「普通话与国语语音差异」?

There are two major standards for Mandarin Chinese. The Mainland writes in “Simplified Chinese Characters”, speaks the “Pǔtōnghuà (Common Tongue)” standard, and annotates pronunciation with “Pinyin”, while Taiwan writes in “Traditional Chinese Characters”, speaks the “Guóyǔ (National Language)” standard, and annotates pronunciation with “Zhuyin”. The pronunciation standards of the two systems are set by different education authorities and experts, so differences exist between them.
中文有两大标准规范:大陆书写使用「简体字」搭配语音「普通话」及符号「汉语拼音」,台湾书写使用「繁体字」搭配语音「国语」及符号「注音」。 由不同地区的教育主管单位分别制定两者的语音标准,因而存在差异。

Simplified Chinese
简体字
Pinyin
汉语拼音
Common Tongue
普通话
Traditional Chinese
繁体字
Zhuyin
注音
National Language
国语

“ToneOZ.COM/Data” is a database of pronunciation differences between the two Mandarin systems, Common Tongue and National Language. In 2021 we collected 170k Mandarin vocabulary entries from three sources: the Chinese dictionaries “CC-CEDICT” and “National Language Dictionary”, and the Chinese word segmentation system “Jieba”. After reviewing the entries one by one, we identified over 6k words, involving 513 characters, that are pronounced differently in Common Tongue and National Language. We publish these data through a free search engine at ToneOZ.com/data, which offers a word index and a sentence parsing mode.
「澳声通词库」是以通行现代的汉语语音来比对普通话与国语的语音差异。 2021年我们以三个不同的汉语词库为基础(CC-CEDICT, Jieba, 国语辞典简编本), 比对 17万 个中文词汇的普通话与国语。 经过逐一校对后,我们找到 六千 多个词有语音差异,来自 513 个中文字。 我们以搜寻引擎的形式,将研究成果公开在 ToneOZ.com/data, 免费提供索引及整句分析。

The word index mode lists all the words that have a pronunciation difference, together with their source phrases. The parse mode allows the user to type in a whole Chinese sentence; the search engine splits it into Chinese words/phrases and highlights the pronunciation differences.
索引模式列出所有有差异音的中文字及来源词汇,整句分析模式允许使用者可输入一整句中文,搜寻引擎会自动标示出差异音的位置。

“ToneOZ.COM/Data” is being prepared as a future extension of our other web service, “Phonetic Chinese Article Editor ToneOZ.COM”.
「澳声通词库」是为了「拼音注音编辑器 ToneOZ.COM」所预备制作的扩充功能。

“Multicultural Policy” is one of the most important commitments of the Australian education system. Australian schools value community harmony, the development of intercultural understanding, and inclusive teaching practices. For over a hundred years, Chinese immigrants from different regions and cultural backgrounds have settled in this country, and they now form the second largest migrant community in Australia. Guided by these shared Australian values, “ToneOZ” builds tools and apps for teaching the different varieties of Chinese.
过去一百多年间有来自不同区域不同文化背景的华人在澳洲落脚生根,澳洲2016人口普查显示华语仅次于英文为第二大通行语言,占2.5%。澳大利亚的教育精神是强调民族多样性,包容性,肯定多元文化。澳洲保留各语言的特色,同时增进不同语言使用者之间的相互了解, 维持族群和谐。「澳声通」以这样的澳洲价值为出发点,来制作符合不同背景的教学工具。

Data Quality Assurance
原始数据是怎么来的?如何确保正确性?

To eliminate errors, ToneOZ compares multiple open-source or publicly available Chinese dictionaries against each other and then manually reviews every difference. We choose our data sources according to 3 key points:
「澳声通词库」分别从「量大、质佳、通用」三方面来确保词库的正确性。 我们利用多种开放公众使用的中文语音来源相互比对,审核语音差异的部分,排除各辞典的逻辑编辑错误以及破音字:

  • Quantity「量大17万词」

    We use mainstream dictionaries with high word coverage:
    我们的数据来自数个中文教学时最常用的词库 :

    1. CC-CEDICT, an English-Chinese dictionary since 1997.
    (英语世界最大的 Open Source 英汉辞典)

    2. Jieba, Chinese text segmentation system.
    (大陆结巴中文分词系统)

    3. 国语辞典简编本, Ministry of Education Mandarin Chinese Dictionary
    (国语教学标准字典)

  • Quality「质佳」

    We verify pronunciations against official documents:
    语音资料的查证参考以下各教育主管机关的标准文件 :

    a. 普通话异读词审音表
    (1985年版, 大陆《普通话水平测试》标准)

    b. 国语一字多音审订表
    (1999年版, 台湾中文教学标准)

    c. 国语辞典简编本
    (台湾现代语音)

    d. 国语辞典重编本
    (台湾古典语音)

    e. CC-CEDICT
    (2020年版, 普通话拼音, 英语地区汉语教学常用)

  • Modern Chinese「通用」

    When the sources conflict, we check the pronunciation used by presenters in local news or entertainment video programs, found through search engines or video platforms:
    语音差异部分的查核,是透过搜寻引擎或影音平台,与当地电视节目主播的发音做比对 :

    i. Baidu 百度

    ii. Tencent 腾讯

    iii. YouTube
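The cross-checking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ToneOZ's actual pipeline: the function name `find_differences`, the tone-number Pinyin notation, and the three sample entries are all hypothetical, and both sources are assumed to be already keyed by the same (Simplified) spelling. The sketch flags the syllable positions where a Common Tongue source and a National Language source disagree, so that they can be reviewed manually.

```python
def find_differences(putonghua: dict[str, str], guoyu: dict[str, str]) -> dict[str, list[int]]:
    """Return {word: [positions of syllables that differ]} for words found in both sources.

    Readings are space-separated syllables in tone-number Pinyin (an assumed format).
    """
    differences: dict[str, list[int]] = {}
    for word in putonghua.keys() & guoyu.keys():
        p_syllables = putonghua[word].split()
        g_syllables = guoyu[word].split()
        if len(p_syllables) != len(g_syllables):
            # Readings cannot be aligned: flag every character for manual review.
            differences[word] = list(range(len(word)))
            continue
        positions = [i for i, (p, g) in enumerate(zip(p_syllables, g_syllables)) if p != g]
        if positions:
            differences[word] = positions
    return differences


# Illustrative sample entries only, not rows copied from the database.
PUTONGHUA_SAMPLE = {"企鹅": "qi3 e2", "作息": "zuo4 xi1", "休养": "xiu1 yang3"}
GUOYU_SAMPLE = {"企鹅": "qi4 e2", "作息": "zuo4 xi2", "休养": "xiu1 yang3"}

if __name__ == "__main__":
    print(sorted(find_differences(PUTONGHUA_SAMPLE, GUOYU_SAMPLE).items()))
    # → [('企鹅', [0]), ('作息', [1])]
```

An automatic comparison like this only produces candidates; as described above, the flagged entries are then checked by hand against the official documents and broadcast pronunciation.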

Data Source Licenses
资料来源授权

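“ToneOZ Database” uses the following online data sources as a “Collection” (in the manner of an encyclopedia). We use only the words and their pronunciation data, not any of the definitions in the dictionaries: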
「澳声通词库」是以「汇编 」(百科全书)的形式来使用以下网路词库授权。我们仅使用词语及语音资料,未使用辞典内任何词语注释:

「澳声通词库」本身的使用授权

About the license of the “ToneOZ Database”, which contains two kinds of work, only the first of which is openly licensed:
「澳声通词库」本身包含两种创作,只有第一种开放授权:

1. The text and pronunciation data in this “ToneOZ Database” are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) (data download), while the original data are licensed from Sources 1, 2, and 3 as mentioned above. You may use the database free of charge, for personal or commercial purposes.
中文繁简词汇与语音资料采用「创用CC姓名标示BY相同方式分享SA」授权。 您可免费使用「澳声通词库(下载)」,个人或商业使用均可。

2. The external lookup table (Traditional, Simplified, Phonetic Chinese) and the search program in this “ToneOZ Database” are part of another tool, “Phonetic Chinese Article Editor ToneOZ.COM”. They are copyright-protected intellectual property of the Australian company “Tone A To Z”. Copying or linking to these libraries is prohibited.
数据库以外的繁简发音查表及搜寻引擎程序,属于本公司的另一个教学工具「拼音注音编辑器 ToneOZ.COM」,尚未开放授权第三方使用。

Usage
使用方法

There are two major standards for Mandarin Chinese. The Mainland writes in “Simplified Chinese Characters” and speaks the “Pǔtōnghuà (Common Tongue)” standard, while Taiwan writes in “Traditional Chinese Characters” and speaks the “Guóyǔ (National Language)” standard. The pronunciation standards of the two systems are set by different education authorities and experts, so differences exist between them.
中文有两大标准规范:大陆书写使用「简体字」搭配语音「普通话」,台湾书写使用「繁体字」搭配语音「国语」。 由不同地区的教育主管单位分别制定两者的语音标准,因而存在差异。

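A single Chinese word/phrase can have more than one pronunciation, depending on the words before and after it in a sentence. ToneOZ has reviewed all the readings and their usage contexts to identify which words/phrases have a regional pronunciation difference. When a difference exists, the search result shows the reading that has it. When there is none, we list one of the more common readings as an example. For detailed definitions and readings, please follow the external dictionary links.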
汉语字词可能有一种或多种多音字唸法。「澳声通词库」会将字词的所有唸法及使用情境都加以考虑, 来决定是否有两岸语音差异。在显示查询结果时,我们会优先显示有两岸语音差异的唸法。若语音没有差异,我们会列出”较常用的其中一种语音”作为范例。 详细字词解释及唸法请参阅外部辞典连结。
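The display rule described above can be sketched as follows. This is a minimal illustration, not the site's actual code: the `Reading` class and `reading_to_display` function are hypothetical names, readings are written in tone-number Pinyin, and the list of readings is assumed to be ordered from most to least common.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One reading of a word in both standards (tone-number Pinyin, an assumed format)."""
    putonghua: str
    guoyu: str

    @property
    def differs(self) -> bool:
        return self.putonghua != self.guoyu


def reading_to_display(readings: list[Reading]) -> Reading:
    """Prefer a reading with a regional difference; otherwise fall back to the first one.

    Assumes `readings` is non-empty and ordered from most to least common.
    """
    if not readings:
        raise ValueError("a word needs at least one reading")
    for reading in readings:
        if reading.differs:
            return reading
    return readings[0]


# Illustrative data: 「养」 is tone 3 in both standards in most phrases,
# but tone 3 vs tone 4 in 「供养」.
YANG_READINGS = [Reading("yang3", "yang3"), Reading("yang3", "yang4")]

if __name__ == "__main__":
    print(reading_to_display(YANG_READINGS))
    # → Reading(putonghua='yang3', guoyu='yang4')
```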

Index Mode
「索引列表」列出全部差异音字

By default (with the search box left blank), ToneOZ shows an index list of all the words with pronunciation differences. E.g.:
不输入任何字直接查询,会索引列出所有存在差异音的字。例如下图:

The word 「浑」 has pronunciation differences in 7 phrases.
「浑」这个字找到 7个词有语音差异。
The word 「突」 appears in the HSK and TOCFL word lists.
「突」这个字则在 HSK 及 华测 检定考试中有出现。

Click any word in the index list to show all of its phrases. E.g.: clicking 「烃」 lists the 5 phrases that contain 「烃」 and have a pronunciation difference:
点选任一个索引字便会列出所有语音差异词,例如点选「烃」这个字便会列出5个词含有「烃」且语音有差异:

「烃」 is “qīng” in National Language
国语唸「qīng ㄑㄧㄥ」
「烃」 is “tīng” in Common Tongue
普通话唸「tīng ㄊㄧㄥ」

Parse Mode
「整句查询」繁简词语语音


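Type in a Chinese sentence to find which words/phrases have pronunciation differences. Traditional and Simplified Chinese are both accepted and can be mixed. E.g.: 「企鹅作息」 will be parsed into 2 phrases, 「企鹅」 and 「作息」, and the words 「企」 and 「息」 will be highlighted because they have pronunciation differences. If the sentence is segmented incorrectly, insert a space before the word. E.g.: 「廿日记载」 is parsed as 「廿、日记、载」; typing 「廿日 记载」 instead gives 「廿、日、记载」, in which 「载」 has a pronunciation difference.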
繁简中文均可,可混用。例如输入「企鹅作息」,澳声通词库可辨识出「企鹅」「作息」两个词,并指出「企」「息」这两个字的普通话与国语语音有差异。 若分词失误,请在词汇前插入空格,例如输入「廿日记载」会得到「廿、日记、载」,此时请在中间加入空格「廿日 记载」便可得到「廿、日、记载」其中「载」存在语音差异。
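To see why a space changes the result, here is a small sketch that uses forward maximum matching, a simple dictionary-based segmentation technique. The source does not say which algorithm the search engine uses, so this is a stand-in for illustration only; `segment`, `highlight`, the tiny `VOCABULARY`, and the `DIFF_CHARS_BY_WORD` table are hypothetical and cover only the examples above.

```python
# Hypothetical mini vocabulary and difference table covering only the examples above.
VOCABULARY = {"企鹅", "作息", "日记", "记载"}
DIFF_CHARS_BY_WORD = {"企鹅": {"企"}, "作息": {"息"}, "记载": {"载"}}


def segment(sentence: str, vocabulary: set[str], max_len: int = 4) -> list[str]:
    """Forward maximum matching: at each position take the longest known word.

    Spaces typed by the user act as forced word boundaries.
    """
    words: list[str] = []
    for chunk in sentence.split():
        i = 0
        while i < len(chunk):
            for size in range(min(max_len, len(chunk) - i), 0, -1):
                candidate = chunk[i:i + size]
                # A single character is always accepted as a fallback.
                if size == 1 or candidate in vocabulary:
                    words.append(candidate)
                    i += size
                    break
    return words


def highlight(words: list[str]) -> list[tuple[str, list[str]]]:
    """Pair each word with the characters in it that are pronounced differently."""
    return [(w, [c for c in w if c in DIFF_CHARS_BY_WORD.get(w, set())]) for w in words]


if __name__ == "__main__":
    print(segment("企鹅作息", VOCABULARY))    # → ['企鹅', '作息']
    print(segment("廿日记载", VOCABULARY))    # → ['廿', '日记', '载']
    print(segment("廿日 记载", VOCABULARY))   # → ['廿', '日', '记载']
    print(highlight(segment("廿日 记载", VOCABULARY)))
```

In this sketch the difference in 「载」 is only reported once 「记载」 is recognized as one word, which is why the manual space matters.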

Options 搜寻选项

We have 3 options:
有三种进阶选项:

  • Differences position
    「差异音位置」
  • Neutral tone
    「轻声差异」
  • Words Source
    「词源」

Differences position
差异音位置

For example, the word 「养」:
举例:如果我们想研究「养」这个字 :

Any contain
「包含单字的词」

Searches for phrases that contain the word 「养」, whether or not they have a pronunciation difference. E.g.: you will get 「休养」 even though this phrase has no pronunciation difference.
可以找出所有包含「养」这个字的词,无论有或没有语音差异。例如找到「休养」这个词,没有语音差异。

Contain & diff
「包含单字,单字也为差异音」

Searches for phrases in which 「养」 itself is pronounced differently. E.g.: you will get 「供养」 because 「养」 is pronounced differently in this phrase: it is tone 3 in Common Tongue but tone 4 in National Language.
可找到「养」语音有差异的词。例如「供养」的「养」在国语中声调变为4声,普通话的声调仍然维持3声。

Contain & diff in any word
「包含单字,词中任意字为差异音」

Searches for phrases that contain 「养」 and in which any word has a pronunciation difference. E.g.: 「休养生息」 contains 「养」, which has no pronunciation difference here, but it also contains 「息」, which does: it is tone 1 in Common Tongue but tone 2 in National Language.
可找到有包含「养」,且任何字有语音差异的词。例如找到「休养生息」,「养」没有语音差异,「息」国语为2声,普通话为1声。
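The three “Differences position” options can be read as three filters over the same phrase list. The sketch below is one possible reading, not the site's implementation: the option names, the `search` function, and the `DIFF_POSITIONS` table (which records, for the three example phrases only, the positions of the words that are pronounced differently) are hypothetical. It also assumes that the third option includes phrases where 「养」 itself is the differing word.

```python
# For each example phrase: positions (0-based) of the words pronounced differently.
DIFF_POSITIONS: dict[str, set[int]] = {
    "休养": set(),      # no pronunciation difference
    "供养": {1},        # 「养」 differs
    "休养生息": {3},    # 「息」 differs, 「养」 does not
}


def search(char: str, option: str, entries: dict[str, set[int]] = DIFF_POSITIONS) -> list[str]:
    """Return the phrases containing `char` that satisfy the chosen option."""
    results = []
    for phrase, diff_positions in entries.items():
        char_positions = {i for i, c in enumerate(phrase) if c == char}
        if not char_positions:
            continue
        if option == "any_contain":
            keep = True
        elif option == "contain_and_diff":
            # The searched word itself must be one of the differing positions.
            keep = bool(char_positions & diff_positions)
        elif option == "contain_and_diff_any":
            # Any word in the phrase may carry the difference.
            keep = bool(diff_positions)
        else:
            raise ValueError(f"unknown option: {option}")
        if keep:
            results.append(phrase)
    return results


if __name__ == "__main__":
    print(search("养", "any_contain"))           # → ['休养', '供养', '休养生息']
    print(search("养", "contain_and_diff"))      # → ['供养']
    print(search("养", "contain_and_diff_any"))  # → ['供养', '休养生息']
```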

Neutral tone
轻声差异

The neutral tone (tone 5) is a special tone in Chinese, in contrast to the other 4 tones (1 high, 2 rising, 3 low, 4 falling). A word in any of the 4 tones can change to the neutral tone depending on the context. A neutral tone difference does not necessarily mean that the pronunciation differs: in most cases, the same neutral tone reading is accepted in both Common Tongue and National Language. By default, ToneOZ ignores differences in neutral tone. You can manually select the following options:
轻声(5, neutral tone)在中文中是一个特别的声调,其他声调(1 high 2 rising 3 low 4 falling)的字有时会搭配语气改变成轻声。请注意轻声的差异并不完全代表语音有差异,同样的轻声唸法大多数在普通话与国语两种场合都可以通行。

“All Tones 12345” or “Tone 5 only”
「包含轻声差异(12345)」或
「只搜寻轻声差异(5)」

These options show the differences in neutral tone, highlighted with dashed borders. E.g.: the word 「识」 is tone 4 in the National Language, but tone 2 or 4 in the Common Tongue. However, in the phrase 「认识」, 「识」 is marked as neutral tone in most Common Tongue dictionaries.
可列出轻声差异,会以虚线框显示。例如查询「识」可以发现有语音差异,国语中声调为4声,普通话为2声,而「认识」这个词在普通话字典中大部分「识」被标为轻声。
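The neutral tone options can be sketched as a filter over a list of tone differences. This is an illustration under assumptions, not the actual search code: `ToneDiff`, `filter_by_tone_option`, and the option names are hypothetical, a difference counts as a "neutral tone difference" when either side is tone 5, and the three sample rows simply restate the examples given in this guide.

```python
from typing import NamedTuple


class ToneDiff(NamedTuple):
    phrase: str
    char: str
    putonghua_tone: int  # 1-4, or 5 for the neutral tone
    guoyu_tone: int


def is_neutral_diff(diff: ToneDiff) -> bool:
    """A difference involves the neutral tone when either standard uses tone 5."""
    return 5 in (diff.putonghua_tone, diff.guoyu_tone)


def filter_by_tone_option(diffs: list[ToneDiff], option: str = "ignore_neutral") -> list[ToneDiff]:
    if option == "ignore_neutral":   # the default behaviour described above
        return [d for d in diffs if not is_neutral_diff(d)]
    if option == "all_tones":        # “All Tones 12345”
        return list(diffs)
    if option == "tone5_only":       # “Tone 5 only”
        return [d for d in diffs if is_neutral_diff(d)]
    raise ValueError(f"unknown option: {option}")


# Sample rows restating the examples in this guide.
SAMPLE_DIFFS = [
    ToneDiff("供养", "养", 3, 4),
    ToneDiff("休养生息", "息", 1, 2),
    ToneDiff("认识", "识", 5, 4),
]

if __name__ == "__main__":
    print([d.phrase for d in filter_by_tone_option(SAMPLE_DIFFS)])                # → ['供养', '休养生息']
    print([d.phrase for d in filter_by_tone_option(SAMPLE_DIFFS, "tone5_only")])  # → ['认识']
```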

Words Source
词源

Level by HSK and
华语文能力测验TOCFL

“Hanyu Shuiping Kaoshi (HSK 汉语水平考试)” and “Test of Chinese as a Foreign Language (TOCFL 华语文能力测验)” are two common language proficiency tests for non-native Chinese speakers in Australia. Both publish a word list, used as the basis for their test questions, that identifies which words are modern and commonly used in daily Chinese conversation. ToneOZ can limit the search range to phrases from these lists only.
对于母语非华语的学生,会透过大型检定考试来作做为中文学习成果的参考。「汉语水平考试(HSK)」及「华语文能力测验(TOCFL)」是目前澳洲常见的两种中文检定考,两者都有提供常用词表做为出题依据(请参考以下连结)。澳声通词库可以限定只搜寻词表中的常用词。

HSK is the “Hanyu Shuiping Kaoshi (汉语水平考试)” word list.
HSK表示「汉语水平考试」。

’03 TOCFL is the TOCFL word list used from 2003 to 2020 (「华语八千词」).
’03 TOCFL 表示 2003年开始使用的「华语八千词」。

’21 TOCFL is the TOCFL word list used from 2021 onward (the TBCL word list).
’21 TOCFL 表示 2021年开始使用的「国教院词语表」(TBCL)。

The following links are the original official sources of these word lists:
以下为原始词源官方下载连结:

* 新汉语水平考试(HSK)词汇(2012年修订版). 资源中心–汉语考试服务网, Chinesetest.cn. Retrieved 2 May 2015.

* 华语八千词

* Taiwan Benchmarking Chinese Language (TBCL) 台湾华语文能力基准 国教院三等七级词表 (2021)