PolyU Corpus of Spoken Chinese
v1.3 (released on 1 January 2015)
This corpus is a set of audio recordings of
conversational exchanges in Chinese between interviewers and interviewees
discussing a wide range of subjects, including travel and life
experiences. There are presently 28 transcripts, all rendered
in Chinese characters.
The creation of this corpus was made possible by
the following grants (PI: Dr Foong Ha YAP):
"Stance Marking in Asian Languages:
Linguistic and Cultural Perspectives" (RGC GRF Grant 2010-2013, PolyU
5513/10H)
"Non-referential Uses of Nominalization
Constructions: Asian Perspectives" (HKPU Internal Grant, 2010-2013, HKPU
1-ZV6W)
"Establishing Common Ground in Public
Discourse: An Analysis of Electoral Speeches, Press Conferences and Q&A
Sessions in Hong Kong" (PolyU ICRG, 2012-2014, HKPU G-YK85)
We are continuing to update this corpus. More data
will be uploaded in future releases.
Suggestions, feedback, queries and comments are
welcome and can be directed to Wong Tak-sum at
wong_taksum@hotmail.com.
PolyU Corpus of Spoken Chinese, Department of English, Hong Kong Polytechnic University, Modified 4 June 2015, Retrieved DATE, from <http://asianlang.engl.polyu.edu.hk/>.
Cantonese Discourse Data
ID# of Participant | Questionnaire | Travel Pictures | Ritual Pictures | Free conversation
Informant 1 | Sound Track | Sound Track | Sound Track | Sound Track (4'33")
Informant 4 | Sound Track | Sound Track | Sound Track
Informant 6 | Sound Track | Sound Track | Sound Track
Informant 13 | Sound Track | Sound Track | Sound Track
Informant 14 | Sound Track | Sound Track | Sound Track
Informant 15 | Sound Track | Sound Track | Sound Track
Informant 16 | Sound Track (9'56") | Transcript | Sound Track (12'38") | Transcript | Sound Track (17'37") | Transcript
Informant 17 | Sound Track (11'14") | Sound Track (16'35") | Sound Track (20'34")
Informant 18 | Sound Track (8'41") | Transcript | Sound Track (14'1") | Transcript | Sound Track (21'49") | Transcript
Informant 19 | Sound Track (7'39") | Transcript | Sound Track (17'46") | Transcript | Sound Track (30'23") | Transcript
All informants (4h 5'35") | Sound Track (37'30") | All transcripts | Sound Track (1h 1') | All transcripts | Sound Track (1h 30'23") | All transcripts
Cantonese Debates Hosted by RTHK
Date | Geographical Constituency Areas concerned | Length | Transcript
18 Aug 2012 | Hong Kong Island | 1h 6'43" | Transcript
19 Aug 2012 | Kowloon East | 44'46" | Transcript
25 Aug 2012 | New Territories East | 1h 7'04" |
26 Aug 2012 | Kowloon West | 44'42" | Transcript
01 Sept 2012 | New Territories West | 1h 7'03" |
02 Sept 2012 | District Council (Second) | 44'50" |
Total time duration | 5h 35'08"
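The stated total can be checked by converting each debate length to seconds and summing. The following is a minimal Python sketch, assuming the h / ' / " notation used in the table above; the function names are illustrative only and are not part of the corpus or its website.

```python
import re

# Lengths copied from the "Length" column of the RTHK debates table.
DEBATE_LENGTHS = [
    "1h 6'43\"", "44'46\"", "1h 7'04\"",
    "44'42\"", "1h 7'03\"", "44'50\"",
]

# Optional hours ("1h"), optional minutes ("6'"), optional seconds ('43"').
_DURATION = re.compile(r"^\s*(?:(\d+)h)?\s*(?:(\d+)')?\s*(?:(\d+)\")?\s*$")


def to_seconds(text):
    """Convert a duration such as 1h 6'43" to a number of seconds."""
    # Accept the curly quotes that appear in some cells of the page.
    normalized = text.replace("\u2019", "'").replace("\u201d", '"')
    match = _DURATION.match(normalized)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def format_duration(total_seconds):
    """Format seconds back into the table's h / ' / " notation."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}h {minutes}'{seconds:02d}\""


total = sum(to_seconds(length) for length in DEBATE_LENGTHS)
print(format_duration(total))  # prints 5h 35'08"
```

The six lengths sum to 20,108 seconds, which agrees with the total time duration given in the table.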
Note: Because a Chinese character may correspond to more than one morpheme and
have more than one pronunciation, there is not always a one-to-one
correspondence between a Chinese character and its pronunciation. The Jyutping
romanization of the character is therefore tagged wherever there is potential
ambiguity.
Mandarin Discourse Data
ID# of Participant | Questionnaire | Travel Pictures | Ritual Pictures | Free conversation
IE_01 | Sound Track | Transcript | Transcript | Sound Track | Transcript | Sound Track | Transcript
IE_05 | Transcript | Transcript | Transcript | Transcript
IE_06 | Sound Track | Transcript | Transcript | Sound Track | Transcript | Sound Track | Transcript
IE_07 | Transcript | Transcript | Transcript | Transcript
IE_08 | Sound Track | Transcript | Transcript | Sound Track | Transcript | Sound Track | Transcript
IE_09 | Sound Track | Transcript | Transcript | Sound Track | Transcript | Sound Track | Transcript
IE_10 | Sound Track | Transcript | Transcript | Sound Track | Transcript | Sound Track | Transcript
IE_11 | Sound Track 6.12 | Transcript | Transcript | Sound Track | Transcript | Sound Track | Transcript
IE_12 | Transcript | Transcript | Sound Track (9'13") | Transcript | Transcript
IE_18 | Sound Track (11'9") | Transcript | Sound Track (8'44") | Transcript | Sound Track (3'57") | Transcript
IE_19 | Sound Track (6'1") | Transcript | Sound Track (9'20") | Transcript | Sound Track (12'1") | Transcript
IE_20 | Sound Track (7'32") | Transcript | Sound Track (11'2") | Transcript | Sound Track (17'37") | Transcript | Sound Track (6'5") | Transcript
IE_FC01 | Sound Track | Transcript
IE_FC02 | Sound Track | Transcript
IE_FC03 | Sound Track | Transcript
IE_FC04 | Sound Track | Transcript
IE_FC05 | Sound Track | Transcript
IE_FC07 | Sound Track | Transcript
IE_FC08 | Sound Track | Transcript
IE_FC09 | Sound Track | Transcript
IE_FC11 | Sound Track | Transcript
All informants (3h 7'41") | Sound Track (12'37") | All transcripts | Sound Track (2h 43'48") | All transcripts | Sound Track (hh hh' ss") | All transcripts | Sound Track (11'16") | All transcripts
List of research publications and presentations that have benefited from data from this corpus
Publications
Yap, Foong Ha, Ying Yang and Tak-Sum Wong. (accepted). On the development of sentence final particles (and utterance tags) in Chinese. In The Role of the Left and Right Periphery in Semantic [Studies in Pragmatics Series], Kate Beeching & Ulrich Detges (eds). Bingley, UK: Emerald Publishers.
Yap, Foong Ha, Winnie Chor and Jiao Wang. (2012). On the development of epistemic ‘fear’ markers: An analysis of Mandarin kongpa and Cantonese taipaa. In Covert Patterns of Modality, Werner Abraham & Elisabeth Leiss (eds), 312-342. Cambridge, UK: Cambridge Scholars.
Conference Presentations
Yang, Ying, Foong Ha Yap and Tak-Sum Wong. (2012). “I am sure but I hedge”: fear expression kongpa as a rhetorical interactive strategy in Mandarin conversation. Paper presented at the Workshop on Epistemicity, Evidentiality and Attitude in Asian Languages: Typological, Diachronic and Discourse Perspectives, Hong Kong Polytechnic University, September 3-5.
Yap, Foong Ha and Winnie Oi-wan Chor. (2012). Epistemic downgrading in Cantonese conversations. Paper presented at the 20th Annual Conference of the International Association of Chinese Linguistics (IACL-20), Hong Kong Polytechnic University, September 3-5.
We would
appreciate hearing from you if your publications or conference presentations
have made use of or referred to results based on this corpus.
Acknowledgements
We wish to thank the following research team
members, who worked on different stages of this corpus:
Preparation of Interview Questions:
CHOR Winnie Oi-wan
Interviewers:
CHAN Shuk-ling Ariel
YANG Ying Vivien
Transcribers:
CHAN Shuk-ling Ariel
CHAN Yu-kwan Daniel
CHING Yuk-yin Jessie
KONG Pui-yu Polly
LAM Chi-fung
MIN Wei Phyllis
SIU Pui-shan Gloria
TONG Ka-tai Rosanne
YUNG Hiu-lam Landia
Transcription Editors:
WONG Tak-sum Sam (Cantonese)
YANG Ying Vivien (Mandarin)
Corpus Website Supervisor:
Links to Other Corpora
Cantonese
Early Cantonese Colloquial Texts: A Database
Early Cantonese Tagged Database
A Linguistic Corpus of Mid-20th Century Hong Kong Cantonese Paper
Hong Kong Cantonese Child Language Corpus (CANCORP)
The Hong Kong Bilingual Child Language Corpus
English Loanwords in Hong Kong Cantonese
Ideophones in African and Asian Languages
Hong Kong Cantonese Corpus (HKCanCor) POS Tagset Paper Site2
The Hong Kong Cantonese Adult Language Corpus (HKCAC) Paper1 Paper2
A Parallel Corpus of Spoken Cantonese and Written Chinese Paper
Cantonese Chinese Corpus of Oral Narratives (CANON) Paper
Hong Kong Mid-1990s Newspaper Corpus Paper
PolyU Corpus of Spoken Chinese
Mandarin
Academia Sinica
Academia Sinica Balanced Corpus of Modern Chinese
A Socio-phonetic Study of Spoken Taiwan Mandarin
Mandarin Topic-oriented Conversation Corpus
Beijing Language and Culture University
Tagged Corpus of People's Daily Paper1
Corpus of Active Written Samples for HSK
Beijing Foreign Studies University
Texts of Recent Chinese (TORCH) 2009
Chilin (Hong Kong) Limited
Linguistic Variations in Chinese Speech Communities (LIVAC) Synchronous Corpus
Chinese Academy of Sciences
SCTB: A Chinese Treebank in Scientific Domain
The Chinese University of Hong Kong
The CUHK Discourse Treebank for Chinese Paper
Communication University of China
Mass Media Language Corpus of Audio and Video Materials
Management System of Broadcast Media Language Corpus
Resources of Neology Research
Datatang
Chinese Weibo Syntactic Treebank with 50k Sentences
Harbin Institute of Technology
Hong Kong Polytechnic University
Lancaster University
The Lancaster Corpus of Mandarin Chinese (LCMC) version 1 version 2
Ministry of Education
Corpus On-line:
National Chengchi University
The NCCU Corpus of Spoken Mandarin
National Taiwan University
Taiwan Corpus of Child Mandarin (TCCM)
Peking University
Center for Chinese Linguistics (CCL) Corpus
Diachronic Retrieval System of Lexical Items of Modern Chinese
The Peking University Multi-view Chinese Treebank
The Peking University Chinese Treebank
Tsinghua University
University of California, Los Angeles
T
University of Pennsylvania
Classical Chinese
Academia Sinica
Academia Sinica Tagged Corpus of Old Chinese
Academia Sinica Tagged Corpus of Middle Chinese
Academia Sinica Tagged Corpus of Early Mandarin Chinese
City University of Hong Kong
CityU Treebank of Classical Chinese Poems
A Dependency Treebank of Chinese Buddhist Texts Paper1 Paper 2
The Chinese University of Hong Kong
Chinese Ancient Texts (CHANT) Database Access via PolyU Library
A Database on the Chu Bamboo Manuscripts of Guodian 郭店楚簡資料庫
The Hong Kong
Ministry of Education
Corpus On-line: Corpus of Classical Chinese
Peking University
Center for Chinese Linguistics (CCL) Corpus
The University of Sheffield
Sheffield Corpus of Chinese for Diachronic Linguistics Study
University of Washington
Hokkien
Academia Sinica
Southern Min Archives: A Database of Historical Change and Language Distribution
Min and Hakka Language Archives
The Texts Database of Folk Songs in Southern Min Dialect preface
iCorpus Mandarin Taiwanese Bilingual Corpus Online System (Subscription Required)
National Chengchi University
The NCCU Corpus of Spoken Southern Min
National Chung Cheng University
A Spoken Corpus of Taiwan Southern Min
Taiwanese Child Language Corpus (TAICORP)
National Museum of Taiwan Literature
Digital Archive Database for Written Taiwanese (DADWT)
National Taichung University of Education
Memory of the Written Taiwanese
Other Sinitic Varieties
The NCCU Corpus of Spoken Hakka
Association for Conservation of Hong Kong Indigenous Languages 香港本土語言保育協會
Spoken English in Hong Kong
Hong Kong Corpus of Spoken English
Hong Kong Corpus of Surveying and Construction Engineering
Hong Kong Engineering Corpus
Hong Kong Financial Services Corpus
Hong Kong Budget Speeches Corpus 1997-2010
Hong Kong Policy Address Speeches Corpus 1997-2009
Other Parallel Corpora
Multilingual and Multimodal English-Mandarin-Cantonese-Japanese Parallel Corpus (Subscription Required)
Links to Cantonese References
On-line Dictionaries
CantoDict 2003
現代標準漢語與粵語對照資料庫
Dictionary of Cantonese Slang
On-line Pronunciation Dictionaries
粵音資料集叢 2014
S. L. Wong's A Chinese Syllabary Pronounced according to the Dialect of Canton 黃錫凌《粵語韻彙》電子版 1996
粵語發音詞典
CKC Online Chinese Dictionary 縱橫在線中文字典
Lexical Items with English Explanations for Fundamental Chinese Learning in Hong Kong Schools 中英對照香港學校中文學習基礎字詞
Association for Conservation of Hong Kong Indigenous Languages Pronouncing Dictionary 香港本土語言保育協會發音字典
Printed Dictionaries
虞學圃、溫岐石 1782:《江湖尺牘分韻撮要合集》。澳大利亞國家圖書館館藏 韻典網
Morrison, Robert. 1828. Vocabulary of the Canton Dialect 廣東省土話字彙. Macao: East India Company's Press. Reprint
Williams, Samuel Wells. 1856. A Tonic Dictionary of the Chinese Language in the Canton Dialect 英華分韻撮要. Canton: Office of the Chinese Repository. Reprint
Chalmers, John. 1859. An English and Cantonese Pocket-Dictionary 英粵字典. Hong Kong: The London Missionary Society's Press. 1862 1870
Lobscheid, W. 1866−1868. English and Chinese Dictionary 增訂英華字典. Hong Kong: The Daily Press Office. A-C D-H I-Q
Eitel, E. G. 1877. A Chinese-English Dictionary in the Cantonese Dialect. London: Trubner & Co. Reprint
Ball, James Dyer. 1886. The Cantonese Made Easy Vocabulary 廣東話入門辭彙表. Hongkong: Kelly & Walsh, Ld. 1892 1908
Chalmers, John. 1891. An English and Cantonese Dictionary 英粵字典, 6th ed. Hong Kong: Kelly & Walsh Limited. 1907
Aubazac, Louis. 1912. Dictionnaire Cantonnais-Francais 粵法字典. Hong Kong: Imprimerie de la Société des Missions-Étrangères.
Cowles, Roy T. 1914. A Pocket Dictionary of Cantonese: Cantonese-English with English-Cantonese Index 廣州話袖珍字典. Hong Kong: Hong Kong University Press. 1990 1999
孔仲南 1933:《廣東俗語攷》 。廣州:南方扶輪社。Google Drive
Meyer, Bernard F. & Wempe, Theodore F. 1934. The Student's Cantonese-English Dictionary. Hong Kong, St. Louis Industrial School Printing Press.
黃錫凌 1941:《粵音韻彙 :廣州標準音之研究》 。上海:中華書局。 1970 1991 1998
Chiang, Ker-Ch'iu 蔣克秋. 1956. A Practical English-Cantonese Dictionary 實用英粵詞典. Singapore: Chin Fen Book Store.
馮思禹 1962:《廣州音字彙》。香港:世界書局。
Cowles, Roy T. 1965. The Cantonese Speaker's Dictionary. Hong Kong: Hong Kong University Press.
Huang, Po-Fei Parker. 1970. Cantonese Dictionary. New Haven and London: Yale University Press.
Lau, Sidney 劉錫祥. 1977. A Practical Cantonese-English Dictionary. Hong Kong: The Government Printer.
饒秉才、 歐陽覺亞、周無忌 1981:《廣州話方言詞典》。香港:商務印書館。 2013修訂版
曾子凡 1982:《廣州話.普通話口語詞對譯手冊》。香港:三聯書店。 1991 1998 2001 2002 2014
饒秉才 1985:《廣州話.普通話雙音對照漢語字典》。香港:三聯書店。
張日昇、詹伯慧、甘于恩 主編 1987:《珠江三角洲方言字音對照》。香港:新世紀出版社。
張日昇、詹伯慧、甘于恩 主編 1988:《珠江三角洲方言詞彙對照》。香港:新世紀出版社。
張勵妍、張賽洋 1987:《國音粵音索音字彙》。香港:中華書局。 1997
饒秉才、周無忌 1988:《廣州話標準音字彙》。香港:商務印書館。
關傑才 1990:《英譯廣東口語詞典》。香港:商務印書館。 2010
香港敎育署語文敎育學院中文系 1990:《常用字廣州話讀音表》。
吳開斌 1991:《簡明香港方言詞典》。廣州:花城出版社。
陳慧英 1994:《實用廣州話詞典》。上海:漢語大詞典出版社。
蘇翰翀 1994:《實用廣州音字典》。廣州: 中山大學出版社。
張日昇、詹伯慧、甘于恩 1994:《粵北十縣市粵方言調查報告》。廣州:暨南大學出版社。
北京大學中國語言文學系語言學敎研室 1995:《漢語方言詞彙》,第二版。北京:語文出版社。
Hung, Betty. 1996. Phrases in Cantonese 非常廣東話. Hong Kong: Greenwood Press.
Lo, Wood Wai & Tam, Fee Yin. 1996. Interesting Cantonese Colloquial Expressions. Hong Kong: The Chinese University Press.
饒秉才、 歐陽覺亞、周無忌 1997:《廣州話詞典》 。廣州:廣東人民出版社。
鄭定歐 1997:《香港粵語詞典》。南京:江蘇敎育出版社。
麥耘、譚步云 1997:《實用廣州話分類詞典》。廣州:廣東人民出版社。 2011
朱永楷 1997:《香港話普通話對照詞典》。北京:漢語大詞典出版社。
余秉昭 1997:《同音字彙》,修訂版。香港:新亞洲文化基金會有限公司。
魏偉新 1997:《粵講俗語諺語歇後語詞典》。廣州:廣州出版社。
白宛如 1998:《廣州方言詞典》。南京:江蘇敎育出版社,《現代漢語方言大詞典.分卷》本。
詹伯慧、張日昇 1998:《粵西十縣市粵方言調查報告》。廣州:暨南大學出版社。
張勵妍、倪列懷 1999:《港式廣州話詞典》。香港:萬里書店。
許寶華、宮田一郎 主編 1999:《漢語方言大詞典》。北京:中華書局。
何文匯、朱國藩 1999:《粵音正讀字彙》。香港:香港敎育圖書公司。
楊明新 1999:《簡明粵英詞典》。廣州:廣東高等教育出版社。
New Asia - Yale-in-China Chinese Language Center, CUHK. 1991. English-Cantonese Dictionary 英粵字典. Hong Kong: The Chinese University Press.
詹伯慧 2002:《廣州話正音字典》。廣州:廣東人民出版社。
詹伯慧 2002:《廣東粵方言概要》。廣州:暨南大學出版社。
饒秉才 2002:《廣州音字典(普通話對照)》。香港:三聯書店。 2004修訂版
So, Siu-hing Simon 蘇紹興. 2002. A Glossary of Common Cantonese Colloquial Expressions 英譯廣州話常用口語詞彙. Hong Kong: The Chinese University Press.
Lee, Yungkin Philip. 2003. Pocket Cantonese Dictionary. Hong Kong: Periplus Editions Limited.
曾子凡 、溫素華 2003:《廣州話.普通話 速查字典》。香港:世界圖書出版有公司。
歐陽覺亞、饒秉才、周耀文 2005:《廣州話、客家話、潮汕話與普通話對照詞典》。廣州:廣東人民出版社。
Hutton, Christopher & Bolton, Kingsley. 2005. A Dictionary of Cantonese Slang: The Language of Hong Kong Movies, Street Gangs, and City Life. London: Hurst & Company.
Lo, Tam Fee-yin. 2006. Cantonese Colloquial Expressions 廣州話口語詞彙. Hong Kong: The Chinese University Press.
湯志祥 2006:《廣州話•普通話•上海話 6000 常用詞對照手冊》。香港:中華書局。
劉扳盛 2008:《廣州話普通話詞典》。香港:商務印書館。
歐陽覺亞、周無忌、饒秉才 2010:《廣州話俗語詞典》。廣州:廣東人民出版社。
Chiu, Aman 2010:《香港常用俗語小辭典》。香港:青春文化。
Yang, N. 2011. English-Cantonese & Cantonese-English One-to-One Dictionary. Star Foreign Language Books.
Editors of Hippocrene Books (Ed.). 2012. Cantonese-English/English-Cantonese Dictionary & Phrasebook. Hippocrene Books.
Numlake. U. P. 2013. English Chinese Cantonese Dictionary. TraffordSG.
Hippocrene Books (Ed.). 2014. Cantonese-English/English-Cantonese Practical Dictionary. Hippocrene Books.
Kataoka, Shin 片岡新 & Lee, Cream Yin-Ping 李燕萍. 2014. Putonghua-Cantonese-English Converter. Hong Kong: Greenwood Press.
曾焯文 2016:《粵辭正典:健康篇》。香港:文化現場。
Studies in Vocabulary Items and Idioms
詹憲慈 1924:《廣州語本字》,手稿。 1995 Wikipedia
喬硯農 1966:《廣州話口語詞的研究》。香港:華僑語文出版社。
石 人 1983:《廣東話趣譚》。香港: 博益。
石 人 1984:《廣東話再譚》。香港: 博益。
宋郁文 1985:《俗語拾趣》。香港:博益。
丘學強 1989:《妙語方言》。香港:中華書局。
阿 丁 1989:《趣怪香港話》。香港:香港周刊。
吳 昊 1990:《懷舊香港話》。 香港:創藝文化企業有限公司。
陳均潤 1991:《港人自講》。
文若稚 1992:《廣州方言古語選釋》。澳門:澳門日報出版社。合訂本
文若稚 1993:《廣州方言古語選釋(續編)》。澳門:澳門日報出版社。合訂本
黃 氏 1993:《粵語古趣談》。香港:文星圖書有限公司。
楊子靜 1993:《粵語鈎沉──廣州方言俗語攷》。廣州:廣東高等出版社。
彭志銘 1994:《次文化語言:香港新方言概論》。 香港:次文化堂。
吳 昊 1994:《俗文化語言》。 香港:次文化堂。上冊 下冊
曾子凡 1995:《廣州話.普通話語詞對比研究》 ,修訂本。香港:香港普通話研習社。
莊澤義 1995:《省港民間俗語》。香港:海峰。
黃 氏 1997:《粵語古趣談續編》。香港:文星圖書有限公司。
陳伯煇 1998:《論粵方言詞本字考釋》。香港:中華書局。
陳伯煇 1998:《生活粵語本字趣談》。香港:中華書局。
石人(梁小中) 1999:《趣談廣東話》。香港:一本堂。
魯 金 1999:《香江舊語》。香港: 次文化堂。
陳渭泉 2000:《歇後語趣談》。澳門:凌智廣告公司。
陳渭泉 2001:《拙中求趣》。澳門: 凌智廣告公司。
容 若 2001:《粵語國語好雙語》。 香港:次文化堂。
陳渭泉 2002:《笑談歇後語》。澳門:凌智廣告公司。
陳小雄 2005:《地道廣州話用語》。廣州:羊城晚報出版社。
梁仲森 2005:《當代香港粵語語助詞的硏究》。香港:香港城市大學語言資訊科學硏究中心。
潘永強 2005:《擔天望地──廣府俗語探奇》。香港:中華書局。
饒原生 2006:《粵語口頭禪》。廣州:廣東敎育出版社。
陳雄根、何杏楓、張錦少 2006:《追本窮源:粵語詞彙趣談》。 香港:三聯書店。
吳 昊 2006:《港式廣府話研究Ⅰ》。香港:次文化堂。
彭志銘 2006:《正字正確》。香港:次文化堂。
盧活為 2006:《香港話一知半解》。香港:獲益出版事業有限公司。
余一詠 2007:《粵口語與說文》。香港:自資出版。
饒原生 2007:《港粵口頭禪趣解》。香港:洪波出版社。
彭志銘 2007:《正字審查》。香港:次文化堂。
彭志銘 2007:《小狗懶擦鞋》。香港:次文化堂。
彭志銘 2008:《香港潮語話齋》。香港:次文化堂。
曾子凡 2008:《粵語慣用語研究》。香港:香港城市大學出版社。
朱 薰 2008:《朱Fun E潮語大敎訓》。香港:萬里書店。
蘇眞眞 2008:《香港潮語 學習字卡》。香港:Kubrick。
蘇眞眞 2009:《香港潮語 學習字卡【貳】》。香港:Kubrick。
彭志銘 2009:《旺角詞話》。 香港:次文化堂。
彭志銘 2009:《廣東俗語正字考》。 香港:次文化堂。
彭志銘 2010:《次文化語言:香港新方言概論》。 香港:次文化堂。
潘永強 2010:《粵語俗話(一)動作篇》。香港:中華書局。
潘永強 2010:《粵語俗話(二)人物•事物篇》。香港:中華書局。
梁慧敏 2011:《潮語解密》。香港:萬里機構。
彭志銘 2011:《香港黑詞典》。 香港:次文化堂。
黃 氏 2012:《粵語古趣談三編》。香港:金石圖書貿易有限公司。
蘇萬興 2014:《講開有段古:老餅潮語》。香港:中華書局。
彭志銘 2014:《香港粵語頂硬上》。 香港:次文化堂。
陳 雲 2015:《廣東雅言》。 香港:次文化堂。
彭志銘 2015:《老師怕問字》。 香港:次文化堂。
彭志銘 2016:《粵港歇後語鈎沉》。 香港:次文化堂。
Links to Tools for Developing Chinese Corpus
On-line Tools
A Chinese Word Segmentation System with Unknown Word Extraction and POS Tagging 中文斷詞系統
CKIP Chinese Parser demo version 中文剖析器線上測試
Chinese Knowledge and Information Processing
Off-line Tools
The Stanford Log-linear Part-of-Speech Tagger
The Stanford Parser: A statistical parser
The Stanford Named Entity Recognizer
結巴中文分詞程式
Natural Language Processing with Python
Overview on Tools for Natural Language Processing in Chinese
中文處理工具簡介
Chinese Natural Language Processing and Speech Processing
Foong Ha YAP, Principal Investigator
PolyU Corpus of Spoken Chinese
9th February, 2013
This corpus was developed by the Stance Project
research team of the Department of English, Hong Kong Polytechnic University.
All proprietary rights reside with said University. This corpus is protected by
copyright laws and international copyright treaties, as well as other
intellectual property laws and treaties. No part of this corpus shall be
reproduced or adapted without prior written permission approved by the Hong
Kong Polytechnic University.
Last Updated: 23 March 2018 11:02 PM
Copyright © 2013 Department of English, Hong Kong
Polytechnic University. All rights reserved.