面向自然语言处理的词汇语义研究应该以词汇的计量研究为基础。该文在评述汉语词汇计量研究的主要成果以后,提出一个汉语常用词知识库的建设任务,并给出常用词表的构造性定义、词表常用性的定量评价方法以及"部件词"的概念,最后介绍现代汉语常用词知识库的总体设计和已经做的工作。期望常用词知识库的建设能为汉语词汇语义学研究、为中文信息处理事业的发展做出贡献。
Natural language processing oriented lexical semantics researches should be based on quantitative study of the lexicon. After a brief suvey on the main achievements of the quantitative Chinese lexicon, this paper proposes a project to build a knowledge base of commonly used words, for which we describe 1) a constructive definition of commonly used words list, 2) a quantitative method to measure the coverage of a given word list over an annotated corpus, and 3) the concept of "component word". We also introduce the overall designs of the knowledge base and the current progress of this project. It is expected that the construction of such a knowledge base can contribute to the Chinese lexical semantics researches and the development of Chinese information processing.