A Spoken Language Understanding Method Based on Two-Stage Classification

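ISSN: 1000-1239
Journal: 《计算机研究与发展》 (Journal of Computer Research and Development)
Date: 0
Classification: TP391.2 [Automation and Computer Technology - Computer Application Technology; Automation and Computer Technology - Computer Science and Technology]
Author affiliation: [1] Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240
Funding: National Natural Science Foundation of China (60496326); National High Technology Research and Development Program of China ("863" Program) (2001AA114210-11)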

Authors: Wu Weilin (吴尉林)[1], Lu Ruzhan (陆汝占)[1], Duan Jianyong (段建勇)[1], Liu Hui (刘慧)[1], Gao Feng (高峰)[1], Chen Yuquan (陈玉泉)[1]

Keywords: spoken dialogue system, spoken language understanding, statistical classifier, topic classification, decision list

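Chinese abstract (translated):

Spoken language understanding is one of the key technologies for building spoken dialogue systems. It faces two main challenges: 1) robustness, because input utterances are often ill-formed; 2) portability, meaning that the spoken language understanding component should be quickly portable to new domains and languages. A new spoken language understanding method based on two-stage classification is proposed. The first stage is topic classification, which identifies the topic of the user's input utterance. The second stage is topic-dependent semantic slot classification, which extracts the corresponding semantic slot/value pairs according to the identified topic. The method can understand the user's input utterance in depth while remaining robust. It is largely data-driven and its training data are relatively easy to annotate, so it can be readily ported to new domains and languages. Experiments were conducted in the Chinese transportation inquiry domain and the English DARPA Communicator domain, and the results demonstrate the effectiveness of the method.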

English abstract:

Spoken language understanding (SLU) is one of the key components in a spoken dialogue system. One challenge for SLU is robustness, since the speech recognizer inevitably makes errors and spoken language is plagued with a large set of spontaneous speech phenomena. Another challenge is portability. Traditionally, rule-based SLU approaches require linguistic experts to handcraft a domain-specific grammar for parsing, which is time-consuming and laborious. A new SLU approach based on two-stage classification is proposed. First, a topic classifier is used to identify the topic of an input utterance. Then, restricted to the recognized topic, the semantic slot classifiers trained for that topic extract the corresponding slot-value pairs. The advantage of the proposed approach is that it is mainly data-driven and requires only a minimally annotated corpus for training, while retaining robustness and depth of understanding for spoken language. Experiments have been conducted in the Chinese public transportation information inquiry domain and the English DARPA Communicator domain. The good performance demonstrates the viability of the proposed approach.
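
The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the classifiers, features, or labels, so everything below is an assumption made for illustration: a naive Bayes bag-of-words topic classifier for stage 1, a per-topic decision list keyed on the preceding word for stage 2, and the hypothetical names `ShowRoute`, `ShowFare`, `GAZETTEER`, and the toy training sentences. It is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTopicClassifier:
    """Stage 1 (assumed): multinomial naive Bayes over bag-of-words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # topic -> word frequencies
        self.topic_counts = Counter()            # topic -> number of utterances
        self.vocab = set()

    def train(self, examples):
        for tokens, topic in examples:
            self.topic_counts[topic] += 1
            self.word_counts[topic].update(tokens)
            self.vocab.update(tokens)

    def predict(self, tokens):
        total = sum(self.topic_counts.values())
        best_topic, best_score = None, float("-inf")
        for topic, n in self.topic_counts.items():
            # Add-one smoothing so unseen words do not zero out a topic.
            denom = sum(self.word_counts[topic].values()) + len(self.vocab)
            score = math.log(n / total)
            for tok in tokens:
                score += math.log((self.word_counts[topic][tok] + 1) / denom)
            if score > best_score:
                best_topic, best_score = topic, score
        return best_topic


class DecisionListSlotClassifier:
    """Stage 2 (assumed): ordered rules; the first matching cue word wins."""

    def __init__(self, rules):
        self.rules = rules  # list of (preceding_word, slot_name)

    def classify(self, prev_word):
        for cue, slot in self.rules:
            if prev_word == cue:
                return slot
        return None  # no rule fired: leave the candidate unlabeled


def understand(tokens, topic_clf, slot_clfs, gazetteer):
    """Run stage 1, then only the slot classifier of the recognized topic."""
    topic = topic_clf.predict(tokens)
    slot_clf = slot_clfs[topic]
    slots = {}
    for i, tok in enumerate(tokens):
        if tok in gazetteer:  # candidate slot value
            prev = tokens[i - 1] if i > 0 else "<s>"
            slot = slot_clf.classify(prev)
            if slot is not None:
                slots[slot] = tok
    return topic, slots


# Hypothetical toy data in a transportation-inquiry style domain.
TRAIN = [
    ("how do i get from xujiahui to bund".split(), "ShowRoute"),
    ("which route goes from airport to station".split(), "ShowRoute"),
    ("how much is the fare to bund".split(), "ShowFare"),
    ("what is the ticket fare from airport".split(), "ShowFare"),
]
GAZETTEER = {"xujiahui", "bund", "airport", "station"}
SLOT_CLFS = {
    "ShowRoute": DecisionListSlotClassifier(
        [("from", "origin"), ("to", "destination")]),
    "ShowFare": DecisionListSlotClassifier(
        [("from", "fare_origin"), ("to", "fare_destination")]),
}

topic_clf = NaiveBayesTopicClassifier()
topic_clf.train(TRAIN)

print(understand("how do i get from airport to bund".split(),
                 topic_clf, SLOT_CLFS, GAZETTEER))
```

In this sketch the same cue word ("from", "to") maps to different slot names depending on the topic, which is the point of making the second stage topic-dependent. Porting to a new domain would mean supplying new topic-labeled utterances and slot rules rather than a handwritten grammar.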