Chinese Prosodic Phrase Prediction Based on Syntax Tree Height
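  • Journal: Computer Engineering and Applications (计算机工程与应用)
  • Date: 0
  • Pages: 139-143+167
  • Language: Chinese
  • Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
  • Author affiliations: [1] College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou 730070; [2] Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong 518005; [3] Department of Computer Science, Tsinghua University, Beijing 100084
  • Funding: National Natural Science Foundation of China, General Program (No.60875015); Key Project of Scientific Research, Ministry of Education (No.208146)
  • Related project: Joint modeling of semantics and expressiveness in Chinese text-to-speech conversion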
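Abstract (translated from Chinese):

In a text-to-speech system, predicting an accurate prosodic structure from text plays an important role in improving the naturalness of synthesized speech. Using a text corpus of 10,000 sentences annotated with part-of-speech tags, prosodic words and prosodic phrases were labeled manually under the guidance of linguistic experts. The 500 sentences with the highest annotation consistency were selected and annotated with hierarchical syntactic structure, and syntax tree height was used to describe how tightly adjacent lexical words are connected. Analysis of the relationship between prosodic phrase boundaries and syntactic structure showed that prosodic phrase boundaries are affected by syntax tree height, the part of speech of lexical words, and the length of lexical words. These three features were therefore selected, and a prediction model was trained with the TBL algorithm on 400 training sentences. Results on the test set show that, with a small training corpus, the proposed method achieves a precision of 75.2%, a recall of 77.1%, and an F-Score of 76.1% for prosodic phrase prediction.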
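The abstract does not spell out how syntax tree height is computed, so the following is only a minimal sketch under one plausible reading: for each pair of adjacent words, take the height of the smallest subtree that contains both. The nested-tuple tree encoding, the function names `height` and `boundary_heights`, and the placeholder words `w1` to `w5` are all assumptions made for illustration, not the paper's definitions.

```python
def height(node):
    """Height of a subtree: 0 for a word (leaf), else 1 + the tallest child."""
    if isinstance(node, str):
        return 0
    return 1 + max(height(child) for child in node)


def boundary_heights(node):
    """Return one value per adjacent word pair, in sentence order.

    Each value is the height of the smallest subtree containing both
    words (an assumed reading of "syntax tree height"). A larger value
    means the two words are joined higher up in the tree, i.e. they are
    less tightly connected.
    """
    if isinstance(node, str):
        return []
    result = []
    h = height(node)
    for i, child in enumerate(node):
        if i > 0:
            # The last word of the previous child and the first word of
            # this child meet at the current node.
            result.append(h)
        result.extend(boundary_heights(child))
    return result


# Hypothetical five-word sentence with placeholder words.
tree = (("w1", "w2"), ("w3", ("w4", "w5")))
print(boundary_heights(tree))  # [1, 3, 2, 1]
```

Under this reading, the boundary between `w2` and `w3` has the largest value, which would make it the most likely candidate for a prosodic phrase boundary if height were the only feature.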

English abstract:

Predicting precise prosodic structure is one of the most important aspects of improving the naturalness of synthesized speech in text-to-speech synthesis. A corpus of 10,000 sentences with part-of-speech tags is manually labeled with prosodic words and prosodic phrases under the guidance of linguistic experts. A new feature based on the height of the syntax tree is proposed to describe the degree of closeness between adjacent lexical words according to the hierarchical syntactic structure. Analysis of the relationship between syntax tree height and prosodic phrase boundaries shows that the height of the syntax tree, part of speech, and the length of lexical words are the three most important features for prosodic phrase boundary prediction. Therefore, these three features are employed to train a TBL-based model with 400 training sentences. Experiments demonstrate that the approach achieves a precision of 75.2%, a recall of 77.1%, and an F-Score of 76.1%.
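As a quick consistency check on the reported figures, the sketch below recomputes the F-Score from the stated precision and recall. It assumes the F-Score is the balanced harmonic mean (F1), which the abstract does not state explicitly; the function name `f_score` is illustrative.

```python
def f_score(precision, recall):
    """Balanced F-Score (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f = f_score(0.752, 0.771)
print(f"{f:.3f}")  # 0.761
```

Under that assumption, the result agrees with the reported 76.1%.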
