Articulatory information is not readily available in typical speaker-listener situations. To address this problem, this paper proposes a speech inversion method that estimates articulatory information from the acoustic signal. Distal supervised learning (DSL), one machine learning strategy for speech inversion, is studied, and its experimental background and theoretical foundation are analyzed. The paper proposes a global optimization approach for the inverse model in distal supervised learning and uses eight vocal tract variables as articulatory information to model speech dynamics. Estimation results are compared for two parameterizations of the speech signal: acoustic parameters (APs) and mel-frequency cepstral coefficients (MFCCs). The results show that distal supervised learning achieves good estimation performance for the tract variables.
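The general idea of distal supervised learning can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the paper's implementation. A forward model (articulatory to acoustic) is first fitted and then frozen, and the inverse model (acoustic to articulatory) is trained only through the error measured in the acoustic domain. The linear "vocal tract" `A_true`, the feature size of 13, the synthetic data, and the plain gradient descent loop are all hypothetical choices made for illustration. Real speech inversion is nonlinear, and the global optimization approach mentioned above is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
N_SAMPLES, N_TV, N_ACOUSTIC = 500, 8, 13  # 8 tract variables; 13 features is an assumed size

# Hypothetical stand-in for the vocal tract: a fixed, well-conditioned linear map.
Q, _ = np.linalg.qr(rng.standard_normal((N_ACOUSTIC, N_TV)))  # (13, 8), orthonormal columns
A_true = Q * np.linspace(0.5, 1.5, N_TV)                      # scale each column

X = rng.standard_normal((N_SAMPLES, N_TV))  # synthetic articulatory data (tract variables)
Y = X @ A_true.T                            # synthetic acoustic features

# Step 1: fit the forward model (articulatory -> acoustic) by least squares, then freeze it.
F, *_ = np.linalg.lstsq(X, Y, rcond=None)   # F has shape (8, 13), so Y ~= X @ F


def distal_loss(W):
    """Mean squared error in the acoustic domain for inverse model W."""
    residual = Y - (Y @ W) @ F
    return np.mean(np.sum(residual ** 2, axis=1))


# Step 2: train the inverse model W (acoustic -> articulatory) through the frozen
# forward model. W never sees the articulatory targets X directly.
W = np.zeros((N_ACOUSTIC, N_TV))
initial_loss = distal_loss(W)

# Step size 1/L, where L bounds the curvature of this quadratic loss.
lipschitz = 2.0 * np.linalg.norm(Y.T @ Y / N_SAMPLES, 2) * np.linalg.norm(F @ F.T, 2)
step = 1.0 / lipschitz

for _ in range(3000):
    residual = Y - (Y @ W) @ F                        # distal (acoustic) error
    grad = -2.0 / N_SAMPLES * Y.T @ residual @ F.T    # error propagated back through F
    W -= step * grad

final_loss = distal_loss(W)
X_hat = Y @ W                                         # estimated tract variables
tv_rmse = np.sqrt(np.mean((X_hat - X) ** 2))          # used only for evaluation
print(f"distal loss: {initial_loss:.3f} -> {final_loss:.3e}, TV RMSE: {tv_rmse:.3e}")
```

In this toy the forward map is one-to-one, so driving the acoustic error to zero also recovers the tract variables. In real speech the acoustic-to-articulatory mapping is not unique, so the inverse learned this way is one consistent solution rather than a guaranteed match to the true articulation.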