Pinyin-to-character conversion is an important research direction in Chinese information processing, with wide application in speech recognition and Chinese pinyin input methods. This paper analyzes the segmentation ambiguity of the pinyin stream in pinyin-to-character conversion and finds that the traditional hierarchical hidden Markov decoding model has shortcomings in handling this problem. It proposes using language model knowledge to assist pinyin stream segmentation, thereby improving the existing hierarchical model. Experiments show that, compared with the traditional hierarchical model, the proposed method improves first-character accuracy by about 3%.
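The segmentation ambiguity mentioned above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the syllable inventory, the function names `segmentations` and `pick_segmentation`, and the log-probability values are all hypothetical, chosen only to show how one unsegmented pinyin stream can admit several syllable sequences and how a language-model score might choose among them.

```python
from typing import Callable, List, Optional, Set

# Toy syllable inventory (an assumption for illustration; a real system
# would use the full set of valid Mandarin pinyin syllables).
SYLLABLES: Set[str] = {"xi", "an", "xian", "fan", "fang", "gan"}


def segmentations(stream: str, syllables: Set[str]) -> List[List[str]]:
    """Return every way to split `stream` into syllables from `syllables`."""
    if not stream:
        return [[]]  # one way to segment the empty string: no syllables
    results: List[List[str]] = []
    for end in range(1, len(stream) + 1):
        head = stream[:end]
        if head in syllables:
            for tail in segmentations(stream[end:], syllables):
                results.append([head] + tail)
    return results


def pick_segmentation(
    stream: str,
    syllables: Set[str],
    lm_score: Callable[[List[str]], float],
) -> Optional[List[str]]:
    """Choose the candidate segmentation with the highest score.

    `lm_score` stands in for a language model; here it is any callable
    that maps a syllable sequence to a score (higher is better).
    """
    candidates = segmentations(stream, syllables)
    return max(candidates, key=lm_score) if candidates else None


# Made-up log-probabilities, purely illustrative (not measured values).
TOY_LOGPROB = {("fang", "an"): -2.0, ("fan", "gan"): -3.5}


def toy_lm_score(segmentation: List[str]) -> float:
    """Look up a toy score; unseen sequences get negative infinity."""
    return TOY_LOGPROB.get(tuple(segmentation), float("-inf"))


if __name__ == "__main__":
    print(segmentations("xian", SYLLABLES))    # "xi an" vs. "xian"
    print(segmentations("fangan", SYLLABLES))  # "fan gan" vs. "fang an"
    print(pick_segmentation("fangan", SYLLABLES, toy_lm_score))
```

In this sketch the stream "fangan" yields two candidates, "fan gan" and "fang an", and the toy scores favor "fang an". The exhaustive recursion is used only for clarity; a practical decoder would more likely use dynamic programming over a lattice.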