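This paper proposes a novel layer-based approach to dependency parsing. The approach takes the dependency layer, in which dependency depth is at most 1, as its parsing unit and builds the dependency structure of a sentence bottom-up. Within a layer, the optimal substructure is found by exhaustive search; between layers, the parser state transitions deterministically. Introducing dependency layers gives the model lower algorithmic complexity than typical graph-based methods and, compared with transition-based methods, alleviates to some extent the greediness of the deterministic process. In addition, the approach uses a standard sequence labeling model to search for each layer's dependency substructure, demonstrating that sequence labeling techniques are fully capable of handling hierarchical structure analysis tasks such as syntactic parsing. Experimental results show that the proposed layer-based approach achieves accuracy comparable to mainstream methods and very high efficiency, parsing 2,500 English words per second on the Penn Treebank.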
A layer-based projective dependency parsing approach is presented. This novel approach works layer by layer in a bottom-up manner, with the depth of token dependencies inside each layer limited to one. Inside a layer, the dependency sub-graphs are searched exhaustively, while between layers the parser state transfers deterministically. Taking the dependency layer as the parsing unit, the proposed parser has a lower computational complexity than graph-based models, which search for a whole dependency graph, and it alleviates to some extent the error propagation of transition-based models. Furthermore, our parser adopts sequence labeling models to find the optimal sub-graph of each layer, which demonstrates that sequence labeling techniques are well suited to hierarchical structure analysis tasks. Experimental results indicate that the proposed approach offers competitive accuracy and, in particular, a very fast parsing speed of 2,500 words per second on the Penn Treebank.
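The bottom-up, layer-by-layer control flow described above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the names `parse_by_layers`, `Labeler`, and `oracle_labeler` are hypothetical, the per-token decision format (keep the token, or attach it to a head that stays in the layer) is an illustrative stand-in for the paper's actual label set, and a gold-tree oracle replaces the trained sequence labeling model. The sketch does not check projectivity and says nothing about the reported speed or accuracy.

```python
from typing import Callable, List, Optional

# Hypothetical interface: given the original indices of the tokens still
# unattached, return one decision per token. None keeps the token for the
# next layer; an integer is the original index of its head, which must
# itself be kept (dependency depth <= 1 inside a layer).
Labeler = Callable[[List[int]], List[Optional[int]]]


def parse_by_layers(n: int, label_layer: Labeler) -> List[int]:
    """Return heads[i] for tokens 0..n-1; the root keeps head -1."""
    heads = [-1] * n
    layer = list(range(n))
    while len(layer) > 1:
        decisions = label_layer(layer)
        kept = [tok for tok, d in zip(layer, decisions) if d is None]
        if len(kept) == len(layer):
            raise ValueError("labeler made no attachment; cannot progress")
        kept_set = set(kept)
        for tok, head in zip(layer, decisions):
            if head is not None:
                if head not in kept_set:
                    raise ValueError("head must stay in the layer (depth <= 1)")
                heads[tok] = head
        # Deterministic transition: only the kept tokens form the next layer.
        layer = kept
    return heads


def oracle_labeler(gold: List[int]) -> Labeler:
    """Stand-in for a trained model: attach every current leaf to its gold head."""
    def label(layer: List[int]) -> List[Optional[int]]:
        # Tokens that still have unattached dependents in this layer.
        pending_heads = {gold[tok] for tok in layer}
        return [
            None if tok in pending_heads or gold[tok] == -1 else gold[tok]
            for tok in layer
        ]
    return label


# "Economic news had little effect on financial markets", root = "had".
gold = [1, 2, -1, 4, 2, 4, 7, 5]
predicted = parse_by_layers(len(gold), oracle_labeler(gold))
print(predicted == gold)  # True: the oracle recovers the gold tree
```

With the oracle, each pass removes the current leaves of the tree, so the number of layers equals the height of the dependency tree (four passes for the example sentence). In a real system the oracle would be replaced by a sequence labeling model that scores the decisions for a whole layer jointly.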