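As a complex adaptive system, language has its own laws of evolution and development, and quantitative methods can be used to probe its systemic properties. Working within the framework of dependency grammar and drawing on a large-scale dependency treebank, this paper constructs quantitative measures of hierarchical structure, namely node count, number of hierarchical levels, hierarchical distance, mean hierarchical distance, and English mean hierarchical distance, and uses them to quantify the hierarchical structure of English sentences. The results show that the node count, the number of levels, and the mean hierarchical distance of English sentences are significantly and positively correlated with one another; that the distribution of the number of levels is positively skewed, indicating that English sentences tend toward flat hierarchical structures, with the skew becoming more pronounced as the node count grows; and that the threshold of the English mean hierarchical distance is 4.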
As a complex adaptive system, language follows certain laws in the course of its evolution and development, and these laws can be investigated through quantitative approaches. We use a large English dependency treebank to analyze the hierarchical structure of English sentences. Several measures are introduced to evaluate this structure: vertex number (VN), hierarchical number (HN), hierarchical distance (HD), mean hierarchical distance (MHD), and English mean hierarchical distance (MHD2). We observe significant positive correlations among VN, HN, and MHD. The distribution of HN is positively skewed, which indicates a preference for flatter structures in English sentences, and the skewness coefficient increases with the number of vertices. Furthermore, we propose that English MHD2 stays below a threshold of 4.
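To make the sentence-level measures concrete, the following is a minimal sketch rather than the authors' procedure. The abstract does not spell out the definitions, so the sketch assumes the following: HD is the number of dependency edges between a word and the root, MHD is the mean HD over non-root words, and HN counts the root layer as level 1. The function name `hierarchy_measures` and the head-index input format are hypothetical choices made for illustration.

```python
from statistics import mean


def hierarchy_measures(heads):
    """Compute VN, HN, HD and MHD for one dependency tree.

    heads[i] is the 1-based position of the head of word i+1;
    0 marks the root. This input format is an assumption.
    """
    def depth(word):
        # Walk up the head chain and count edges until the root is reached.
        steps = 0
        while heads[word - 1] != 0:
            word = heads[word - 1]
            steps += 1
        return steps

    hd = [depth(w) for w in range(1, len(heads) + 1)]
    non_root = [d for d in hd if d > 0]
    return {
        "VN": len(heads),        # number of vertices (words)
        "HN": max(hd) + 1,       # assumed: the root layer counts as level 1
        "HD": hd,                # per-word distance to the root
        "MHD": mean(non_root) if non_root else 0.0,  # assumed: root excluded
    }


# "The boy ate an apple": The -> boy, boy -> ate, ate = root,
# an -> apple, apple -> ate
print(hierarchy_measures([2, 3, 0, 5, 3]))
```

Under these assumptions the example sentence has five vertices on three levels, with an MHD of 1.5. A treebank-level figure such as MHD2 would then presumably aggregate these distances over all sentences, although the abstract does not state the exact aggregation.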