String kernels are popular tools for analyzing protein sequence data, and they have been successfully applied to many computational biology problems. Traditional string kernels assume that different substrings are independent. However, substrings can be highly correlated because of their substructure relationships or shared physico-chemical properties. This paper proposes two kinds of weighted spectrum kernels: the correlation spectrum kernel and the AA spectrum kernel. We evaluate their performance by predicting the glycan-binding proteins of 12 glycans. The results show that the correlation spectrum kernel and the AA spectrum kernel perform significantly better than the spectrum kernel for nearly all of the 12 glycans. By comparing the predictive power of AA spectrum kernels constructed from different physico-chemical properties, we also identify the physico-chemical properties that contribute the most to glycan-protein binding. The results indicate that the physico-chemical properties of amino acids in proteins play an important role in the mechanism of glycan-protein binding.
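To make the independence assumption concrete, the following minimal sketch contrasts the standard k-spectrum kernel, which only matches identical substrings, with a generic weighted form in which every pair of substrings contributes according to a similarity weight. This is an illustration only: the abstract does not give the definitions of the correlation spectrum kernel or the AA spectrum kernel, so the function names, the `similarity` parameter, and the Hamming-style weight below are all assumptions chosen for the example, not the paper's method.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous length-k substring (k-mer) of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x, y, k):
    """Standard k-spectrum kernel: inner product of the k-mer count vectors.

    Only identical k-mers contribute, i.e. distinct substrings are
    treated as independent features.
    """
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    return sum(n * cy[s] for s, n in cx.items())


def weighted_spectrum_kernel(x, y, k, similarity):
    """Generic weighted variant (an assumption, not the paper's definition).

    Every pair of k-mers (s, t) contributes count_x(s) * count_y(t),
    scaled by similarity(s, t), so related substrings can also match.
    """
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    return sum(
        nx * ny * similarity(s, t)
        for s, nx in cx.items()
        for t, ny in cy.items()
    )


def hamming_similarity(s, t):
    """Hypothetical weight: fraction of positions at which s and t agree."""
    return sum(a == b for a, b in zip(s, t)) / len(s)


if __name__ == "__main__":
    x, y = "AAB", "AAC"
    print(spectrum_kernel(x, y, 2))
    print(weighted_spectrum_kernel(x, y, 2, hamming_similarity))
```

With an identity weight (1 when the two k-mers are equal, 0 otherwise), the weighted form reduces to the standard spectrum kernel. A weight derived from amino-acid physico-chemical properties would be one way to express the kind of correlation the abstract describes, but the specific weighting used in the paper is not stated here.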