该文融合遍历论、粗粒化方法和信息论的观点研究数据流的非平稳性度量问题.引入了数据流的非平稳性度量的概念,给出了数据流非平稳性度量的有效的近似算法.数据流的非平稳性度量为0和1之间的实数,平稳性较好的数据流的非平稳性度量较小.作者将数据流的非平稳性度量应用到模型选择问题中,提出残差序列非平稳性度量最小化的模型选择标准.作者用数值试验检验了该文提出的数据流非平稳性度量的近似算法,并检验了其作为模型选择标准的能力.数值试验的结果表明,非平稳性度量是衡量数据流非平稳程度的一个合理指标,可以很好地区分趋势平稳数据和差分平稳数据,区分独立同分布序列、白噪声序列和鞅差序列.
We study the problem of measuring the nonstationarity of data streams by integrating ideas from ergodic theory, coarse graining, and information theory. We introduce the notion of a nonstationarity measure for data streams and give an effective approximation algorithm for computing it. The nonstationarity measure is a real number between 0 and 1, and it is smaller for more stationary data streams. We apply the measure to model selection and propose a criterion that selects the model whose residual sequence has the smallest nonstationarity measure. Numerical experiments test the approximation algorithm and its ability to serve as a model selection criterion. The results indicate that the nonstationarity measure is a reasonable index of the degree of nonstationarity of a data stream: it distinguishes trend-stationary data from difference-stationary data well, and it discriminates among i.i.d. sequences, white noise sequences, and martingale difference sequences.