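We propose a method for reducing sequences of datasets based on information measures, and study how to sample from a sequence those datasets that are weakly correlated with one another without losing datasets that carry important physical features. The method is general. Applied to the output data of a simulation code for laser-plasma interaction, it reduces the correlation and information redundancy among datasets, and the average information content of a single dataset increases by about 30% compared with the original dataset sequence.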
We propose a data reduction approach based on information theory. It comprises sampling of datasets based on mutual entropy and truncation based on offline marginal utility. The approach is general and applies to multi-dimensional scientific dataset streams. To demonstrate its applicability, results obtained with plasma simulation data are presented. The approach reduces the correlation and redundancy between datasets.