To make the sampling of large-scale data flows both fair and feasible, an adaptive fair sampling algorithm for data flows is proposed. The sampling interval is adjusted adaptively according to the network load, and the data flow is sampled in segments to collect an initial sample. The initial sample is then resampled probabilistically, using a sampling function that is inversely proportional to the size of each data flow. By controlling both stages of the collection process, the algorithm samples data flows fairly and reasonably when resources are limited. Simulation results show that, compared with other sampling algorithms, the samples it selects are fairer and more accurate.
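The two stages described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything concrete here is an assumption made for illustration: the linear mapping from load to interval in `adaptive_interval`, the parameters `base_interval`, `max_interval`, `segment_len`, and `c`, the representation of packets as `(flow_id, payload)` tuples, and the use of a flow's packet count in the initial sample as its "size". It should not be read as the authors' implementation.

```python
import random
from collections import Counter


def adaptive_interval(load, base_interval=10, max_interval=100):
    """Map a load estimate in [0, 1] to a sampling interval.

    Assumed rule: the interval grows linearly with load, so a busier
    network is sampled more sparsely.
    """
    load = min(max(load, 0.0), 1.0)  # clamp to [0, 1]
    return round(base_interval + load * (max_interval - base_interval))


def stage_one(packets, loads, segment_len=1000):
    """Stage 1: segmented sampling with a per-segment adaptive interval.

    `loads[i]` is the (assumed) load estimate for the i-th segment;
    within a segment every `interval`-th packet is kept.
    """
    initial_sample = []
    for seg_idx, start in enumerate(range(0, len(packets), segment_len)):
        interval = adaptive_interval(loads[seg_idx])
        initial_sample.extend(packets[start:start + segment_len:interval])
    return initial_sample


def stage_two(initial_sample, c=1.0, rng=random):
    """Stage 2: keep each packet with probability min(1, c / flow_size).

    flow_size is the number of packets of that flow in the initial
    sample, so the keep probability is inversely proportional to it.
    """
    sizes = Counter(flow_id for flow_id, _ in initial_sample)
    return [
        pkt for pkt in initial_sample
        if rng.random() < min(1.0, c / sizes[pkt[0]])
    ]
```

Under these assumptions, a flow with `n >= c` packets in the initial sample keeps about `n * (c / n) = c` packets in expectation, regardless of `n`. That is one way an inverse-proportional sampling function can stop large flows from crowding out small ones.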