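To address the cumbersome cleaning process and the loss of information that affect current condition monitoring data for power transmission and transformation equipment, a data cleaning method based on stacked denoising autoencoders (SDAE) is proposed, exploiting the SDAE's ability to restore and interpret "dirty" data and to extract features of abnormal states. Data from normal operating conditions and from abnormal operating states are each used to train a stacked denoising autoencoder, yielding loss function vectors and producing a repair model for singular points and missing data and a denoising model for abnormal operating state data. Kernel density estimation is used to determine the upper limit of the training sample loss function and the tolerance time window; "dirty" data are then classified and processed according to how the reconstruction error of the test data and the duration of the abnormal data compare with that upper limit and that window. The method is applied to clean the total hydrocarbon content data from the oil chromatography of a transformer and the temperature data of a conductor. The results show that it effectively identifies singular points, missing information, and abnormal operating state data, and repairs and reconstructs singular points and missing values. While the equipment is operating abnormally, it effectively filters out interfering data.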
Cleaning the condition monitoring data of power transmission and transformation equipment currently suffers from several problems, such as loss of information, inaccuracy, and a complicated procedure. A data cleaning method based on stacked denoising autoencoders is proposed, which exploits their ability to reconstruct noisy information and to extract features of abnormal status. Stacked denoising autoencoders are trained on monitoring data of the equipment under normal and abnormal conditions to obtain the loss function vectors and to build a restoration model for outliers and missing data and a denoising model for abnormal status data. The upper limit of the loss function and the tolerance window are determined by kernel density estimation. By comparing the reconstruction error and the duration of abnormal data with the upper limit of the loss function and the tolerance window, the dirty data are classified and cleaned by the corresponding models. The method is tested on the total hydrocarbon concentration in oil chromatography data and on conductor temperature data. Analysis of these examples shows that the method automatically identifies different types of abnormal data. Singular points and missing data are repaired directly by reconstruction. Under abnormal working conditions, the denoising model extracts the effective information and eliminates the interference.
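The thresholding and classification step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the per-sample reconstruction errors have already been produced by a trained autoencoder, that the upper limit is taken as a high quantile (here 0.99) of a Gaussian kernel density estimate with a normal-reference rule-of-thumb bandwidth, and that runs of over-limit errors no longer than the tolerance window are treated as singular points or missing data while longer runs are treated as abnormal operating states. The function names, the quantile level, and the labels are hypothetical, and the tolerance window is passed in as a parameter rather than estimated.

```python
import math


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth h = 1.06 * std * n^(-1/5) (an assumed choice)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    h = 1.06 * std * n ** (-0.2)
    if h <= 0:
        raise ValueError("samples must not all be identical")
    return h


def kde_cdf(samples, bandwidth):
    """Return the CDF of a Gaussian kernel density estimate as a function of x."""
    n = len(samples)
    scale = bandwidth * math.sqrt(2.0)

    def cdf(x):
        # Each kernel contributes a normal CDF centred on one training sample.
        return sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples) / n

    return cdf


def loss_upper_limit(train_errors, confidence=0.99):
    """Find the error value whose estimated CDF equals `confidence` by bisection."""
    h = rule_of_thumb_bandwidth(train_errors)
    cdf = kde_cdf(train_errors, h)
    lo = min(train_errors) - 10.0 * h  # CDF is practically 0 here
    hi = max(train_errors) + 10.0 * h  # CDF is practically 1 here
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < confidence:
            lo = mid
        else:
            hi = mid
    return hi


def classify(errors, limit, window):
    """Label each point by the length of the over-limit run it belongs to."""
    labels = ["normal"] * len(errors)
    i = 0
    while i < len(errors):
        if errors[i] <= limit:
            i += 1
            continue
        j = i
        while j < len(errors) and errors[j] > limit:
            j += 1
        # Short runs: singular points / missing data; long runs: abnormal state.
        label = "singular" if (j - i) <= window else "abnormal"
        for k in range(i, j):
            labels[k] = label
        i = j
    return labels


if __name__ == "__main__":
    # Made-up reconstruction errors, for illustration only.
    train = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08, 0.10, 0.11, 0.09]
    limit = loss_upper_limit(train)
    test = [0.10, 0.50, 0.10, 0.10, 0.60, 0.70, 0.80, 0.90, 0.10]
    print(classify(test, limit, window=2))
```

In this sketch, points labelled `singular` would be handed to the restoration model for reconstruction, and points labelled `abnormal` to the denoising model trained on abnormal-state data.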