Electric power companies usually segment their customers from power load data with the traditional K-Means algorithm, whose biggest drawback is that the user must manually specify the number of clusters. This paper proposes a load clustering method that combines the Canopy algorithm with the K-Means algorithm, so that customers are divided automatically without manually specifying the number of clusters. First, users' historical electricity consumption data are collected, and the raw data are preprocessed with the parallel computing framework MapReduce. Then, an automatic load clustering model is built with the Canopy and K-Means algorithms. Finally, an empirical analysis on real electricity consumption data, evaluated with the Silhouette index, shows that the proposed method is more stable and convenient and has wider applicability.
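To make the Canopy plus K-Means idea concrete, the following is a minimal single-machine sketch, not the implementation used in this work. It assumes a common formulation of Canopy with a loose threshold T1 and a tight threshold T2 (T1 > T2), uses the number of canopies as the cluster count and the canopy means as the initial K-Means centers, and scores the result with the mean Silhouette coefficient. The function names, the toy four-value load profiles, and the threshold values `t1=3.0` and `t2=2.0` are all hypothetical choices for illustration; the MapReduce preprocessing step is not shown.

```python
import math  # math.dist requires Python 3.8+


def mean_point(members):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(coord) / len(members) for coord in zip(*members))


def canopy_centers(points, t1, t2):
    """One Canopy pass; returns one center per canopy (assumes t1 > t2 > 0)."""
    if not t1 > t2 > 0:
        raise ValueError("expected t1 > t2 > 0")
    remaining = list(points)
    centers = []
    while remaining:
        seed = remaining.pop(0)
        # Points within the loose threshold t1 belong to this canopy.
        members = [seed] + [p for p in remaining if math.dist(seed, p) < t1]
        centers.append(mean_point(members))
        # Points within the tight threshold t2 can no longer seed a canopy.
        remaining = [p for p in remaining if math.dist(seed, p) >= t2]
    return centers


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def kmeans(points, centers, max_iter=100):
    """Lloyd's K-Means started from the given centers."""
    centers = list(centers)
    for _ in range(max_iter):
        labels = assign(points, centers)
        updated = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to become empty.
            updated.append(mean_point(members) if members else old)
        if updated == centers:
            break
        centers = updated
    return assign(points, centers), centers


def silhouette(points, labels):
    """Mean Silhouette coefficient; needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("Silhouette needs at least two clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: each tuple is a four-value daily load profile.
profiles = (
    [(1.0 + 0.1 * i, 1.0, 1.2, 1.0) for i in range(5)]    # flat, low usage
    + [(1.0, 5.0 + 0.1 * i, 5.2, 1.0) for i in range(5)]  # daytime peak
    + [(1.0, 1.0, 1.2, 6.0 + 0.1 * i) for i in range(5)]  # evening peak
)

initial_centers = canopy_centers(profiles, t1=3.0, t2=2.0)
labels, centers = kmeans(profiles, initial_centers)
score = silhouette(profiles, labels)
print("clusters found by Canopy:", len(initial_centers))
print("mean Silhouette:", round(score, 3))
```

In this sketch the number of clusters is never passed in; it falls out of the Canopy pass. The thresholds still have to be chosen, and the result depends on them, so in practice they would need to be tuned to the scale of the load data.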