Dongli Research Big Data Discovery System (DRDS)

Parallel Ensemble Classification Algorithm Based on MapReduce Technology

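ISSN: 1000-0801
Journal: Telecommunications Science (电信科学)
Date: 2012.7.7
Pages: 40-47
Classification: TP393.03 [Automation and Computer Technology - Computer Application Technology; Automation and Computer Technology - Computer Science and Technology]
Author affiliations: [1] School of Information, Zhejiang Gongshang University, Hangzhou 310018; [2] Center for Studies of Modern Business, Zhejiang Gongshang University, Hangzhou 310000
Funding: National Natural Science Foundation of China (No. 71071141, No. 71071140); Key Project of the Zhejiang Provincial Natural Science Foundation (No. Z1091224); Doctoral Program Foundation of the Ministry of Education of China (No. 20103326110001); Zhejiang Provincial Major Science and Technology Program (No. 2010C13021); Zhejiang Provincial Natural Science Foundation (No. Y6110628, No. Y1090617, No. LQ12G01007); Zhejiang Provincial Graduate Research Innovation Project
Related project: Business data stream mining and reliability research incorporating concept-drift contexts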

Authors: Ju Chunhua (琚春华) [1,2], Zou Jiangbo (邹江波) [1], Zhang Rui (张芮) [1], Wei Jianliang (魏建良)

Keywords: cloud computing, ensemble classifier, parallel integration, MapReduce

Abstract:

Because computer memory is limited, the effectiveness of classifier combination and the selection of an optimal combination are major research topics in machine learning. Classic ensemble classification algorithms achieve high classification accuracy on small data sets. On large data sets, however, the base classifiers all train and classify on the resources of a single computer, so computation is inefficient, which makes these algorithms unsuitable for today's massive data. To address the limitation that existing ensemble classification algorithms work only on small-scale data sets, the paper analyzes the characteristics of ensemble classifiers and designs a parallel ensemble classification algorithm (EMapReduce) that combines an aggregation-based ensemble classifier with the MapReduce technology of cloud computing, so that large-scale data can be processed in parallel. Simulation experiments on an Amazon computing cluster show that the algorithm is reasonably efficient and feasible.
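
The abstract's idea of training base classifiers in parallel and aggregating their outputs can be illustrated with a minimal sketch. This is not the paper's EMapReduce algorithm, whose details this record does not give. The sketch assumes a nearest-centroid base classifier, majority voting as the aggregation rule, and Python's built-in `map` and `functools.reduce` standing in for cluster map and reduce tasks; all function names are hypothetical.

```python
from collections import Counter
from functools import reduce


def train_centroids(partition):
    """Map step (assumed): fit a nearest-centroid classifier on one partition."""
    sums, counts = {}, Counter()
    for features, label in partition:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_one(centroids, features):
    """Classify a point by the closest class centroid (squared Euclidean distance)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], features))
    return min(centroids, key=dist)


def merge_votes(left, right):
    """Reduce step (assumed): combine two partial vote tallies."""
    return left + right


def ensemble_predict(models, features):
    """Aggregate the base classifiers' predictions by majority vote."""
    votes = [Counter([predict_one(m, features)]) for m in models]
    tally = reduce(merge_votes, votes, Counter())
    return tally.most_common(1)[0][0]


def split(data, n_parts):
    """Deal records round-robin into n_parts partitions (stand-in for input splits)."""
    return [data[i::n_parts] for i in range(n_parts)]


# Toy data, invented for illustration only: two well-separated classes.
data = [
    ((0.0, 0.1), "a"), ((0.2, 0.0), "a"), ((0.1, 0.3), "a"),
    ((0.3, 0.2), "a"), ((0.0, 0.4), "a"), ((0.2, 0.2), "a"),
    ((5.0, 5.1), "b"), ((5.2, 4.9), "b"), ((4.8, 5.3), "b"),
    ((5.1, 5.0), "b"), ((4.9, 4.7), "b"), ((5.3, 5.2), "b"),
]

# The built-in map stands in for parallel mapper tasks on a cluster.
models = list(map(train_centroids, split(data, 3)))
print(ensemble_predict(models, (0.1, 0.1)))
print(ensemble_predict(models, (5.0, 5.0)))
```

On a real cluster, each partition would be an input split handled by a separate mapper, so no single machine has to hold the full data set or train every base classifier.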

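Projects with papers in the same journal

Business data stream mining and reliability research incorporating concept-drift contexts

Journal papers: 40; conference papers: 3

Research on a personalized recommendation model based on social tagging of business information

Journal papers: 2

Context-aware content service model and adaptive adjustment mechanism for mobile workers

Journal papers: 19; conference papers: 4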

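Journal papers from the same project

Feedforward dynamic ensemble classifier based on context features

Data stream anomaly detection based on attribute association and matching dissimilarity

Resource-constrained project scheduling problem with multiple disruptions

Research on a multidimensional personalized recommendation model integrating context and subject features

User interest feature extraction based on the hidden semi-Markov model

Research on an arbitrary-shape clustering algorithm based on density and dynamic thresholds

Data stream coupled-feature clustering method based on wavelet networks

Research on a joint distribution cost allocation model based on multiple methods

Distributed decision tree algorithm for chain retail based on regional factors

Concept-drift feature selection for business data streams based on granular computing

A fuzzy-integral ensemble classification method for mining concept-drifting data streams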

A new clustering algorithm for uncertain data stream based on the existence probability of tuple

Research on logistics network infrastructures based on dea-pca approach: Evidence from the yangtze r

Integrating the use of spreadsheet software in logistics education: An illustration of the use of Mi

An empirical research on relativity between the allocation of logistics resources and the economy

Feature extracting of business data streams with concept-drifting

Association classification algorithm based on concept correlation

Customer classification based on data mining

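Mining maximal frequent itemsets in data streams based on an ordered composite strategy

A new task planning method based on the design structure matrix

DSVM: a distributed data mining model based on support vector machines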

Research on credit card fraud detection model based on class weighted support vector machine

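Simplified multi-project scheduling problem based on the design structure matrix and its solution by an artificial immune network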

A New Collaborative Recommendation Approach Based on Users Clustering Using Artificial Bee Colony Al

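Research on a user collaboration model based on social tagging

Research on a big-data-based user service model and data fusion strategy for the telecommunications field

Research on improved evaluation methods and a measurement framework for association rules

Personalized recommendation method based on social ratings and tags

E-commerce customer churn prediction model incorporating individual activity level

Research on a progressive modularization method for carbon-footprint node units of complex products

Customer behavior pattern mining and change detection incorporating context intensity

Distributed associative classification algorithm based on an improved FP-Tree

Assessment and control of carbon-footprint data quality in complex product supply chains

Analysis of carbon-footprint impact intensity in product supply chains

Research on a cloud-service-based green supervision service system for telecommunications projects

Research on partner selection for production alliances incorporating capability complementarity

Research on social e-commerce recommendation based on social-network collaborative filtering

Research on the credibility of Internet of Things RFID in low-carbon supply chains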

Research on Logistics Network Infrastructures Based on DEA-PCA Approach: Evidence from the Yangtze River Delta Region in China

Context-Aware Tour Planning System Based on Satisfaction Model

Research on User Multi-interest Profile based on Social Tagging

The Regional Financial Risk Early-Warning Model Integrating the Regression of Lagging Factors

Personal Recommendation Using a Novel Collaborative Filtering Algorithm in Customer Relationship Man

A modified decision tree algorithm based on genetic algorithm for mobile user classification problem

Research on a new anonymous authentication scheme

A context information ontology hierarchy model for tourism-oriented mobile E-commerce

Exploring a categorized talent training system for the e-commerce major

A Multilevel Model for Measuring Fit between a Firm’s Competitive Strategies and Information S

A Novel Method of Data Stream Clustering Based on Wavelet Timing Series Tree Synopsis

A Novel Heuristic Web Search Algorithm based on Gaussian Mutation and Clonal Selection Strategy

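Research on user adoption of mobile distribution technology in manufacturing firms from the perspective of intermediary adoption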

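Journal information

Telecommunications Science (《电信科学》)
Peking University Core Journal (2011 edition)

Supervising body: China Association for Science and Technology
Sponsors: China Institute of Communications; Posts & Telecom Press
Editor-in-chief: Wei Leping (韦乐平)
Address: 8th Floor, Posts & Telecom Press Building, 11 Chengshousi Road, Fengtai District, Beijing
Postal code: 100078
Email: dxkx@ptpress.com.cn
Phone: 010-81055443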

International standard serial number: ISSN 1000-0801
Domestic unified serial number: CN 11-2103/TN
Postal distribution code: 2-397

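Awards:
Third Prize, 2nd National Outstanding Science and Technology Journal Evaluation (1997); Second Prize, China Association for Science and Technology Outstanding Science and Technology Journal (1997); named an outstanding science and technology... in the 4th posts and telecommunications science and technology journal quality inspection and evaluation; listed by the General Administration of Press and Publication as a "中国期刊方..."; Third Prize, 3rd China Outstanding Science and Technology Journal Award (2002); named outstanding... in the 5th communications industry science and technology journal quality inspection and evaluation; named outstanding... in the 6th communications industry science and technology journal quality inspection and evaluation; selected again in 2008 for A Guide to the Core Journals of China (《中文核心期刊要目总览》); selected in 2009 for China science and technology paper statistics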

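Indexed in domestic and international databases:
Cambridge Scientific Abstracts (USA); Science Abstracts database (UK); China Science and Technology Core Journals (China); Peking University Core Journals (China; 2000, 2004, 2008, 2011, and 2014 editions)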

Citations: 12435