Analysis and Implementation of MapReduce Parallelization of the Naive Bayes Algorithm

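ISSN: 1673-629X
Journal: Computer Technology and Development (计算机技术与发展)
Year: 2013
Pages: -
Classification: TP31 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology]
Author affiliations: [1] Department of Computer Science and Technology, Tongji University, Shanghai 201804; [2] Shanghai Stock Exchange, Shanghai 200120; [3] College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 200234; [4] Shanghai Chenjiazhen Construction and Development Co., Ltd., Shanghai 202162
Funding: National Natural Science Foundation of China (61103069, 71170148); National Science and Technology Program project (2012BAD35801); Shanghai Science and Technology Innovation Program (11Dz1501703); Shanghai Informatization Development Special Fund (20091015); Shanghai Science and Technology Innovation Program (Chenjiazhen) (11Dz1210600)
Related project: Theoretical and Empirical Research on Dimensionality Reduction of High-Dimensional Complex Data Based on Semantic Computing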

Authors: Xiang Yang (向阳), Jiang Ruiquan (蒋锐权), Zhang Bo (张波), Zhang Junying (张君瑛)

Keywords: Naive Bayes classification algorithm, parallel computing, MapReduce

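Chinese abstract (translated):

Naive Bayes is an efficient classification algorithm, but when it processes massive data its efficiency suffers greatly from limits on memory, I/O, and other resources. Based on the characteristics of the Naive Bayes classification algorithm, this paper presents a method for implementing it on the MapReduce programming model. The files in the training set are split for processing, and the core processing is done by MapReduce: the Map function parses the training files, and the Reduce function builds the knowledge base of class attributes and feature attributes. The experiments mainly compare the performance of the traditional algorithm with that of the improved parallel algorithm. The results show that, with large data volumes, the MapReduce-parallelized Naive Bayes algorithm has good execution efficiency and high scalability.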

English abstract:

Naive Bayes is an efficient classification algorithm. However, limited memory and I/O resources greatly reduce its efficiency in mass data processing. This paper proposes a Naive Bayes algorithm based on the MapReduce programming model. The training set is split before being processed, and the core processing is carried out by the MapReduce model: the Map function extracts and parses the training set, and the Reduce function builds the knowledge base of class and feature attributes. The experiments mainly compare the performance of the traditional algorithm with that of the improved parallel algorithm. The results show that the parallel Naive Bayes algorithm has good efficiency and high scalability in mass data processing.
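The division of labor described in the abstract (Map parses training records, Reduce builds the class and feature count tables) can be illustrated with a minimal in-process sketch. This is not the authors' implementation: the record format (`label<TAB>tokens`), the function names `map_fn`, `reduce_fn`, `train`, and `classify`, the multinomial model, and the Laplace smoothing are all assumptions made for illustration, and the shuffle phase of a real MapReduce framework is simulated with a dictionary.

```python
import math
from collections import defaultdict


def map_fn(record):
    """Map step: parse one training record and emit (key, 1) pairs.

    Assumed record format (not from the paper): "label<TAB>token token ...".
    """
    label, text = record.split("\t", 1)
    yield ("class", label), 1                # one document seen for this class
    for token in text.split():
        yield ("feature", label, token), 1   # one token occurrence in this class


def reduce_fn(key, values):
    """Reduce step: sum the counts emitted for one key."""
    return key, sum(values)


def train(splits):
    """Simulate a MapReduce job in-process over the training-set splits."""
    grouped = defaultdict(list)              # stands in for the shuffle phase
    for split in splits:
        for record in split:
            for key, value in map_fn(record):
                grouped[key].append(value)
    return dict(reduce_fn(key, values) for key, values in grouped.items())


def classify(counts, tokens):
    """Return the class with the highest log-posterior (Laplace smoothing assumed)."""
    class_docs = {k[1]: v for k, v in counts.items() if k[0] == "class"}
    vocab = {k[2] for k in counts if k[0] == "feature"}
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        n_tokens = sum(v for k, v in counts.items()
                       if k[0] == "feature" and k[1] == label)
        score = math.log(n_docs / total_docs)
        for token in tokens:
            hits = counts.get(("feature", label, token), 0)
            score += math.log((hits + 1) / (n_tokens + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Two hypothetical splits of a toy training set.
    splits = [
        ["sports\tgoal match team", "tech\tcpu memory disk"],
        ["sports\tteam win goal", "tech\tmemory cache cpu"],
    ]
    counts = train(splits)
    print(classify(counts, ["goal", "team"]))
```

Because the Map output for each record is independent and the Reduce step is a plain sum, the splits could be processed on separate workers without changing the resulting count tables, which is the property the parallel version relies on.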

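Projects with papers in the same journal

Research on Microblog Group Sentiment Mining Based on Trust Chains

Journal papers 26, conference papers 3

Theoretical and Empirical Research on Dimensionality Reduction of High-Dimensional Complex Data Based on Semantic Computing

Journal papers 74, conference papers 15, awards 1, monographs 1

Journal papers from the same project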

Constraint-guided Sparsity Preserving Projections for Semi-Supervised Dimensionality Reduction

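Determining Component Substitutability in Cyber-Physical Systems Based on Spatio-Temporal π-Calculus

Pairwise-Constraint-Guided Sparsity Preserving Projections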

Semi-supervised Sparsity Pairwise Constraint Preserving Projecting based on GA

Semantic based Social Service Organization Mechanism in Cyber Physical System

Belief and Reputation Based Recommended Trust Computation in Wireless Sensor Networks

Research on Text Classification Technology Based on Parallel Computing

A Novel Public Opinion Mining Method on Microblog Platform

Pairwise Constraint Projections Incorporating Sparsity Preservation

A Sentiment Delivering Estimate Scheme Based on Trust Chain in Mobile Social Network

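Determining Component Substitutability in Cyber-Physical Systems Based on Spatio-Temporal π-Calculus

Collaborative Filtering Recommendation Algorithm Based on Matrix Factorization and a User Neighborhood Model

Research on Topic Detection in Text Streams Based on Feature Ontology

Text Classification Technology Based on Parallel Computing

Hierarchical Topic Detection in Text Streams Based on Topic Ontology Trees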

Dimensionality reduction based on low rank representation

Tensor modular sparsity preserving projections for dimensionality reduction

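A novel public opinion mining method on microblog platform

Implementation and Improvement of TPM in PostgreSQL

Neighborhood Model Recommendation Algorithm Based on LSH and MapReduce

Skyline Query Optimization in Super-Peer Networks

Research on Topic Evolution in Text Streams Based on Feature Ontology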

Construction of Adaptive Text Feature Graphs

A novel reliability assurance method for cyber-physical system components substitution

Trust-aware Information Dissemination in Social Network

Orthogonal Tensor Sparse Neighborhood Preserving Embedding for Two-dimensional Image

A Semantics-Based Self-Organization Method for Decision Service Collaboration

Study on Topic Tree-based Topic Structure Modeling

A Tentative Study on Evolutionary Pattern Mining Of Topic in Text Streams

Study of Topic Life Cycle Based on Hierarchical HMM

Modular Tensor Sparsity Preserving Projection Algorithm for Dimension Reduction

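Research on Topic Detection in Text Streams Based on Feature Ontology

A Recommendation Control Model Based on Trust Evaluation in Social Networks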

Random forest based online topic detection using topic graph cluster

Gaussian generative model based topic detection using factor analysis

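An Effective Method for Information Recommendation Between Groups in Social Networks

Discriminant Analysis Based on Sparse Reconstruction

Semantic social service organization mechanism in cyber physical system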

A novel multiple-level trust management framework for wireless sensor networks

Trust computation for multiple routes recommendation in social network sites

Recommendation Trust Chain and its Measurement in Social Network Site

A Method for Computing Recommendation Trust Between Individuals in Social Networks

Semi-supervised Sparsity Pairwise Constraint Preserving Projections based on GA

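An Ontology-Based Method for Semantic Understanding and Refinement of Decision Problems

A Bidding-Based Organization Method for Collaborative Decision-Making in Social Network Services

Hierarchical Topic Detection in Text Streams Based on Topic Ontology Trees

Determining Component Substitutability in Cyber-Physical Systems Based on Spatio-Temporal π-Calculus

A Trust-Chain-Based Topic Group Discovery Algorithm in Social Networks

A New Method for Determining the Sentiment Polarity of Chinese Words

An Algorithm for Mining User Leaders in Social Networks

A Recommendation Control Model Based on Trust Evaluation in Social Networks

An Effective Method for Information Recommendation Between Groups in Social Networks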

A Novel Public Opinion Mining Met

Efficient Optimization for L-extSKY Recommendations

iHMM-based Topic Life Cycle Research on Dimension Reduction in Text Stream

Belief and Reputation Based Recomm

An Ontology-Based Method for Semantic Understanding and Refinement of Decision Problems

A novel capacity and trust based service selection mechanism for collaborative decision making in CP

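A Formal Model of User Trust Chains in Social Networks

Recommendation Trust

Implementation of a Parallel PageRank Algorithm Based on MapReduce

A novel multiple-level trust management for wireless sensor networks

A Bidding-Based Organization Method for Collaborative Decision-Making in Social Network Services

A Trust-Chain-Based Topic Group Discovery Algorithm in Social Networks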

Research on Online Topic Evolution Pattern Mining in Text Streams

Hierarchical Topic Detection in Text Streams Based on Topic Ontology Trees

LDA-based online topic detection using tensor factorization

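A Formal Information Recommendation Method Based on Trust Evaluation

A Trust-Chain-Based Topic Group Discovery Algorithm in Social Networks

A Survey of Recommendation Algorithms Based on Learning to Rank

A New Method for Determining the Sentiment Polarity of Chinese Words

An Effective Method for Skyline Recommendation in Distributed Networks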

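Journal information

Computer Technology and Development (《计算机技术与发展》)
Chinese Science and Technology Core Journal

Supervising authority: Department of Industry and Information Technology of Shaanxi Province
Sponsor: Shaanxi Computer Society
Editor-in-chief: Wang Shouzhi (王守智)
Address: No. 99, South Section of Yanta Road, Xi'an
Postal code: 710054
Email: ctad@vip.163.com
Phone: 029-85522163

International standard serial number: ISSN 1673-629X
Domestic unified serial number: CN 61-1450/TP
Postal distribution code: 52-127

Awards:
Excellent Journal in Implementing the CAJ-CD Standard

Indexed in domestic and international databases:
Chinese Science and Technology Core Journals

Citations: 21263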