Dongli (东篱) Research Big Data Discovery System (DRDS)

An improved KNN method for Web text classification

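ISSN: 1001-3695
Journal: Application Research of Computers (《计算机应用研究》)
Date: 0
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: [1] School of Information Engineering, Jiangnan University, Wuxi, Jiangsu 214122
Funding: National Natural Science Foundation of China (60773206)

Authors: Wu Chunying [1], Wang Shitong [1]

Keywords: Web text classification, K-nearest neighbor (KNN), fast classification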

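Chinese abstract (translated):

The KNN method has two shortcomings: (a) its computational cost is very high, because it must compute the similarity between the unlabeled text and every training sample to find the k nearest neighbors; (b) when classes share many common characteristics, that is, when the features of training samples overlap substantially across classes, the accuracy of KNN classification drops. To address these two problems, an improved KNN method is proposed. The method first uses Rocchio classification to quickly obtain the k0 most likely candidate classes; it then applies the KNN algorithm to a subset of representative samples drawn from the training documents of those k0 classes; finally, an improved similarity computation determines the class of the text. Experiments show that the improved KNN method achieves good classification results in Web text classification.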

English abstract:

The KNN method has two drawbacks: it is computationally expensive, because it must compute the similarity between the unlabeled text and all training texts, and its classification precision may drop when classes have much in common. This paper presents an improved KNN method that addresses both problems. It first uses the Rocchio method to quickly find the k0 most likely classes, then applies the KNN algorithm to representative training texts of those classes, and finally assigns the class using an improved similarity computation within KNN. Experimental results indicate that the new method achieves better classification results.
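
The three-stage procedure outlined in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how representative samples are chosen or how the improved similarity is defined, so the sketch assumes that "representative" means closest to the class centroid and substitutes a plain cosine-similarity-weighted vote for the improved measure. The function name `classify` and the parameters `k0`, `k`, and `reps_per_class` are hypothetical, and documents are assumed to be already encoded as term-weight vectors.

```python
import numpy as np


def _normalize(X):
    """Scale each row to unit L2 norm so dot products equal cosine similarity."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms == 0, 1.0, norms)


def classify(x, X_train, y_train, k0=2, k=3, reps_per_class=5):
    """Rocchio pre-filter followed by KNN on representative samples."""
    X = _normalize(np.asarray(X_train, dtype=float))
    y = np.asarray(y_train)
    q = _normalize(np.asarray(x, dtype=float).reshape(1, -1))[0]

    # Stage 1 (Rocchio): one centroid per class; keep the k0 most similar classes.
    classes = np.unique(y)
    centroids = _normalize(np.stack([X[y == c].mean(axis=0) for c in classes]))
    candidates = classes[np.argsort(-(centroids @ q))[:k0]]

    # Stage 2: pick representative samples from each candidate class.
    # Assumption: "representative" means closest to the class centroid.
    rep_idx = []
    for c in candidates:
        members = np.flatnonzero(y == c)
        centroid = centroids[np.flatnonzero(classes == c)[0]]
        order = np.argsort(-(X[members] @ centroid))
        rep_idx.extend(members[order[:reps_per_class]])
    rep_idx = np.array(rep_idx)

    # Stage 3: KNN among the representatives with a similarity-weighted vote
    # (a stand-in for the paper's improved similarity computation).
    sims = X[rep_idx] @ q
    nearest = np.argsort(-sims)[:k]
    scores = {}
    for i in nearest:
        label = y[rep_idx[i]]
        scores[label] = scores.get(label, 0.0) + sims[i]
    return max(scores, key=scores.get)
```

With this layout, the query is compared against one centroid per class plus at most `k0 * reps_per_class` training vectors rather than the whole training set, which is where the saving over plain KNN would come from.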

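Project associated with this journal paper

Research on fuzzy morphological associative memory theory and its application in medical image recognition

Journal papers: 80

Journal papers from the same project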

Divisive hierarchical clustering algorithm based on soft hyperspheric partition

Theoretical choice of the optimal threshold for possibilistic linear model with noisy input

A supervised locality preserving projection algorithm for dimensionality reduction

Glutathione fermentation process modeling based on CCTSK fuzzy neural network

Fuzzy fisher criterion based semi-fuzzy clustering algorithm

Hyper-ellipsoid support vector machine classifiers

Extended Mumford-Shah model integrated with fuzzy clustering

An enhanced possibilistic C-Means clustering algorithm EPCM

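Fast adaptive similarity-based clustering method based on sparse Parzen window density estimation

Theoretical analysis of the optimal solution to the support vector data description optimization problem

Fast spectral clustering algorithm for large data sets based on the minimum enclosing ball

Double-index fuzzy C-means algorithm based on hybrid distance learning

Force-based affinity propagation clustering method

Minimum learning machine

Research on a fuzzy classification algorithm based on quadratic features for context-change awareness in pervasive computing

Face recognition based on the unsupervised optimal discriminant plane

Fast mean shift spectral clustering algorithm for large data sets

Laplacian maximum margin discriminant criterion based on a contextual distance metric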

Ensemble Classifier based on Minimum Class Variance SVM and null space classifier

RBF network learning algorithm using robust least-squares

Transformation between type-2 TSK fuzzy systems and an uncertain Gaussian mixture model

Global and local preserving based semi-supervised support vector machine

Minimum variance support vector regression

Fast mean shift spectral clustering on large data sets

A novel rule-centered fuzzy model induced by Epanechnikov mixtures

Enhanced soft subspace clustering integrating within-cluster and between-cluster information

Double indices FCM algorithm based on hybrid distance metric learning

Double-indices fuzzy subspace clustering algorithm based on feature weighted distance

Clustering method based on the fuzzy maximum scatter difference discriminant criterion

Fermentation process TSK fuzzy modeling based on entropy criteria

A robust neuro-fuzzy network approach to impulse noise filtering for color images

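Robust image segmentation based on an improved fuzzy clustering algorithm

Improved image segmentation based on spatial pattern clustering

Robust support vector regression machine based on fuzzy clustering and flame image processing

Maximum entropy Relief feature weighting

Robust RBF modeling of glutathione fermentation based on an entropy criterion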

Research on generalized fuzzy c-means clustering algorithm with improved fuzzy partitions

Fuzzy maximum scatter difference discriminant criterion based clustering algorithm

Generalized supervised locality preserving projection

Matrix pattern based minimum within-class scatter support vector machines

Fuzzy clustering algorithm with ranking features and identifying noise simultaneously

Equivalence between type-2 TSK fuzzy model and uncertain Gaussian mixture model

From minimum enclosing ball to fast fuzzy inference system training on large datasets

GPSFM: Generalized potential support features selection method

Feature extraction method on maximum margin criterion with locality preserving

Research of hydrodynamic parameter identification for underwater vehicle using swarm intelligence al

Advanced fuzzy cellular neural network: Application to CT liver images

Image segmentation using the enhanced possibilistic clustering method

Research on the dependency between optimal parameter and the input noise in possibilistic linear mod

Dependency between degree of fit and input noise in fuzzy linear regression using non-symmetric fuzz

A visual system theoretic cost criterion and its application to clustering and fuzzy modeling

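Privacy-preserving learning machine for large-scale data

Maximum margin novelty detection method based on a small number of abnormal samples

A new image edge detection method based on an improved spring-mass model

Mamdani-Larsen fuzzy system based on the expectation maximization algorithm and its application to time series prediction

Research on an improved LDA algorithm and the rank limitation problem

Feature extraction method based on the maximum margin criterion with locality preservation

Generalization of the FCM clustering algorithm with improved fuzzy partitions

Centered fuzzy model based on the Epanechnikov mixture model

Image filtering algorithm using an improved spring-mass model

Robust fuzzy clustering method with feature ranking

Fuzzy decision anomaly detection algorithm based on a kernelized spatial depth enclosing kernel

A Web text classification method combining hierarchical structure and KNN

Feature extraction method based on kernelized spatial depth margin

Dynamically weighted hybrid C-means fuzzy kernel clustering algorithm

Dimensionality reduction algorithm based on geodesic distance approximation

Adaptive semi-supervised fuzzy spectral clustering algorithm

Elastic image registration with adaptive learning of fuzzy rules

Matrix pattern based minimum within-class scatter support vector machine

Non-fragile guaranteed cost control for generalized T-S fuzzy systems

Improving RBF regression performance based on fuzzy grouping and supervised clustering

Kernel affinity propagation clustering algorithm based on semi-supervised learning

Classification rule mining based on the ant colony algorithm

Iterative convergence analysis of the FKA algorithm

Boundary-based maximum margin fuzzy classifier

Maximum margin learning machine based on entropy theory and kernel density estimation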

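Journal information

Application Research of Computers (《计算机应用研究》)
PKU Core Journal (2011 edition)

Supervising body: Sichuan Provincial Department of Science and Technology
Sponsor: Sichuan Institute of Computer Sciences
Editor-in-chief: Liu Ying
Address: No. 3 Chengke West Road, Chengdu
Postal code: 610041
Email: arocmag@163.com
Phone: 028-85210177 85249567

International standard serial number: ISSN 1001-3695
Domestic serial number: CN 51-1196/TP
Postal distribution code: 62-68

Awards:
One of the 100 Key Science and Technology Journals of the Second National Periodical Award; key core journal in computing technology in China; indexed by well-known domestic and international databases

Indexed in:
Abstracts Journal (Russia), Index Copernicus (Poland), Science Abstracts database (UK), Japan Science and Technology Agency database (Japan), China Science and Technology Core Journals (China), PKU Core Journals (China; 2000, 2004, 2008, 2011, and 2014 editions)

Times cited: 60049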