Recognizing objects in images that have undergone translation, rotation, or local deformation such as scaling is a difficult problem. To address it, an object recognition algorithm based on unsupervised pre-training and a Multi-Scale block Convolutional Neural Network (MS-CNN) is proposed. First, unlabeled images are used to train a sparse autoencoder, which yields a set of convolutional filters that match the characteristics of the dataset and provide good initial values. To enhance robustness and reduce the effect of downsampling (pooling) on feature extraction, a multi-path Convolutional Neural Network (CNN) structure is proposed: the input image is divided into blocks at multiple scales to form several paths, and each path is convolved with filters of the corresponding size. The features of each path pass through local contrast normalization and downsampling, and are then fused in the fully connected layer to form the final features, which are fed to a classifier to complete object recognition. In simulation experiments, the proposed algorithm achieves higher recognition rates than a traditional CNN on both the STL-10 dataset and remote sensing airplane images, and it is robust to deformations such as translation, rotation, and scaling.
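The multi-path feature extraction described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It rests on several assumptions: the filter banks are random stand-ins for the filters that a sparse autoencoder would learn, the filter sizes (5, 7, 9), the pooling size, and the 32×32 input are arbitrary, contrast normalization is simplified to a per-map standardization, and "multiple scales" is read here as one filter size per path applied to the same image. All function names are hypothetical.

```python
import numpy as np

def conv2d_valid(image, filters):
    """Valid cross-correlation of an (H, W) image with (K, k, k) filters.

    Returns an array of shape (K, H-k+1, W-k+1).
    """
    k = filters.shape[-1]
    # windows has shape (H-k+1, W-k+1, k, k)
    windows = np.lib.stride_tricks.sliding_window_view(image, (k, k))
    return np.einsum("ijuv,kuv->kij", windows, filters)

def contrast_normalize(maps, eps=1e-6):
    """Simplified contrast normalization: zero mean, unit variance per map.

    Assumption: a full local contrast normalization layer would use a
    local weighted window instead of statistics over the whole map.
    """
    mean = maps.mean(axis=(1, 2), keepdims=True)
    std = maps.std(axis=(1, 2), keepdims=True)
    return (maps - mean) / (std + eps)

def avg_pool(maps, size):
    """Non-overlapping average pooling; rows/columns that do not fit are dropped."""
    n_maps, height, width = maps.shape
    h2, w2 = height // size, width // size
    trimmed = maps[:, :h2 * size, :w2 * size]
    return trimmed.reshape(n_maps, h2, size, w2, size).mean(axis=(2, 4))

def multi_path_features(image, filter_banks, pool_size=4):
    """Run one path per filter bank and concatenate the flattened outputs.

    The concatenated vector plays the role of the fused input to the
    fully connected layer.
    """
    paths = []
    for filters in filter_banks:
        maps = conv2d_valid(image, filters)
        maps = contrast_normalize(maps)
        paths.append(avg_pool(maps, pool_size).ravel())
    return np.concatenate(paths)

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))  # placeholder grayscale input
# Random stand-ins for pre-trained filters: 8 filters per path, three sizes.
filter_banks = [0.1 * rng.standard_normal((8, k, k)) for k in (5, 7, 9)]
features = multi_path_features(image, filter_banks)
print(features.shape)
```

In a full pipeline, the fused vector would be passed to a fully connected layer and a classifier, and the random filter banks would be replaced by filters obtained from unsupervised pre-training.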