To detect sensitive words in image text quickly, a deep learning approach is introduced for text detection and recognition. The image is first pre-processed and its connected regions are labeled. A two-layer restricted Boltzmann machine (RBM) then classifies the connected regions and selects those that contain text. Horizontal projection and region growing are used to segment the selected text regions into individual characters. Finally, a BP neural network algorithm is combined with a deep belief network (DBN) algorithm to detect the sensitive information. Theoretical analysis and experimental data show that the method has low algorithmic complexity and fast detection speed.
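The character segmentation step (horizontal projection followed by region growing) can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the input is taken to be an already binarized text region stored as nested lists with 1 for foreground pixels, regions are grown with 8-connectivity, and the names `horizontal_projection`, `split_text_lines`, and `grow_regions` are hypothetical. It is not the authors' implementation.

```python
from collections import deque
from typing import List, Tuple

Image = List[List[int]]  # assumed binary image: 1 = foreground (ink), 0 = background


def horizontal_projection(binary: Image) -> List[int]:
    """Count the foreground pixels in each row."""
    return [sum(row) for row in binary]


def split_text_lines(binary: Image, min_pixels: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, one per text line.

    A text line is a maximal run of rows whose projection is >= min_pixels.
    """
    profile = horizontal_projection(binary)
    lines: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(profile):
        if count >= min_pixels and start is None:
            start = y                      # a text line begins
        elif count < min_pixels and start is not None:
            lines.append((start, y))       # a text line ends
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


def grow_regions(binary: Image, row_range: Tuple[int, int]) -> List[Tuple[int, int, int, int]]:
    """Grow 8-connected regions inside one text line.

    Returns inclusive bounding boxes (x0, y0, x1, y1), sorted left to right.
    """
    top, bottom = row_range
    width = len(binary[0]) if binary else 0
    seen = set()
    boxes = []
    for y in range(top, bottom):
        for x in range(width):
            if binary[y][x] != 1 or (y, x) in seen:
                continue
            # Start a new region from this seed pixel and grow it breadth-first.
            queue = deque([(y, x)])
            seen.add((y, x))
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        inside = top <= ny < bottom and 0 <= nx < width
                        if inside and binary[ny][nx] == 1 and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return sorted(boxes)
```

In this sketch, calling `split_text_lines` on a text region and then `grow_regions` on each returned row range yields one bounding box per connected character candidate. Characters made of several disconnected strokes, which are common in Chinese, would need an additional merging step that is not shown here.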