Deep Perception Structure Design Via Layer-wise Principal Component Analysis

LI Yujian, YANG Hongli, LIU Zhaoying

Citation: LI Yujian, YANG Hongli, LIU Zhaoying. Deep Perception Structure Design Via Layer-wise Principal Component Analysis[J]. JOURNAL OF MECHANICAL ENGINEERING, 2017, 43(2): 230-236. doi: 10.11936/bjutxb2016040024


doi: 10.11936/bjutxb2016040024
Funding: National Natural Science Foundation of China (61175004); Specialized Research Fund for the Doctoral Program of Higher Education (20121103110029); China Postdoctoral Science Foundation (2015M580952)
Details
  • About the author: LI Yujian (born 1968), male, professor. Research interests: pattern recognition, image processing, machine learning, and data mining. E-mail: liyujian@bjut.edu.cn

  • CLC number: TP391


  • Abstract: To solve the structure design problem of deep perceptrons, a layer-wise principal component analysis (PCA) method is proposed. Based on the distribution of the training data set, and with the information loss kept under appropriate control, the method can effectively determine the number of neurons in each layer. First, the numbers of neurons in the input and output layers are set to the sample dimension and the number of label classes, respectively. Then, PCA is applied to the training sample set, and the reduced dimension gives the number of neurons in the second layer. Finally, to determine the size of each further layer, the samples reduced in the previous step are passed through the nonlinear activation function and PCA is applied again; the resulting reduced dimension is the number of neurons in that layer. Experimental results on the MNIST handwritten digit data set show that the method helps simplify the structure of deep perceptrons and offers advantages in reducing the number of parameters, shortening convergence time, and lowering training difficulty.
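The abstract describes the width-selection procedure step by step; the following minimal sketch (Python/NumPy) shows one way to realize it. The retained-variance threshold `energy=0.95`, the `tanh` activation, and the function name itself are illustrative assumptions: the text only says that information loss is appropriately controlled and does not fix these choices.

```python
import numpy as np

def layerwise_pca_widths(X, n_classes, n_hidden_layers,
                         energy=0.95, activation=np.tanh):
    """Hypothetical sketch of layer-wise PCA structure design.

    X: training samples, shape (n_samples, n_features).
    Returns layer widths [input, hidden_1, ..., hidden_k, output].
    """
    def pca_reduce(data):
        # Center the data, then keep the smallest number of principal
        # components whose cumulative variance share reaches `energy`.
        centered = data - data.mean(axis=0)
        _, S, Vt = np.linalg.svd(centered, full_matrices=False)
        ratios = np.cumsum(S ** 2) / np.sum(S ** 2)
        k = int(np.searchsorted(ratios, energy)) + 1
        return centered @ Vt[:k].T   # projected samples, shape (n, k)

    widths = [X.shape[1]]            # input layer = sample dimension
    reduced = pca_reduce(X)          # 2nd layer = PCA-reduced dimension
    widths.append(reduced.shape[1])
    for _ in range(n_hidden_layers - 1):
        # Further layers: pass the previously reduced samples through
        # the nonlinear activation, run PCA again, and take the reduced
        # dimension as that layer's width.
        reduced = pca_reduce(activation(reduced))
        widths.append(reduced.shape[1])
    widths.append(n_classes)         # output layer = number of classes
    return widths
```

On MNIST, a call such as `layerwise_pca_widths(train_images, n_classes=10, n_hidden_layers=3)` with a suitable threshold would yield a structure of the 784-388-352-325-10 kind reported in Table 1 below.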


  • Figure 1. Deep perceptron structure

    Figure 2. Training and test error rates of networks with different numbers of layers designed by GLPCA

    Table 1. Data and results of comparison experiments

    Item                                  Hinton experiment     Layer-wise PCA (5 layers)   Layer-wise PCA (6 layers)
    Number of network layers              5                     5                           6
    Network structure                     784-500-500-2000-10   784-388-352-325-10          784-388-352-325-302-10
    Total number of neurons               3794                  1859                        2161
    Number of parameters                  1.67×10⁶              5.59×10⁵                    6.58×10⁵
    Convergence time/h (same machine)     10.218                2.121                       2.300
    Test set error rate/%                 1.20①, 1.14②          1.15                        1.09

    Note: ① in Ref. [1], the test set error rate of Hinton's experiment reached 1.20%; ② in Ref. [47], fine-tuning the whole network with conjugate gradient descent brought the error rate down to 1.14%.
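The neuron and parameter totals in Table 1 follow directly from the layer widths. As a quick check, the sketch below (assuming fully connected layers with one bias per non-input neuron; the table does not say whether biases are counted, but the totals match) reproduces the reported figures:

```python
def mlp_size(widths):
    """Total neurons and trainable parameters (weights plus biases)
    of a fully connected perceptron with the given layer widths."""
    neurons = sum(widths)
    params = sum(a * b + b for a, b in zip(widths, widths[1:]))
    return neurons, params

print(mlp_size([784, 500, 500, 2000, 10]))      # (3794, 1665010) ~ 1.67e6
print(mlp_size([784, 388, 352, 325, 10]))       # (1859, 559493)  ~ 5.59e5
print(mlp_size([784, 388, 352, 325, 302, 10]))  # (2161, 657715)  ~ 6.58e5
```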

    Table 2. Experiment data and results

    Layers   Deep perceptron structure          Training error reaches 0     Network converged           Test error rate/%
                                                Iterations    Time/h         Iterations    Time/h        At convergence   Lowest
    3        784-388-10                         32            0.497          129           2.514         1.58             1.51
    4        784-388-352-10                     39            0.538          105           2.132         1.39             1.29
    5        784-388-352-325-10                 28            0.463          63            2.121         1.15             1.11
    6        784-388-352-325-302-10             27            0.724          54            2.300         1.09             1.06
    7        784-388-352-325-302-282-10         28            1.074          48            2.415         1.15             1.14
    8        784-388-352-325-302-282-264-10     25            2.303          48            4.834         1.19             1.15
  • [1] HINTON G E, SALAKHUTDINOV R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786): 504-507.
    [2] HINTON G E, OSINDERO S, TEH Y W. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7): 1527-1554.
    [3] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Advances in Neural Information Processing Systems, 2012, 25: 1097-1105.
    [4] FARABET C, COUPRIE C, NAJMAN L, et al. Learning hierarchical features for scene labeling[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2013, 35(8): 1915-1929.
    [5] HINTON G E, DENG L, YU D, et al. Deep neural networks for acoustic modeling in speech recognition[J]. IEEE Signal Processing Magazine, 2012, 29(6): 82-97.
    [6] COLLOBERT R, WESTON J, BOTTOU L, et al. Natural language processing (almost) from scratch[J]. Journal of Machine Learning Research, 2011, 12: 2493-2537.
    [7] MIKOLOV T, DEORAS A, POVEY D, et al. Strategies for training large scale neural network language models[C]//2011 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). Washington D C: IEEE, 2011: 196-201.
    [8] MAIRESSE F, YOUNG S. Stochastic language generation in dialogue using factored language models[J]. Computational Linguistics, 2014, 40(4): 763-799.
    [9] SARIKAYA R, HINTON G E, DEORAS A. Application of deep belief networks for natural language understanding[J]. IEEE/ACM Transactions on Audio, Speech & Language Processing, 2014, 22(4): 778-784.
    [10] NOBLE W S. What is a support vector machine?[J]. Nature Biotechnology, 2006, 24(12): 1565-1567.
    [11] CHAPELLE O. Training a support vector machine in the primal[J]. Neural Computation, 2007, 19(5): 1155-1178.
    [12] SCHAPIRE R E. A brief introduction to boosting[C]//Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI). San Francisco: Morgan Kaufmann Publishers Inc, 1999: 1401-1406.
    [13] PHILLIPS S J, ANDERSON R P, SCHAPIRE R E. Maximum entropy modeling of species geographic distributions[J]. Ecological Modelling, 2006, 190(3): 231-259.
    [14] KUSAKUNNIRAN W, WU Q, ZHANG J, et al. Cross-view and multi-view gait recognitions based on view transformation model using multi-layer perceptron[J]. Pattern Recognition Letters, 2012, 33(7): 882-889.
    [15] SUN K, HUANG S H, WONG S H, et al. Design and application of a variable selection method for multilayer perceptron neural network with LASSO[J]. IEEE Transactions on Neural Networks and Learning Systems, 2016, PP(99): 1-11.
    [16] HINTON G E. Learning multiple layers of representation[J]. Trends in Cognitive Sciences, 2007, 11(11): 428-434.
    [17] BENGIO Y, LAMBLIN P, POPOVICI D, et al. Greedy layer-wise training of deep networks[J]. Advances in Neural Information Processing Systems, 2007, 19: 153-160.
    [18] HÅSTAD J, GOLDMANN M. On the power of small-depth threshold circuits[J]. Computational Complexity, 1991, 1(2): 113-129.
    [19] HOCHREITER S, BENGIO Y, FRASCONI P, et al. Gradient flow in recurrent nets: the difficulty of learning long-term dependencies[M]//KOLEN J F, KREMER S C. A Field Guide to Dynamical Recurrent Networks. New York: IEEE Press, 2001: 237-243.
    [20] BEBIS G, GEORGIOPOULOS M. Feed-forward neural networks: why network size is so important[J]. IEEE Potentials, 1994, 13(4): 27-31.
    [21] REED R. Pruning algorithms: a survey[J]. IEEE Transactions on Neural Networks, 1993, 4(5): 740-747.
    [22] YANG Z J, SHI Z K. Architecture optimization for neural networks[J]. Computer Engineering and Applications, 2004, 40(25): 52-54. (in Chinese)
    [23] COSTA M A, BRAGA A P, DE MENEZES B R. Constructive and pruning methods for neural network design[C]//VII Brazilian Symposium on Neural Networks. Washington D C: IEEE, 2002: 49-54.
    [24] STATHAKIS D. How many hidden layers and nodes?[J]. International Journal of Remote Sensing, 2009, 30(8): 2133-2147.
    [25] MOZER M C, SMOLENSKY P. Skeletonization: a technique for trimming the fat from a network via relevance assessment[C]//Advances in Neural Information Processing Systems 1. San Francisco: Morgan Kaufmann Publishers Inc, 1989: 107-115.
    [26] LECUN Y, DENKER J S, SOLLA S A. Optimal brain damage[C]//Advances in Neural Information Processing Systems 2. San Francisco: Morgan Kaufmann Publishers Inc, 1990: 598-605.
    [27] MRAZOVA I, REITERMANOVA Z. A new sensitivity-based pruning technique for feed-forward neural networks that improves generalization[C]//International Joint Conference on Neural Networks. Washington D C: IEEE, 2011: 2143-2150.
    [28] JI C, SNAPP R R, PSALTIS D. Generalizing smoothness constraints from discrete samples[J]. Neural Computation, 1990, 2(2): 188-197.
    [29] WEIGEND A S, RUMELHART D E, HUBERMAN B A. Back-propagation, weight-elimination and time series prediction[C]//Proceedings of the 1990 Connectionist Models Summer School. San Francisco: Morgan Kaufmann Publishers Inc, 1990: 105-116.
    [30] SIETSMA J, DOW R J F. Creating artificial neural networks that generalize[J]. Neural Networks, 1991, 4(1): 67-79.
    [31] SIETSMA J, DOW R J F. Neural net pruning: why and how[C]//IEEE International Conference on Neural Networks. Washington D C: IEEE, 1988: 325-333.
    [32] LEUNG F H F, LAM H K, LING S H, et al. Tuning of the structure and parameters of a neural network using an improved genetic algorithm[J]. IEEE Transactions on Neural Networks, 2003, 14(1): 79-88.
    [33] YEUNG D S, ZENG X Q. Hidden neuron pruning for multilayer perceptrons using a sensitivity measure[C]//International Conference on Machine Learning and Cybernetics. Washington D C: IEEE, 2002: 1751-1757.
    [34] FNAIECH F, FNAIECH N, NAJIM M. A new feedforward neural network hidden layer neuron pruning algorithm[C]//IEEE International Conference on Acoustics, Speech, and Signal Processing. Washington D C: IEEE, 2001: 1277-1280.
    [35] PUKRITTAYAKAMEE A, HAGAN M, RAFF L, et al. A network pruning algorithm for combined function and derivative approximation[C]//International Joint Conference on Neural Networks. Washington D C: IEEE, 2009: 2553-2560.
    [36] SHARMA S K, CHANDRA P. Constructive neural networks: a review[J]. International Journal of Engineering Science & Technology, 2010, 2(12): 7847-7855.
    [37] FREAN M. The upstart algorithm: a method for constructing and training feedforward neural networks[J]. Neural Computation, 1990, 2(2): 198-209.
    [38] FAHLMAN S E, LEBIERE C. The cascade-correlation learning architecture[J]. Advances in Neural Information Processing Systems, 1990, 2: 524-532.
    [39] TSOI A C, HAGENBUCHNER M, MICHELI A. Building MLP networks by construction[C]//IEEE-INNS-ENNS International Joint Conference on Neural Networks. Washington D C: IEEE, 2000: 4549-4549.
    [40] ISLAM M M, SATTAR M A, AMIN M F, et al. A new adaptive merging and growing algorithm for designing artificial neural networks[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2009, 39(3): 705-722.
    [41] SRIDHAR S S, PONNAVAIKKO M. Improved adaptive learning algorithm for constructive neural networks[J]. International Journal of Computer and Electrical Engineering, 2011, 3(1): 30-36.
    [42] FAN J N, WANG Z L, QIAN F. Research progress on structural design of hidden layer in BP artificial neural networks[J]. (in Chinese)
    [43] ABDI H, WILLIAMS L J. Principal component analysis[J]. Wiley Interdisciplinary Reviews: Computational Statistics, 2010, 2(4): 433-459.
    [44] SHLENS J. A tutorial on principal component analysis[EB/OL]. arXiv: 1404.1100, 2014.
    [45] YANG J, ZHANG D, FRANGI A F, et al. Two-dimensional PCA: a new approach to appearance-based face representation and recognition[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2004, 26(1): 131-137.
    [46] LECUN Y, CORTES C, BURGES C J C. The MNIST database of handwritten digits[EB/OL]. [2016-06-10]. http://yann.lecun.com/exdb/mnist
    [47] HINTON G E, SALAKHUTDINOV R R. Supporting online material for "Reducing the dimensionality of data with neural networks"[J]. Science, 2006, 313(5786): 504-507.
Publication history
  • Received: 2016-04-08
  • Published online: 2022-09-13
  • Issue published: 2017-02-01
