Volume 43, Issue 1 (January 2017)
Citation: LI Yujian, SHEN Chengkai, YANG Hongli, HU Haihe. PCA Shuffling Initialization of Convolutional Neural Networks[J]. Journal of Beijing University of Technology, 2017, 43(1): 22-27. doi: 10.11936/bjutxb2016060070

PCA Shuffling Initialization of Convolutional Neural Networks

doi: 10.11936/bjutxb2016060070
  • Received Date: 23 Jun 2016
  • Available Online: 09 Sep 2022
  • Issue Publish Date: 01 Jan 2017
  • Abstract: To better initialize convolutional neural networks, an effective method named principal component analysis (PCA) shuffling initialization was proposed. The method consisted of three steps. First, for the first convolutional layer, all receptive-field patches of each feature map were sampled from the training set. Then, principal component analysis was performed on the sampled patches separately for each feature map, and the resulting projection matrix was used to initialize the filters of the first convolutional layer. Finally, the first two steps were applied layer-wise to the remaining convolutional layers. Experimental results on the MNIST and CIFAR-10 datasets show that the proposed initialization achieves higher accuracy and faster convergence than common methods such as random initialization and Xavier initialization. (A code sketch of the first-layer step is given below.)
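
The following is a minimal sketch of the first-layer step described in the abstract, not the authors' implementation (the paper's experiments used Caffe): sample receptive-field patches from the training images, run PCA on them, and use the leading principal directions as the initial convolution kernels. The function names sample_patches and pca_filters, and all parameter values, are illustrative assumptions; the per-feature-map treatment, the shuffling step, and the layer-wise extension to deeper layers are not reproduced here.

import numpy as np

def sample_patches(images, kernel_size, n_patches=10000, rng=None):
    """Randomly sample receptive-field patches from images of shape (N, C, H, W).

    Returns an array of shape (n_patches, C * kernel_size * kernel_size).
    """
    rng = np.random.default_rng(rng)
    n, c, h, w = images.shape
    patches = np.empty((n_patches, c * kernel_size * kernel_size), dtype=np.float64)
    for i in range(n_patches):
        img = images[rng.integers(n)]
        y = rng.integers(0, h - kernel_size + 1)
        x = rng.integers(0, w - kernel_size + 1)
        patches[i] = img[:, y:y + kernel_size, x:x + kernel_size].ravel()
    return patches

def pca_filters(patches, n_filters, kernel_size, channels):
    """Run PCA on the patches and return the top principal directions as conv filters."""
    patches = patches - patches.mean(axis=0, keepdims=True)   # center the data
    cov = patches.T @ patches / (patches.shape[0] - 1)         # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)                     # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :n_filters]                      # leading principal directions
    # Each principal direction becomes one convolution kernel of the first layer.
    return top.T.reshape(n_filters, channels, kernel_size, kernel_size)

if __name__ == "__main__":
    # Stand-in for MNIST-like training images: 1000 grayscale 28x28 images.
    images = np.random.rand(1000, 1, 28, 28)
    patches = sample_patches(images, kernel_size=5, n_patches=20000, rng=0)
    weights = pca_filters(patches, n_filters=25, kernel_size=5, channels=1)
    print(weights.shape)   # (25, 1, 5, 5), ready to copy into the first conv layer

Note that PCA yields at most C·k·k orthogonal directions (25 for 5×5 grayscale patches), so the sketch caps the number of filters at the patch dimensionality; how the method assigns components when a layer has more filters than that is not recoverable from the abstract alone.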

     

  • [1]
    KRIZHEVSKY A, SUTSKEVER I, HINTON G E.ImageNet classification with deep convolutional neural networks[J]. Advances in Neural Information Processing Systems, 2012, 25(2): 2012.
[2] THIMM G, FIESLER E. Neural network initialization[C]//International Workshop on Artificial Neural Networks: From Natural to Artificial Neural Computation. Berlin: Springer-Verlag, 1995: 535-542.
[3] GLOROT X, BENGIO Y. Understanding the difficulty of training deep feedforward neural networks[C]//Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS). JMLR Workshop and Conference Proceedings, 2010, 9: 249-256.
[4] BENGIO Y. Practical recommendations for gradient-based training of deep architectures[M]//Neural Networks: Tricks of the Trade. 2nd ed. Berlin: Springer, 2012: 437-478.
[5] ABDI H, WILLIAMS L J. Principal component analysis[J]. Wiley Interdisciplinary Reviews: Computational Statistics, 2010, 2(4): 433-459.
[6] SHLENS J. A tutorial on principal component analysis[EB/OL]. arXiv:1404.1100, 2014.
[7] JIA Y, SHELHAMER E, DONAHUE J, et al. Caffe: convolutional architecture for fast feature embedding[C]//Proceedings of the 22nd ACM International Conference on Multimedia. New York: ACM, 2014: 675-678.
[8] LECUN Y. MNIST[DS/OL]. [2016-06-23]. http://yann.lecun.com/exdb/mnist/.
[9] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[10] KRIZHEVSKY A. Learning multiple layers of features from tiny images[D]. Toronto: University of Toronto, 2009.
[11] KRIZHEVSKY A. Cuda-convnet[CP/OL]. 2012 [2016-06-23]. https://code.google.com/p/cuda-convnet/source/browse/trunk/example-layers/layers-80sec.cfg.