LI Yujian, SHEN Chengkai, YANG Hongli, HU Haihe. PCA Shuffling Initialization of Convolutional Neural Networks[J]. JOURNAL OF MECHANICAL ENGINEERING, 2017, 43(1): 22-27. doi: 10.11936/bjutxb2016060070
Abstract:
To better initialize convolutional neural networks, an effective method named principal component analysis (PCA) Shuffling initialization is proposed. The method consists of three steps. First, for the first convolutional layer, all receptive fields of each feature map are sampled over the training set. Then, PCA is performed on the sampled image patches separately for each feature map, and the resulting projection matrix is used to initialize the filters of the first convolutional layer. Finally, the first two steps are repeated layer by layer on the remaining convolutional layers. Experimental results on the MNIST and CIFAR-10 datasets show that the proposed initialization achieves better accuracy and faster convergence than common methods such as random initialization and Xavier initialization.
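The first two steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`extract_patches`, `pca_components`, `pca_init_filters`), the stride-1 unpadded sampling, and the `(N, C, H, W)` input layout are all assumptions made for illustration, and the shuffling step is omitted because the abstract does not describe it.

```python
import numpy as np


def extract_patches(channel_maps, k):
    """Collect every k x k patch (stride 1, no padding) from maps of shape (N, H, W)."""
    windows = np.lib.stride_tricks.sliding_window_view(
        channel_maps, (k, k), axis=(1, 2)
    )  # shape: (N, H-k+1, W-k+1, k, k)
    return windows.reshape(-1, k * k)


def pca_components(patches, n_components):
    """Return the top principal directions of the patches, one per row."""
    centered = patches - patches.mean(axis=0)
    cov = centered.T @ centered / max(len(centered) - 1, 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order].T


def pca_init_filters(inputs, n_filters, k):
    """Build filters of shape (n_filters, C, k, k) from inputs of shape (N, C, H, W).

    Assumption: PCA is run separately on each input feature map (channel),
    and the leading components fill that channel's slice of every filter.
    """
    _, c, _, _ = inputs.shape
    if n_filters > k * k:
        raise ValueError("per-channel PCA yields at most k*k components")
    weights = np.empty((n_filters, c, k, k))
    for ch in range(c):
        patches = extract_patches(inputs[:, ch], k)
        components = pca_components(patches, n_filters)
        weights[:, ch] = components.reshape(n_filters, k, k)
    return weights
```

For deeper layers, the same routine would presumably be applied to the feature maps produced by the already-initialized preceding layers, in line with the layer-wise third step.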