ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

Research on optimization method of convolutional neural network based on visualization

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2020.07.013
  • Received Date: 24 May 2020
  • Accepted Date: 24 June 2020
  • Rev Recd Date: 24 June 2020
  • Publish Date: 31 July 2020
  • With the growth of computing power, deep learning is being applied to an ever wider range of problems. However, designing and tuning deep learning models remains difficult: for a complex model, adjusting even a single layer of the network can produce very different results. Researchers typically tune hyperparameters based on past experience and extensive trial and error, which wastes considerable time and effort. Based on the data characteristics of convolutional neural network models, this paper proposes a visualization-based method for assisting parameter tuning. The internal data of a convolutional neural network are visualized and the information they represent is analyzed, so that model faults can be located quickly and parameters adjusted in a targeted way, reducing the difficulty of tuning and improving researchers' efficiency.
  • [1]
    GOODFELLOW I, BENGIO Y, COURVILLE A. Deep Learning[M]. MIT press, 2016.
    [2]
    KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]//Advances in Neural Information Processing Systems. 2012: 1097-1105.
    [3]
    HE T, ZHANG Z, ZHANG H, et al. Bag of tricks for image classification with convolutional neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 558-567.
    [4]
    KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]//Advances in Neural Information Processing Systems. 2012: 1097-1105.
    [5]
    ZEILER M D, FERGUS R. Visualizing and understanding convolutional networks [C]// Computer Vision - ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8689. Springer, Cham, 2014: 818-833.
    [6]
    YOSINSKI J, CLUNE J, NGUYEN A, et al. Understanding neural networks through deep visualization[EB/OL].(2015-06-22)[2020-04-24]. https://arxiv.org/abs/1506.06579.
    [7]
    LIU M, SHI J, LI Z, et al. Towards better analysis of deep convolutional neural networks[J]. IEEE Transactions on Visualization and Computer Graphics, 2016, 23(1): 91-100.
    [8]
    DUMOULIN V, VISIN F. A guide to convolution arithmetic for deep learning[EB/OL].(2018-01-11)[2020-04-24]. https://arxiv.org/abs/1603.07285.
    [9]
    SIBI P, JONES S A, SIDDARTH P. Analysis of different activation functions using back propagation neural networks[J]. Journal of Theoretical and Applied Information Technology, 2013, 47(3): 1264-1268.
    [10]
    IOFFE S, SZEGEDY C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[C]// Proceedings of the 32nd International Conference on International Conference on Machine Learning ,2015: 448-456; doi: 10.5555/3045118.3045167.
    [11]
    YU D, WANG H, CHEN P, et al. Mixed pooling for convolutional neural networks[C]//International conference on rough sets and knowledge technology.Springer, Cham, 2014: 364-375.
    [12]
    DAHL G E, SAINATH T N, HINTON G E. Improving deep neural networks for LVCSR using rectified linear units and dropout[C]//2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013: 8609-8613.
    [13]
    SPENCE R. Information Visualization[M]. New York: Addison-Wesley, 2001.
    [14]
    ABADI M, BARHAM P, CHEN J, et al. Tensorflow: A system for large-scale machine learning[C]//12th USENIX Symposium on Operating Systems Design and Implementation (OSDI’ 16). 2016: 265-283.
    [15]
    LIU Z, LUO P, WANG X, et al. Large-scale celebfaces attributes (celeba) dataset[J]. Retrieved August, 2018, 15: 2018.
    [16]
    JOHNSON S C. Hierarchical clustering schemes[J]. Psychometrika, 1967, 32(3): 241-254.
    [17]
    BOTEV Z I, GROTOWSKI J F, KROESE D P. Kernel density estimation via diffusion[J]. The Annals of Statistics, 2010, 38(5): 2916-2957.
    [18]
    NGUYEN H V, BAI L. Cosine similarity metric learning for face verification[C]//Asian conference on computer vision.Springer, Berlin, Heidelberg, 2010: 709-720.
    [19]
    RADFORD A, METZ L, CHINTALA S. Unsupervised representation learning with deep convolutional generative adversarial networks[J]. 2015 arXiv:1511.06434.)
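To make the idea concrete, the sketch below shows one common way of inspecting the internal data of a convolutional neural network: extracting and plotting the feature maps of a chosen convolutional layer, where problems such as dead or near-duplicate filters become visible and can guide targeted parameter adjustment. This is only an illustrative sketch, not the tool described in the paper; it assumes TensorFlow/Keras and matplotlib are available, and the function, model, and layer names (`show_feature_maps`, `cnn_model.h5`, `conv2d_1`) are placeholders.

```python
# Illustrative sketch (not the paper's implementation): plot the feature maps of
# one convolutional layer for a single input image, so that dead or redundant
# filters can be spotted by eye and the corresponding layer tuned accordingly.
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt


def show_feature_maps(model, layer_name, image, max_maps=16):
    """Run `image` through `model` and plot the activations of `layer_name`."""
    layer = model.get_layer(layer_name)
    # Sub-model that exposes the chosen layer's output.
    probe = tf.keras.Model(inputs=model.inputs, outputs=layer.output)
    # Add a batch dimension and compute the activations: shape (H, W, C).
    maps = probe(np.expand_dims(image, axis=0)).numpy()[0]

    n = min(max_maps, maps.shape[-1])
    cols = 4
    rows = int(np.ceil(n / cols))
    fig, axes = plt.subplots(rows, cols, figsize=(2 * cols, 2 * rows))
    for i, ax in enumerate(np.asarray(axes).flat):
        ax.axis("off")
        if i < n:
            ax.imshow(maps[..., i], cmap="viridis")
            ax.set_title(f"filter {i}", fontsize=8)
    fig.suptitle(f"Feature maps of layer '{layer_name}'")
    plt.tight_layout()
    plt.show()


# Example usage (hypothetical file and layer names):
# model = tf.keras.models.load_model("cnn_model.h5")
# show_feature_maps(model, "conv2d_1", test_image)
```

A feature map that is uniformly zero across many inputs, for example, suggests a dead filter and may point to an unsuitable learning rate or activation function for that layer, which is the kind of targeted diagnosis the proposed visualization method aims to support.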

