ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

Robot control policy transfer based on progressive neural network

Cite this: https://doi.org/10.3969/j.issn.0253-2778.2019.10.006
  • Received Date: 24 December 2018
  • Accepted Date: 16 May 2019
  • Revision Received Date: 16 May 2019
  • Publish Date: 31 October 2019
  • Abstract: In the field of robotic control, it is appealing to solve complicated control tasks with deep learning techniques. However, collecting enough robot operating data to train deep learning models is difficult. This paper therefore proposes a transfer approach based on the progressive neural network (PNN) and deep deterministic policy gradient (DDPG). By linking the current task model to the pretrained task models in the model pool through a novel structure, the control policies stored in the pretrained models are transferred to the current task model. Simulation experiments validate that the proposed approach successfully transfers control policies learned on the source task to the current task and, compared with other baselines, takes remarkably less time to reach the same performance in all experiments.
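The abstract only outlines the architecture, but the core idea, a new DDPG actor column that receives lateral connections from a frozen, pretrained column, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the PyTorch framing, the layer sizes, and the names SourceColumn and ProgressiveActor are all assumptions made for the example.

# Minimal sketch (not the paper's code) of a PNN-style DDPG actor in PyTorch.
import torch
import torch.nn as nn


class SourceColumn(nn.Module):
    """Pretrained actor column from the model pool; its weights stay frozen."""

    def __init__(self, state_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.fc1 = nn.Linear(state_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, state):
        h1 = torch.relu(self.fc1(state))
        h2 = torch.relu(self.fc2(h1))
        return h1, h2  # hidden activations feed the lateral connections


class ProgressiveActor(nn.Module):
    """New actor column that receives lateral connections from the source column."""

    def __init__(self, state_dim: int, action_dim: int, source: SourceColumn,
                 hidden_dim: int = 64):
        super().__init__()
        self.source = source
        for p in self.source.parameters():  # freeze the pretrained column
            p.requires_grad_(False)
        self.fc1 = nn.Linear(state_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)
        self.lat2 = nn.Linear(hidden_dim, hidden_dim)      # lateral adapter: source h1 -> layer 2
        self.out = nn.Linear(hidden_dim, action_dim)
        self.lat_out = nn.Linear(hidden_dim, action_dim)    # lateral adapter: source h2 -> output

    def forward(self, state):
        with torch.no_grad():  # source features are used but never updated
            s_h1, s_h2 = self.source(state)
        h1 = torch.relu(self.fc1(state))
        h2 = torch.relu(self.fc2(h1) + self.lat2(s_h1))
        return torch.tanh(self.out(h2) + self.lat_out(s_h2))  # action in [-1, 1]


# Usage: only the new column and its lateral adapters receive gradients from the
# DDPG policy update; the pretrained column acts as a fixed feature extractor.
actor = ProgressiveActor(state_dim=8, action_dim=2, source=SourceColumn(state_dim=8))
action = actor(torch.randn(1, 8))

In this reading, transfer happens because the new column can reuse the frozen column's intermediate features through the learned lateral adapters while remaining free to learn task-specific behavior of its own.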
