Learning attention-based strategies to cooperate for multi-agent path finding

Jinchao Ma (马金超); Defu Lian (连德富)

doi:10.52396/JUSTC-2022-0048

Learning attention-based strategies to cooperate for multi-agent path finding

Abstract

Abstract: Multi-agent path finding (MAPF) is a challenging problem in multi-agent systems in which all agents must reach their goal locations efficiently without colliding with one another or with obstacles. Three difficulties stand out in MAPF: extracting and representing each agent's observations effectively, making use of historical information, and communicating efficiently with neighboring agents. To tackle these issues, we propose a model that uses the local states of nearby agents and outputs an optimal action for each agent to execute. The local observation encoder uses a residual attention CNN to extract features from local observations, and an interaction layer built on the Transformer architecture combines the local observations of agents. To compensate for the limitations of success rate as an evaluation metric, we also introduce a new metric, the extra time rate (ETR). Experimental results show that our model outperforms most previous models in terms of success rate and ETR. An ablation study further demonstrates the effectiveness of each component of the model.
