Multi-agent path planning based on improved double DQN

Chen Zhang; Wenying Jiang; Siyuan Chen; Wen Zhou; Fengting Yan

Journal Article, Open Access

Journal of Image and Graphics (2023) 28(7) 2167-2181

DOI: 10.11834/jig.211239

Abstract

Objective Evacuation drills such as fire escape drills are commonly organized to improve training outcomes and fire-safety awareness. Gaining sufficient evacuation experience takes repeated drills, which are costly for organizers. Drills also depend on the drill venue, the physical condition of the participants, and real-time position information. Virtual reality technology can guide virtual fire escapes at lower cost and risk and with higher reliability. To simulate emergency drills in virtual scenarios, multi-agent path planning has therefore received growing attention. Method We develop an improved double deep Q network (DQN) framework. Specifically, the virtual scenario is built by collecting campus information, including multiple agents, obstacles, exits, fire-affected areas, and other related factors. Since all agents are assumed to be on the same plane, the scene can be converted into a two-dimensional grid through gridding and coordinate transformation. Cells in the two-dimensional grid plane m are colored differently to represent obstacles, fire-affected areas, exits, and agent locations. According to the locations of the agents in the virtual scene, the grid plane m is layered into grid planes m1 and m2, of sizes 64 × 100 and 48 × 100, respectively. The double deep Q network uses two Q networks with the same structure, Q1 and Q2, each built from two kinds of layers: convolutional and fully connected. Their input sizes match the grid planes m1 and m2 obtained after environment stratification.
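
As an illustration of the grid encoding described in the Method part above, the following is a minimal sketch, not the authors' implementation. The integer cell codes, the helper names `make_plane` and `split_rows`, the sample coordinates, and the assumption that m is a 112 × 100 plane split row-wise into m1 (64 × 100) and m2 (48 × 100) are all hypothetical; the abstract gives only the two layer sizes.

```python
# Hypothetical integer codes standing in for the cell "colors" of grid plane m.
FREE, OBSTACLE, FIRE, EXIT, AGENT = 0, 1, 2, 3, 4


def make_plane(rows, cols, obstacles=(), fire=(), exits=(), agents=()):
    """Build a rows x cols grid plane and mark each listed (row, col) cell."""
    plane = [[FREE] * cols for _ in range(rows)]
    for code, cells in ((OBSTACLE, obstacles), (FIRE, fire),
                        (EXIT, exits), (AGENT, agents)):
        for r, c in cells:
            plane[r][c] = code
    return plane


def split_rows(plane, top_rows):
    """Layer the plane into an upper part and a lower part (assumed row-wise split)."""
    return plane[:top_rows], plane[top_rows:]


# Assumed full plane of 112 x 100 cells, so the layers come out as 64 x 100 and 48 x 100.
m = make_plane(
    112, 100,
    obstacles=[(10, 20)],
    fire=[(70, 5)],
    exits=[(0, 99), (111, 0)],
    agents=[(5, 5), (100, 50)],
)
m1, m2 = split_rows(m, 64)
print(len(m1), len(m1[0]), len(m2), len(m2[0]))
```

Each layer can then serve as the fixed-size input of its own network, which is the role the abstract assigns to Q1 and Q2.
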
For grid planes of the same size as m1 and m2, trainable grid planes m'1t and m'2t are obtained by randomly placing the same number of 1 × 1 black blocks to represent obstacle locations, and by generating planes for all different starting positions to represent every state of the agent in the scene. These planes are used to initialize the experience pools D1 and D2 and to train networks Q1 and Q2. In actual evacuation drills, the evacuation of the crowd is not completely independent and discrete. Because people are social, social relationships exist among the evacuees, and crowds often show "gathering and following" behavior. In addition, to make an actual drill run more smoothly, organizers often station a number of guides at different locations to help participants evacuate. Hence, our framework adds guides to the virtual scenario and implements a multi-agent grouping strategy based on an improved k-medoids algorithm. Agents are grouped according to their locations and relationships, a guiding agent is selected for each group, and the other agents in the group evacuate under its lead, with paths planned by the improved double deep Q network algorithm described above. This further improves the reliability and efficiency of evacuation. Result Extensive experiments are carried out to validate the proposed method. During training, the network Q3 of the traditional DQN method converges after 24 000 training batches, while the Q1 and Q2 networks converge after about 3 000. The proposed method therefore converges significantly faster and more stably than the traditional DQN method.
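
For context on the comparison with traditional DQN, here is a minimal sketch of the standard double DQN target next to the plain DQN target. It shows the textbook formulation only and does not reproduce the paper's improved variant. The function names and the "online" and "target" value lists are placeholders and are not meant to correspond to the paper's Q1, Q2, or Q3.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Plain DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Standard double DQN: the online network selects the next action,
    and the target network evaluates that action."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Toy Q-values for three actions in the next state (made-up numbers).
online_q = [0.2, 0.9, 0.1]
target_q = [0.5, 0.3, 0.8]

y_dqn = dqn_target(1.0, target_q, gamma=0.5)
y_ddqn = double_dqn_target(1.0, online_q, target_q, gamma=0.5)
print(y_dqn, y_ddqn)
```

Separating action selection from action evaluation is what reduces the overestimation bias of the plain max operator; in the toy numbers the double DQN target is the smaller of the two.
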
To assess the evacuation efficiency and safety of agents in fire scenarios, the average health evacuation value (AHEP) is used to evaluate the evacuation effect. On the AHEP criterion, the proposed method scores about 84% and 104% higher than the traditional path planning methods A* and Dijkstra, respectively. Compared with the extended A* algorithm and the Dijkstra-ACO hybrid algorithm designed for changing fire scenes, it is 30% and 21% higher, respectively; compared with the DQN algorithm, it is 20% higher. Evacuation efficiency and safety both improve, and the planned paths give a better evacuation effect. Furthermore, to verify the evacuation effect under different groupings, we compare the AHEP values under four grouping settings: 4, 5, 6, and 7. With a grouping of 6, the value is highest, being 17%, 13%, and 6% higher than with groupings of 4, 5, and 7, respectively. The results show that appropriate grouping of the agents can improve their evacuation efficiency. Conclusion The proposed method can improve evacuation efficiency and safety to a certain extent.
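
The grouping step mentioned in the Method part can be illustrated with a plain k-medoids sketch over agent grid positions, where each medoid plays the role of a group's guiding agent. This is an assumption-laden illustration: it is ordinary alternating k-medoids with Manhattan distance and a naive "first k points" initialization, not the paper's improved algorithm, and it ignores the social relationships the paper also uses. It assumes agent positions are distinct.

```python
def manhattan(a, b):
    """Grid distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def assign(points, medoids):
    """Map each medoid to the list of points closest to it."""
    groups = {m: [] for m in medoids}
    for p in points:
        nearest = min(medoids, key=lambda m: manhattan(p, m))
        groups[nearest].append(p)
    return groups


def k_medoids(points, k, max_iter=100):
    """Plain alternating k-medoids; returns {guide_position: member_positions}."""
    medoids = list(points[:k])  # naive deterministic initialization (assumption)
    for _ in range(max_iter):
        groups = assign(points, medoids)
        # In each group, pick the member with the smallest total distance to the rest.
        new_medoids = [
            min(members, key=lambda c: sum(manhattan(c, q) for q in members))
            for members in groups.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return assign(points, medoids)


# Six hypothetical agents forming two obvious clusters on the grid.
agents = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
groups = k_medoids(agents, k=2)
for guide, members in groups.items():
    print("guide", guide, "leads", members)
```

Choosing a medoid rather than a centroid means the group leader is always an actual agent position, which fits the idea of selecting a guiding agent from within each group.
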

Cite

Zhang, C., Jiang, W., Chen, S., Zhou, W., & Yan, F. (2023). Multi-agent path planning based on improved double DQN. Journal of Image and Graphics, 28(7), 2167–2181. https://doi.org/10.11834/jig.211239
