TY - GEN
T1 - Adaptive multi-robot team reconfiguration using a policy-reuse reinforcement learning approach
AU - Dasgupta, Prithviraj
AU - Cheng, Ke
AU - Banerjee, Bikramjit
PY - 2012
Y1 - 2012
N2 - We consider the problem of dynamically adjusting the formation and size of robot teams performing distributed area coverage, when they encounter obstacles or occlusions along their path. Based on our earlier formulation of the robotic team formation problem as a coalitional game called a weighted voting game (WVG), we show that the robot team size can be dynamically adapted by adjusting the WVG's quota parameter. We use a Q-learning algorithm to learn the value of the quota parameter and a policy reuse mechanism to adapt the learning process to changes in the underlying environment. Experimental results using simulated e-puck robots within the Webots simulator show that our Q-learning algorithm converges within a finite number of steps in different types of environments. Using the learning algorithm also improves the performance of an area coverage application where multiple robot teams move in formation to explore an initially unknown environment by 5–10%.
AB - We consider the problem of dynamically adjusting the formation and size of robot teams performing distributed area coverage, when they encounter obstacles or occlusions along their path. Based on our earlier formulation of the robotic team formation problem as a coalitional game called a weighted voting game (WVG), we show that the robot team size can be dynamically adapted by adjusting the WVG's quota parameter. We use a Q-learning algorithm to learn the value of the quota parameter and a policy reuse mechanism to adapt the learning process to changes in the underlying environment. Experimental results using simulated e-puck robots within the Webots simulator show that our Q-learning algorithm converges within a finite number of steps in different types of environments. Using the learning algorithm also improves the performance of an area coverage application where multiple robot teams move in formation to explore an initially unknown environment by 5–10%.
KW - Q-learning
KW - coalition game
KW - multi-robot formation
UR - http://www.scopus.com/inward/record.url?scp=84855926230&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84855926230&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-27216-5_23
DO - 10.1007/978-3-642-27216-5_23
M3 - Conference contribution
AN - SCOPUS:84855926230
SN - 9783642272158
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 330
EP - 345
BT - Advanced Agent Technology - AAMAS 2011 Workshops, AMPLE, AOSE, ARMS, DOCM3AS, ITMAS, Revised Selected Papers
T2 - International Conference on Autonomous Agents and Multi-Agent Systems, AAMAS 2011
Y2 - 2 May 2011 through 6 May 2011
ER -