Robot navigation is a central problem in extraterrestrial environments, and a navigation algorithm that allows a robot to quickly yet precisely avoid initially unknown obstacles is essential for efficient operation. In this paper, we consider reinforcement learning, a well-known machine learning framework, for robot navigation and investigate a technique for adaptively adjusting the rewards associated with robot maneuvers or actions within this framework. Most reinforcement learning techniques rely on hand-coded, simplistic reward functions, which may fail to determine the most appropriate actions when the robot must perform tasks with new features. To address this problem, we propose an algorithm called IRL-SMDPT (Inverse Reinforcement Learning in Semi-Markov Decision Processes with Transfer), which uses an inverse reinforcement learning technique called Distance Minimization Inverse Reinforcement Learning (DM-IRL) to estimate an appropriate reward function and thereby improve the robot's navigation in complicated environments. Our experimental results show that IRL-SMDPT improves robot navigation by estimating the rewards of trajectories more accurately than random and greedy reward variants, and that it is robust to small errors or noise in trajectory scoring.
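To make the reward-estimation idea concrete, the following is a minimal sketch of the core of distance-minimization IRL under a common simplifying assumption: rewards are linear in trajectory features, so the weight vector is recovered by minimizing the distance between predicted scores and the scores assigned to demonstrated trajectories. This is an illustrative toy, not the paper's exact formulation; the function name `dmirl_weights` and the ridge term `reg` are assumptions introduced here.

```python
# Hypothetical sketch of the distance-minimization idea behind DM-IRL:
# assume the reward of a trajectory is linear in its accumulated features,
# and recover the weight vector w by minimizing || Phi @ w - s ||^2.
import numpy as np

def dmirl_weights(Phi, s, reg=1e-6):
    """Least-squares estimate of reward weights from scored trajectories.

    Phi : (n_trajectories, n_features) accumulated feature counts
    s   : (n_trajectories,) scalar scores assigned to each trajectory
    reg : small ridge term for numerical stability (an assumption here)
    """
    A = Phi.T @ Phi + reg * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ s)

# Toy example: scores generated from known weights [2.0, -1.0]
rng = np.random.default_rng(0)
Phi = rng.random((50, 2))
w_true = np.array([2.0, -1.0])
s = Phi @ w_true            # noiseless scores, for illustration only
w_hat = dmirl_weights(Phi, s)
```

With noiseless scores the recovered weights `w_hat` are essentially `w_true`; adding noise to `s` would exercise the robustness property the abstract claims, with the estimate degrading gracefully rather than catastrophically.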