Abstract
We consider the problem of path planning in an initially unknown environment, where a robot has no a priori map of its environment but has access to prior information that it accumulated while navigating similar, but not identical, environments. To address this navigation problem, we propose a novel machine learning-based algorithm called Semi-Markov Decision Process with Unawareness and Transfer (SMDPU-T). In SMDPU-T, the robot records its actions around obstacles as action sequences called options, which it then reuses within a framework called Markov Decision Process with Unawareness (MDPU) to learn suitable, collision-free maneuvers around more complex obstacles in the future. We analytically derive bounds on the cost of the option selected by SMDPU-T, as well as the worst-case time complexity of the algorithm. Our experimental results with simulated robots in the Webots simulator show that SMDPU-T takes 24% of the planning time and 39% of the total time that a recent sampling-based path planner needs to solve the same navigation tasks, while our hardware results on a Turtlebot robot indicate that SMDPU-T takes, on average, 53% of the planning time and 60% of the total time of that planner.
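The idea of reusing recorded action sequences as options can be illustrated with a minimal sketch. This is not the SMDPU-T algorithm itself: the `Option` class, the distance-based cost, and the `select_option` helper are hypothetical names and simplifications introduced here only to show what "recording a maneuver and later choosing the cheapest feasible one" might look like.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical action encoding: (forward distance in metres, turn in radians).
Action = Tuple[float, float]


@dataclass
class Option:
    """A recorded maneuver around an obstacle, stored as an action sequence."""
    name: str
    actions: List[Action]

    def cost(self) -> float:
        # Illustrative cost only: total distance travelled by the maneuver.
        return sum(abs(distance) for distance, _ in self.actions)


def select_option(
    options: List[Option],
    is_collision_free: Callable[[Option], bool],
) -> Optional[Option]:
    """Return the cheapest stored option that passes the feasibility check.

    `is_collision_free` stands in for whatever check the robot performs
    against the obstacle it currently perceives (an assumption of this sketch).
    """
    feasible = [option for option in options if is_collision_free(option)]
    if not feasible:
        return None  # no stored maneuver fits; a new one would have to be learned
    return min(feasible, key=lambda option: option.cost())
```

A library of such options could be built in one environment and queried in another; when no stored option is feasible, the sketch returns `None`, which corresponds loosely to the case where the robot must learn a new maneuver.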
Original language | English (US) |
---|---|
Pages (from-to) | 2071-2093 |
Number of pages | 23 |
Journal | Autonomous Robots |
Volume | 43 |
Issue number | 8 |
State | Published - Dec 1 2019 |
Keywords
- Markov decision processes with unawareness
- Options
- Reinforcement learning
- Robot path planning
- Transfer learning
ASJC Scopus subject areas
- Artificial Intelligence