Author: Taylor Annette Kessler Faulkner
DOI:
Keywords:
Abstract: The ability to adapt and learn can help robots deployed in dynamic and varied environments. In the wild, the data robots have access to includes input from their sensors and from the humans around them. Utilizing human data increases the usable information in the environment; however, human data can be noisy, particularly when acquired from non-experts. Rather than requiring expensive expert teachers for learning robots, my research addresses methods for learning from imperfect human teachers. These methods use Human-in-the-loop Reinforcement Learning (HRL), which gives robots both a reward function and input from human teachers. This dissertation shows that actively modifying which states receive feedback from imperfect, unmodeled human teachers can improve the speed and dependability of HRL. This body of work addresses a bipartite model of imperfect teachers, in which humans can be inattentive or inaccurate. First, I present two algorithms for learning from inattentive teachers; these take advantage of intermittent attention from humans by adjusting state-action exploration, improving the learning speed of a Markovian HRL algorithm and giving teachers more free time to complete other tasks. Second, I present two algorithms for learning from inaccurate teachers who give incorrect information to a robot. These algorithms estimate areas of the state space that are likely to receive incorrect feedback from human teachers, and can be used to filter messy, inaccurate data into information usable by a robot, performing dependably over a wide variety of inputs …
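To make the HRL setting concrete, the following is a minimal sketch (not the dissertation's actual algorithms) of Q-learning on a toy chain environment where a simulated teacher is only intermittently attentive; when attentive, the teacher's feedback is blended into the environment reward. All names, weights, and the attention model are illustrative assumptions.

```python
import random

# Illustrative sketch: human-in-the-loop Q-learning on a 5-state chain MDP.
# The simulated teacher watches only part of the time (inattentive teacher);
# when attentive, their feedback shifts the reward signal. All constants and
# the teacher model are hypothetical, chosen only for this toy example.

N_STATES, ACTIONS = 5, [0, 1]          # chain of 5 states; action 0 = left, 1 = right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1      # learning rate, discount, exploration rate
FEEDBACK_WEIGHT = 0.5                  # how strongly human feedback shapes reward

def env_step(s, a):
    """Deterministic chain: reaching the rightmost state yields reward 1."""
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0), s2 == N_STATES - 1

def teacher(a, attentive):
    """Simulated teacher: approves moving right, disapproves left, only if attentive."""
    if not attentive:
        return 0.0
    return 1.0 if a == 1 else -1.0

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]
for episode in range(200):
    s, done = 0, False
    while not done:
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[s][x])
        s2, r, done = env_step(s, a)
        attentive = random.random() < 0.5          # teacher watches ~half the time
        shaped = r + FEEDBACK_WEIGHT * teacher(a, attentive)
        target = shaped if done else shaped + GAMMA * max(Q[s2])
        Q[s][a] += ALPHA * (target - Q[s][a])
        s = s2

# Greedy policy after training: should move right in every non-terminal state.
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES)]
print(policy)
```

Even with the teacher absent half the time, the occasional feedback accelerates learning relative to the sparse environment reward alone; the dissertation's contribution is to exploit such intermittent attention deliberately, rather than treating it as noise.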