Authors: Bettina Könighofer, Scott Niekum, Roderick Bloem, Ufuk Topcu, Ruediger Ehlers
DOI:
Keywords:
Abstract: …, we achieve safe reinforcement learning, which we … Safe RL is the process of learning an optimal policy while satisfying a temporal logic safety specification ϕs during the learning …