Authors: Mohamed Oubbati, Christian Fischer, Günther Palm
DOI: 10.1007/978-3-319-08864-8_16
Keywords:
Abstract: Goal-driven agents are generally expected to be capable of pursuing a variety of goals simultaneously. As these goals may compete in certain circumstances, the agent must be able to constantly trade them off and shift their priorities in a rational way. One aspect of rationality is for the agent to evaluate its needs and make decisions accordingly. We endow the agent with a set of needs, or drives, that change over time as a function of external stimuli and internal consumption, and the decision-making process has to generate actions that maintain a balance between these needs. The proposed framework pursues an approach in which decision making is considered as a multiobjective problem and approximately solved using a hierarchical reinforcement learning architecture. At the higher level, Q-learning learns to select the best strategy that improves the well-being of the agent. At the lower level, an actor-critic design executes the selected strategy while interacting with a continuous, partially observable environment. We provide simulation results to demonstrate the efficiency of the approach.
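
To make the two-level idea concrete, here is a minimal Python sketch of the higher level only. It is an illustration under stated assumptions, not the authors' implementation: the drive dynamics, the `well_being` measure, the constants, and all function names are hypothetical, and the lower-level actor-critic controller is replaced by a stub that simply reduces the drive targeted by the selected strategy.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
N_DRIVES = 2                  # hypothetical needs, e.g. "energy" and "rest"
CONSUMPTION = [0.05, 0.03]    # assumed per-step internal growth of each need
SATISFACTION = 0.2            # assumed reduction when a strategy serves a need


def step_drives(drives, strategy):
    """Grow every drive by internal consumption, then reduce the one served.

    The reduction stands in for the external stimulus that a lower-level
    controller (an actor-critic in the abstract) would obtain; here it is a stub.
    Values are clipped to [0, 1].
    """
    new = []
    for i, d in enumerate(drives):
        d += CONSUMPTION[i]
        if i == strategy:
            d -= SATISFACTION
        new.append(min(1.0, max(0.0, d)))
    return new


def well_being(drives):
    """Assumed scalar measure: close to 1 when every need is low."""
    return 1.0 - sum(d * d for d in drives) / len(drives)


def state_of(drives):
    """Coarse state for the tabular learner: index of the most urgent drive."""
    return max(range(len(drives)), key=lambda i: drives[i])


def train(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning over strategies, rewarded by the change in well-being."""
    rng = random.Random(seed)
    q = [[0.0] * N_DRIVES for _ in range(N_DRIVES)]
    drives = [0.5] * N_DRIVES
    for _ in range(steps):
        s = state_of(drives)
        # Epsilon-greedy choice of which need to serve next.
        if rng.random() < epsilon:
            a = rng.randrange(N_DRIVES)
        else:
            a = max(range(N_DRIVES), key=lambda k: q[s][k])
        before = well_being(drives)
        drives = step_drives(drives, a)
        reward = well_being(drives) - before
        s2 = state_of(drives)
        # Standard one-step Q-learning update.
        q[s][a] += alpha * (reward + gamma * max(q[s2]) - q[s][a])
    return q
```

Calling `train()` returns a small Q-table indexed by (most urgent drive, chosen strategy). Using the change in well-being as the reward is one simple way to turn the multiobjective balance between needs into a scalar signal; the paper may define this differently.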