Author: Jonas Karlsson
DOI:
Keywords:
Abstract: In many domains, the task can be decomposed into a set of independent sub-goals. Often, such tasks are too complex to be learned using standard techniques such as Reinforcement Learning. The complexity arises because the learning system must keep track of the status of all sub-goals concurrently. Thus, if the solution to one sub-goal is known only when another sub-goal is in some given state, it must be relearned whenever that other sub-goal changes. This dissertation presents a modular approach to reinforcement learning that takes advantage of the decomposition to avoid unnecessary relearning. In this approach, separate modules are created to learn each sub-goal. Each module receives only those inputs relevant to its associated sub-goal, and can therefore learn without being affected by the state of the other sub-goals. Furthermore, each module searches a much smaller space than the one defined by all sub-goals considered together, thereby greatly reducing learning time. Since each module learns how to achieve a separate sub-goal, at any given time it may recommend an action different from those recommended by the other modules. To select the action that best satisfies as many sub-goals as possible, a simple arbitration strategy is used. One strategy, explored in this dissertation, is called greatest mass; it simply combines the utilities reported by the modules and selects the action with the largest combined utility. Since the modular approach limits and separates the information available to the modules, its solutions may necessarily differ from those of a standard, non-modular approach. However, experiments in a driving world indicate that while the modular approach is sub-optimal, it makes only minor errors compared to the non-modular approach. The modular approach thus learns very quickly, using small amounts of computational resources, while sacrificing little solution quality.
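To make the greatest-mass arbitration concrete, the following is a minimal sketch of how independent tabular Q-learning modules could report utilities and how an arbiter could sum them and pick the action with the largest combined utility. The class and parameter names (QModule, greatest_mass, epsilon, alpha, gamma) are illustrative assumptions, not taken from the dissertation's implementation.

```python
# Hypothetical sketch: modular Q-learning with greatest-mass arbitration.
# Each module learns its own sub-goal over its own (smaller) local state space.
import random
from collections import defaultdict

class QModule:
    """One sub-goal learner: a tabular Q-learner over its own local state."""
    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.q = defaultdict(float)   # (local_state, action) -> estimated utility
        self.actions = actions
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount factor (assumed value)

    def utilities(self, local_state):
        """Q-values this module reports for every action in its local state."""
        return {a: self.q[(local_state, a)] for a in self.actions}

    def update(self, s, a, reward, s_next):
        """Standard one-step Q-learning update using the module's own reward."""
        best_next = max(self.q[(s_next, b)] for b in self.actions)
        target = reward + self.gamma * best_next
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])

def greatest_mass(modules, local_states, epsilon=0.1):
    """Sum each action's utility across all modules and choose the action
    with the largest combined utility (epsilon-greedy for exploration)."""
    actions = modules[0].actions
    if random.random() < epsilon:
        return random.choice(actions)
    combined = {a: sum(m.utilities(s)[a] for m, s in zip(modules, local_states))
                for a in actions}
    return max(combined, key=combined.get)
```

In this sketch, each module's table is indexed only by its own local state, so the space it searches is much smaller than the joint state space of all sub-goals, and each module is updated from its own reward signal; only the action selection is shared, via the summed utilities.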