Authors: Shin-ichi Maeda, Yasuhiro Fujita
DOI:
Keywords: Bounded function, Variance (accounting), Mathematical optimization, Control (management), Estimator, Action (philosophy), Computer science
Abstract: Many continuous control tasks have bounded action spaces. When policy gradient methods are applied to such tasks, out-of-bound actions need to be clipped before execution, while …