Authors: Serena Yeung, Vignesh Ramanathan, Olga Russakovsky, Liyue Shen, Greg Mori
Keywords:
Abstract: Understanding the simultaneously very diverse and intricately fine-grained set of possible human actions is a critical open problem in computer vision. Manually labeling training videos is feasible for some action classes but does not scale to the full long-tailed distribution of actions. A promising way to address this is to leverage noisy data from web queries to learn new actions, using semi-supervised or webly-supervised approaches. However, these methods typically do not learn domain-specific knowledge, or rely on iterative hand-tuned data labeling policies. In this work, we instead propose a reinforcement learning-based formulation for selecting the right examples for training a classifier from noisy web search results. Our method uses Q-learning to learn a data labeling policy on a small labeled dataset, and then uses this policy to automatically label noisy web data for new visual concepts. Experiments on the challenging Sports-1M action recognition benchmark, as well as on additional newly emerging concepts, demonstrate that our method is able to learn good labeling policies and use them to train accurate concept classifiers.
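To make the core idea concrete, below is a minimal, illustrative sketch of a Q-learned selection policy on a toy task: an agent sees noisy web candidates with a relevance score, chooses to select or skip each one, and is rewarded for selecting truly useful examples. Everything here (the score buckets, the reward scheme, the one-step/bandit-style update with no bootstrapping) is an assumption for illustration, not the paper's actual state representation or training procedure.

```python
import random

random.seed(0)

# Toy setup: each candidate web example has a noisy "relevance score" in [0, 1].
# Higher scores are more likely to correspond to truly useful (clean) examples.
# State = discretized score bucket; actions: 0 = skip, 1 = select.
# These design choices are illustrative assumptions, not the paper's method.
N_BUCKETS = 5
ACTIONS = (0, 1)
alpha, eps = 0.1, 0.1  # learning rate and exploration rate

Q = [[0.0, 0.0] for _ in range(N_BUCKETS)]  # tabular Q-values

def draw_candidate():
    score = random.random()
    useful = random.random() < score  # usefulness correlates with score
    return score, useful

def bucket(score):
    return min(int(score * N_BUCKETS), N_BUCKETS - 1)

for _ in range(20000):
    score, useful = draw_candidate()
    s = bucket(score)
    # Epsilon-greedy action selection.
    if random.random() < eps:
        a = random.choice(ACTIONS)
    else:
        a = int(Q[s][1] > Q[s][0])
    # Reward: +1 for selecting a useful example, -1 for a noisy one, 0 for skip.
    r = (1.0 if useful else -1.0) if a == 1 else 0.0
    # One-step Q-update (episode ends after each decision, so no bootstrap term).
    Q[s][a] += alpha * (r - Q[s][a])

# Greedy policy per bucket: skip low-score candidates, select high-score ones.
policy = [int(Q[s][1] > Q[s][0]) for s in range(N_BUCKETS)]
print(policy)
```

After training, the learned policy skips low-score buckets and selects high-score ones, mirroring the paper's goal of learning which noisy search results to keep for classifier training.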