Learning to Take Directions One Step at a Time

Authors: Paolo Favaro, Matthias Zwicker, Qiyang Hu, Tiziano Portenier, Adrian Walchli

DOI: 10.1109/ICPR48806.2021.9412100

Keywords: Object (computer science); MNIST database; Sequence; Pattern recognition (psychology); Trajectory; Computer vision; Minimum bounding box; Consistency (database systems); Artificial intelligence; Animation; Computer science

Abstract: We present a method to generate a video sequence given a single image. Because items in an image can be animated in arbitrarily many different ways, we introduce as control signal a sequence of motion strokes. Such a control signal can also be automatically transferred from other videos, e.g., via bounding box tracking. Each motion stroke provides the direction to the moving object in the input image, and we aim to train a network to generate an animation following a sequence of such directions. To address this task we design a novel recurrent architecture, which can be trained easily and effectively thanks to an explicit separation of past, future and current states. As we demonstrate in the experiments, our proposed architecture is capable of generating an arbitrary number of frames from a single image and a sequence of motion strokes. Key components of our architecture are an autoencoding constraint to ensure consistency with the past and a generative adversarial scheme to ensure that images look realistic and are temporally smooth. We demonstrate the effectiveness of our approach on the MNIST, KTH, Human3.6M, Push and Weizmann datasets.
