A Faster Method for Tracking and Scoring Videos Corresponding to Sentences.

Authors: Jeffrey Mark Siskind, Daniel Paul Barrett, Haonan Yu

DOI:

Keywords: Object (grammar), Tracking (particle physics), Function (mathematics), Sentence, Space (punctuation), Speech recognition, Scale (descriptive set theory), Word (computer architecture), Computer science, Sentence length

Abstract: Prior work presented the sentence tracker, a method for scoring how well a sentence describes a video clip or, alternatively, how well a video clip depicts a sentence. We present an improved method for optimizing the same cost function employed by this prior work, reducing the space complexity from exponential in the sentence length to polynomial, as well as producing a qualitatively identical result in time polynomial in the sentence length instead of exponential. Since this new method is plug-compatible with the prior method, it can be used for the same applications: video retrieval with sentential queries, generating sentential descriptions of video clips, and focusing the attention of a tracker with a sentence, while allowing these applications to scale with significantly larger numbers of object detections, word meanings modeled with HMMs with significantly larger numbers of states, and significantly longer sentences, with no appreciable degradation in quality of results.
