Movie genre classification: A multi-label approach based on convolutions through time

Authors: Jônatas Wehrmann, Rodrigo C. Barros

DOI: 10.1016/J.ASOC.2017.08.029

Abstract: The task of labeling movies according to their corresponding genre is a challenging classification problem, having in mind that genre is an immaterial feature that cannot be directly pinpointed in any of the movie frames. Hence, off-the-shelf image classification approaches are not capable of handling this task in a straightforward fashion. Moreover, movies may belong to multiple genres at the same time, making movie genre assignment a typical multi-label classification problem, which is per se much more challenging than standard single-label classification. In this paper, we propose a novel deep neural architecture based on convolutional neural networks (ConvNets) for performing multi-label movie-trailer genre classification. It encapsulates an ultra-deep ConvNet with residual connections, and it makes use of a special convolutional layer to extract temporal information from image-based features prior to performing the mapping of movie trailers to genres. We compare the proposed approach with the current state-of-the-art methods for movie classification that employ well-known image descriptors and other low-level handcrafted features. Results show that our method substantially outperforms the state-of-the-art for this task, improving classification performance for all movie genres.
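The abstract names three ingredients: per-frame image features, a convolution applied along the time axis, and a multi-label output. The following is a minimal sketch of how those pieces could fit together, not the authors' architecture. The per-frame features are random stand-ins for what a residual ConvNet would produce, and the layer sizes, the max-over-time pooling, the sigmoid outputs, the binary cross-entropy loss, and all function names are illustrative assumptions that the abstract does not specify.

```python
import math
import random


def temporal_conv(frames, kernels, biases):
    """Valid 1D convolution over the time axis, followed by ReLU.

    frames:  T per-frame feature vectors, each of length D
    kernels: F filters, each a list of K weight vectors of length D
    returns: (T - K + 1) time steps, each with F activations
    """
    width = len(kernels[0])
    out = []
    for t in range(len(frames) - width + 1):
        step = []
        for kernel, bias in zip(kernels, biases):
            acc = bias
            for k in range(width):
                acc += sum(w * x for w, x in zip(kernel[k], frames[t + k]))
            step.append(max(0.0, acc))
        out.append(step)
    return out


def max_over_time(activations):
    """Collapse the time axis by keeping each filter's strongest response."""
    return [max(column) for column in zip(*activations)]


def genre_probabilities(pooled, weights, biases):
    """One independent sigmoid per genre, so several genres can be active."""
    probs = []
    for row, bias in zip(weights, biases):
        z = bias + sum(w * x for w, x in zip(row, pooled))
        probs.append(1.0 / (1.0 + math.exp(-z)))
    return probs


def binary_cross_entropy(probs, targets, eps=1e-12):
    """Mean per-genre binary cross-entropy (assumed loss, common for multi-label)."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


# Toy sizes, chosen only for illustration.
rng = random.Random(0)
T, D, K, F, G = 12, 8, 3, 4, 5  # frames, feature dim, kernel width, filters, genres
frames = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(T)]
kernels = [[[rng.gauss(0, 0.1) for _ in range(D)] for _ in range(K)] for _ in range(F)]
out_weights = [[rng.gauss(0, 0.1) for _ in range(F)] for _ in range(G)]

activations = temporal_conv(frames, kernels, [0.0] * F)
pooled = max_over_time(activations)
probs = genre_probabilities(pooled, out_weights, [0.0] * G)
loss = binary_cross_entropy(probs, [1, 0, 1, 0, 0])
```

Because each genre gets its own sigmoid rather than sharing a softmax, the predicted probabilities need not sum to one, which is what lets a single trailer be assigned to several genres at once.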
