Authors: Raymond J. Mooney, Tanvi S. Motwani
DOI: 10.3233/978-1-61499-098-7-600
Keywords:
Abstract: Recognizing activities in real-world videos is a challenging AI problem. We present a novel combination of standard activity classification, object recognition, and text mining to learn effective activity recognizers without ever explicitly labeling training videos. We cluster the verbs used to describe videos to automatically discover classes of activities and produce a labeled training set. This labeled data is then used to train an activity classifier based on spatio-temporal features. Next, text mining is employed to learn the correlations between these verbs and related objects. This knowledge is then used, together with the outputs of an off-the-shelf object recognizer and the trained activity classifier, to produce an improved activity recognizer. Experiments on a corpus of YouTube videos demonstrate the effectiveness of the overall approach.