Authors: Deva Ramanan, Vaishaal Shankar, Ludwig Schmidt, Rebecca Roelofs, Achal Dave
DOI:
Keywords:
Abstract: Vision models notoriously flicker when applied to videos: they correctly recognize objects in some frames, but fail on perceptually similar, nearby frames. In this work, we systematically …