Authors: Shiva Prasad, Huahua Wang, Arindam Banerjee, Prem Melville
DOI:
Keywords:
Abstract: Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a robust and scalable manner. In this paper, we introduce an approach for novel document detection based on online dictionary learning. Unlike traditional dictionary learning, which uses squared loss, the proposed formulation uses l1-loss for the reconstruction error. The online l1-dictionary learning problem is efficiently solved using the alternating directions method, and we establish an O (