Systems and methods for providing a content item database and identifying content items

Authors: Saurabh Sohoney, Rakesh Nigam, Santhosh Baramasagara Chandrasekharappa, Sivakumar Ekambaram

DOI:

Keywords:

Abstract: Systems and methods are provided for identifying unsolicited or unwanted electronic communications, such as spam. The disclosed embodiments also encompass systems and methods for selecting content items from a content item database. Consistent with certain embodiments, computer-implemented systems and methods may use a clustering-based statistical content matching anti-spam algorithm to identify and filter spam. Such an algorithm may be implemented to determine a degree of similarity between an incoming e-mail and a collection of one or more spam e-mails stored in a database. If the degree of similarity exceeds a predetermined threshold, the incoming e-mail is classified as spam. Further, in accordance with other embodiments, a query or search string provided by a user may be used to search a content item database, and a content item that matches the query or search string is identified.
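The similarity-and-threshold step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the statistical matching measure, so this example substitutes Jaccard similarity over word n-grams as a stand-in. The function names (`shingles`, `jaccard`, `max_similarity`, `is_spam`), the n-gram size, and the threshold value of 0.5 are all assumptions made for illustration and are not taken from the patent.

```python
from typing import Iterable, Set, Tuple

Shingle = Tuple[str, ...]


def shingles(text: str, n: int = 3) -> Set[Shingle]:
    """Return the set of lowercased word n-grams in `text`."""
    words = text.lower().split()
    if len(words) < n:
        # Short texts: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: Set[Shingle], b: Set[Shingle]) -> float:
    """Jaccard similarity |A & B| / |A | B|; 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def max_similarity(incoming: str, spam_db: Iterable[str], n: int = 3) -> float:
    """Highest similarity between the incoming message and any stored spam."""
    incoming_set = shingles(incoming, n)
    return max(
        (jaccard(incoming_set, shingles(spam, n)) for spam in spam_db),
        default=0.0,  # an empty spam collection matches nothing
    )


def is_spam(incoming: str, spam_db: Iterable[str],
            threshold: float = 0.5, n: int = 3) -> bool:
    """Classify as spam when the similarity exceeds the (assumed) threshold."""
    return max_similarity(incoming, spam_db, n) > threshold


# Hypothetical stored spam collection, for illustration only.
SPAM_DB = [
    "win a free prize now click here",
    "cheap meds online no prescription needed",
]
```

Under these assumptions, `is_spam("win a free prize now click here today", SPAM_DB)` flags the message because it shares most of its word trigrams with a stored spam e-mail, while an unrelated message such as `"meeting moved to three pm tomorrow"` is not flagged. A production system would likely cluster the stored spam first and compare against cluster representatives rather than every stored message.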

References (39)
Jonathan J. Oliver, Andrew F. Oliver, Approximate matching of strings for message filtering (2007)
Chandler L. Burgess, Robert C. Farrow, Douglas J. Matzke, Identifying Relationships Among Database Records (2007)
Sandy Jensen, Eli Mantel, Matt Gleeson, Art Medlar, David Hoogstrate, Ken Schneider, Method and apparatus for filtering email spam based on similarity measures (2004)
Michael W. Watzke, Bhashyam Ramesh, Clustering strings using N-grams (2003)
Joshua Alspector, Abdur R. Chowdhury, Aleksander Kolcz, Reliability of duplicate document detection algorithms (2011)
Ryan Colvin, Kevin Chan, Chad Mills, Aleksander Kolcz, Robert McCann, Detecting spam from metafeatures of an email message (2008)
Dennis A. Tillotson, Lori K. Lewis, Rosemary D. Paradis, Self-optimizing classifier (2003)