Authors: Jonathan Crussell, Philip Kegelmeyer
DOI: 10.1137/1.9781611974010.27
Keywords:
Abstract: Many security applications depend critically on clustering. However, we do not know of any clustering algorithms that were designed with an adversary in mind. An intelligent adversary may be able to use this to her advantage to subvert the security of the application. Already, adversaries use obfuscation and other techniques to alter the representation of their inputs in feature space to avoid detection. As one example, spam email often mimics normal email. In this work, we investigate a more active attack, in which an adversary attempts to subvert clustering analysis by feeding in carefully crafted data points. Specifically, in this work we explore how an attacker can subvert DBSCAN, a popular density-based clustering algorithm. We explore a “confidence attack,” where an adversary seeks to poison the clusters to the point that the defender loses confidence in the utility of the system. This may result in the system being abandoned, or worse, waste the defender’s time investigating false alarms. While our attacks generalize to all DBSCAN-based tools, we focus our evaluation on AnDarwin, a tool designed to detect plagiarized Android apps. We show that an adversary can merge arbitrary clusters by connecting them with “bridges”, that even a small number of merges can greatly degrade clustering performance, and that the defender has limited recourse when relying solely on DBSCAN. Finally, we propose a remediation process that uses machine learning and features based on outlier measures that are orthogonal to the underlying clustering problem to remove injected points.
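
The "bridge" idea in the abstract can be illustrated with a minimal sketch. It is not the authors' attack code and has no connection to AnDarwin's feature space: it assumes toy 2-D points, a small textbook-style DBSCAN written from scratch, and hypothetical parameter values (`eps = 0.45`, `min_pts = 3`). The injected points are spaced closer than `eps`, so each one becomes a core point and the chain makes the two originally separate clusters density-connected.

```python
from math import dist

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Textbook-style DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is core, so keep expanding
                queue.extend(j_neighbors)
    return labels

def count_clusters(labels):
    """Number of distinct clusters, ignoring noise."""
    return len(set(labels) - {-1})

# Hypothetical parameters and toy data, chosen only for illustration.
EPS, MIN_PTS = 0.45, 3
cluster_a = [(0.1 * k, 0.0) for k in range(5)]        # x = 0.0 .. 0.4
cluster_b = [(5.0 + 0.1 * k, 0.0) for k in range(5)]  # x = 5.0 .. 5.4
legit = cluster_a + cluster_b

# Attacker-injected bridge: 11 points, 0.4 apart, from x = 0.8 to x = 4.8.
bridge = [(0.4 + 0.4 * k, 0.0) for k in range(1, 12)]

before = dbscan(legit, EPS, MIN_PTS)
after = dbscan(legit + bridge, EPS, MIN_PTS)

print(count_clusters(before))  # 2: the two clusters are separate
print(count_clusters(after))   # 1: the bridge merges them
```

In this toy setting, 11 injected points are enough to collapse two clusters into one. The spacing has to stay below `eps`, and each bridge point needs at least `min_pts` neighbors (itself included), so the cost of a bridge grows with the distance between the target clusters.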