Authors: Gary Kacmarcik, Michael Gamon
Keywords:
Abstract: This paper explores techniques for reducing the effectiveness of standard authorship attribution techniques so that an author A can preserve anonymity for a particular document D. We discuss feature selection and adjustment and show how this information can be fed back to the author to create a new document D' for which the calculated attribution moves away from A. Since it can be labor intensive to adjust the document in this fashion, we attempt to quantify the amount of effort required to produce the anonymized document and introduce two levels of anonymization: shallow and deep. In our test set, we show that shallow anonymization can be achieved by making 14 changes per 1000 words to reduce the likelihood of identifying A as the author by an average of more than 83%. For deep anonymization, we adapt the unmasking work of Koppel and Schler to provide feedback that allows the author to choose a level of anonymization.