Public review for "The devil and packet trace anonymization"

Author: Matt Roughan

DOI: 10.1145/1111322.1111329

Keywords: Data access; Component (UML); TRACE (psycholinguistics); Flexibility (engineering); Set (psychology); The Internet; Computer science; Computer security

Abstract: Reproducible research hinges on the ability to use common datasets, and so public datasets containing measurements of real, operational networks are important for the Internet measurement community, and subsequently for all those fields of networking that use their research. There are several such datasets available now, but given the ongoing evolution of the Internet, more are always needed. However, data is often considered proprietary, or its release raises privacy or security concerns, and so access is limited to the organization that collected it.

Anonymization has sometimes been used to remove the offending components of a dataset so that the remainder can be made public.

As with all such work, the Devil is in the detail. Different organizations have their own requirements for anonymization, while on the other hand, researchers have different interests in a dataset. Particular aspects of an anonymization may remove the component of interest for a study. Anonymization policies must balance the value of the data to researchers with the privacy and security concerns of the organization. It is a challenging problem; as one reviewer put it, "the difference between a researcher and an attacker cannot be expressed in pure analysis terms. The motivation and the funding source makes the difference."

In studying the issues for their organization, the authors of this paper sought not to add a new tool that would perform anonymization according to their own policies; rather, they aimed to create a (freely available) tool, 'tcpmkpub,' which would allow organizations to implement complicated, multifaceted anonymization policies considerably beyond the capabilities of existing tools. The paper also provides considerable discussion of the issues surrounding anonymization, and finally a large (11 GB) set of packet traces anonymized using the tool.

The reviewers were largely positive about the paper, for instance saying "it is the level of thoroughness at which the data is anonymized," and "The authors present a tool that is able to fulfil the task [of anonymization] in an elegant way and whose flexibility promises to make it easy to adapt to other environments." Two reviewers also noted that the fundamental problem of making a trace totally attacker-proof is not solved here and that, though highly challenging, it is perhaps solvable, so there is plenty of remaining work to be performed in this area.
