Authors: Xuxian Jiang, Dongyan Xu, Zhiqiang Lin, Xiangyu Zhang
DOI:
Keywords: Computer science, Byte, Reverse engineering, Dynamic Host Configuration Protocol, Call stack, Offset (computer science), Open Shortest Path First, Operating system, Binary number, Malware
Abstract: Protocol reverse engineering has often been a manual process that is considered time-consuming, tedious and error-prone. To address this limitation, a number of solutions have recently been proposed to allow for automatic protocol reverse engineering. Unfortunately, they are either limited in extracting protocol fields, due to the lack of program semantics in network traces, or primitive in only revealing the flat structure of the protocol format. In this paper, we present a system called AutoFormat that aims at not only extracting protocol fields with high accuracy, but also revealing the inherently “non-flat”, hierarchical structures of protocol messages. AutoFormat is based on the key insight that different fields in the same message are typically handled in different execution contexts (e.g., the runtime call stack). As such, by monitoring the program execution, we can collect the execution context information for every message byte (annotated with its offset in the entire message) and cluster them to derive the protocol format. We have evaluated our system with more than 30 protocol messages from seven protocols, including two text-based protocols (HTTP and SIP), three binary-based protocols (DHCP, RIP, and OSPF), one hybrid protocol (CIFS/SMB), as well as one unknown protocol used by real-world malware. Our results show that AutoFormat can not only identify individual message fields automatically and with high accuracy (an average 93.4% match ratio compared with Wireshark), but also unveil the structure of the protocol format by revealing possible relations (e.g., sequential, parallel, and hierarchical) among the message fields.
∗Part of this research was supported by the National Science Foundation under grants CNS-0716376 and CNS-0716444. The bulk of this work was performed when the first author was visiting George Mason University in Summer 2007.
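The key insight in the abstract (bytes of the same field tend to be handled under the same execution context) can be illustrated with a toy sketch. This is not AutoFormat's implementation: the function `cluster_fields`, the trace layout, and the parser function names in the sample trace are all hypothetical, and the sketch only merges adjacent offsets that share an identical call stack rather than recovering parallel or hierarchical relations.

```python
def cluster_fields(trace):
    """Merge adjacent message offsets that were handled under the same call stack.

    trace: iterable of (offset, call_stack) pairs, where call_stack is a tuple
           of function names (outermost first). Hypothetical layout.
    Returns a list of (start_offset, end_offset, call_stack), end inclusive.
    """
    fields = []
    for offset, stack in sorted(trace):
        last = fields[-1] if fields else None
        # Extend the current field only if the offset is contiguous
        # and the execution context is unchanged.
        if last and last[2] == stack and last[1] + 1 == offset:
            fields[-1] = (last[0], offset, stack)
        else:
            fields.append((offset, offset, stack))
    return fields


# Hypothetical trace for the 8-byte message b"GET /a\r\n".
METHOD = ("main", "parse_request", "parse_method")
URI = ("main", "parse_request", "parse_uri")
DELIM = ("main", "parse_request")

trace = [
    (0, METHOD), (1, METHOD), (2, METHOD),  # "GET"
    (3, DELIM),                             # " "
    (4, URI), (5, URI),                     # "/a"
    (6, DELIM), (7, DELIM),                 # "\r\n"
]

for start, end, stack in cluster_fields(trace):
    print(f"bytes {start}-{end}: handled in {stack[-1]}")
```

In this sketch the shared prefix `("main", "parse_request")` hints at how a hierarchy could be inferred: fields whose call stacks extend a common prefix might be treated as children of the field handled at that prefix, though that step is left out here.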