Authors: Yeonhee Lee, Wonchul Kang, Youngseok Lee
DOI: 10.1007/978-3-642-20305-3_5
Keywords:
Abstract: Internet traffic measurement and analysis has become a significantly challenging job because large packet trace files captured on fast links cannot be easily handled on a single server with limited computing and memory resources. Hadoop is a popular open-source cloud platform that provides a software programming framework called MapReduce and the distributed filesystem, HDFS, which are useful for analyzing a large data set. Therefore, in this paper, we present a Hadoop-based packet processing tool that provides scalability for a large data set by harnessing MapReduce and HDFS. To tackle large packet trace files in Hadoop efficiently, we devised a new binary input format, PcapInputFormat, hiding the complexity of processing binary-formatted packet data and parsing each packet record. We also designed efficient traffic analysis MapReduce job models consisting of map and reduce functions. To evaluate our tool, we compared its computation time with that of a well-known packet-processing tool, CoralReef, and showed that our approach is more affordable for processing a large set of packet data.
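
The two ideas named in the abstract, parsing each binary packet record and analyzing the records with map and reduce functions, can be illustrated with a minimal Python sketch. This is not the authors' PcapInputFormat or their job models. It assumes the classic little-endian libpcap file layout (a 24-byte file header followed by 16-byte record headers), runs in a single process without Hadoop, and does not address how a trace split across HDFS blocks would be read. All function names (`read_records`, `map_packet`, `reduce_counts`, `run_job`, `make_pcap`) are hypothetical.

```python
import struct
from collections import defaultdict

PCAP_MAGIC = 0xA1B2C3D4        # classic pcap, microsecond timestamps
FILE_HEADER_LEN = 24           # magic, version, thiszone, sigfigs, snaplen, linktype
RECORD_HEADER_LEN = 16         # ts_sec, ts_usec, incl_len, orig_len


def read_records(buf):
    """Yield (ts_sec, orig_len, payload) for each record in a little-endian pcap buffer."""
    (magic,) = struct.unpack_from("<I", buf, 0)
    if magic != PCAP_MAGIC:
        raise ValueError("this sketch only handles little-endian classic pcap")
    offset = FILE_HEADER_LEN
    while offset + RECORD_HEADER_LEN <= len(buf):
        ts_sec, _ts_usec, incl_len, orig_len = struct.unpack_from("<IIII", buf, offset)
        offset += RECORD_HEADER_LEN
        payload = buf[offset:offset + incl_len]
        if len(payload) < incl_len:
            break                      # truncated final record: stop quietly
        offset += incl_len
        yield ts_sec, orig_len, payload


def map_packet(ts_sec, orig_len, bin_seconds=60):
    """Map step: emit (time bin, (packet count, byte count)) for one packet."""
    yield ts_sec - ts_sec % bin_seconds, (1, orig_len)


def reduce_counts(values):
    """Reduce step: sum packet and byte counts for one time bin."""
    return sum(v[0] for v in values), sum(v[1] for v in values)


def run_job(buf, bin_seconds=60):
    """Run map, group by key, then reduce, all in one process (no Hadoop)."""
    grouped = defaultdict(list)
    for ts_sec, orig_len, _payload in read_records(buf):
        for key, value in map_packet(ts_sec, orig_len, bin_seconds):
            grouped[key].append(value)
    return {key: reduce_counts(values) for key, values in grouped.items()}


def make_pcap(packets):
    """Build a tiny in-memory pcap from (ts_sec, payload) pairs, for demonstration only."""
    out = struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, 65535, 1)
    for ts_sec, payload in packets:
        out += struct.pack("<IIII", ts_sec, 0, len(payload), len(payload)) + payload
    return out
```

For example, `run_job(make_pcap([(0, b"abc"), (30, b"de"), (61, b"f")]))` groups the first two packets into the bin starting at second 0 and the third into the bin starting at second 60, giving per-minute packet and byte totals.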