Authors: Ranjana Addanki, Sourav Maji, Malathi Veeraraghavan, Chris Tracy
DOI: 10.1109/EUCNC.2015.7194115
Keywords:
Abstract: Parallel TCP connections are used for large scientific dataset transfers to increase throughput. Therefore, to accurately characterize big-data movement, it is important to reconstruct parallel TCP flowsets from traffic measurements. In this work, we start with NetFlow records collected in an operational research-and-education network across which large scientific datasets are moved routinely, reconstruct individual elephant flows from the records, and assemble parallel TCP flowsets from the elephant flows. Our findings are as follows. The top 1% of flowset sizes were in the hundreds of GBs to low TBs range, 95% had rates less than 2.5 Gbps, and 99% had durations shorter than 4 hours. Median rate increases and variance decreases with increasing number of per-flowset component flows. Such findings are useful for planning, engineering, and improving user performance, since large dataset transfers are among the most demanding applications.
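The assembly step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not state how elephant flows are thresholded or how flows are matched into a flowset. The sketch assumes a flowset is a group of elephant flows that share a source and destination host and overlap in time. The names `Flow`, `Flowset`, `assemble_flowsets`, `min_bytes`, and `max_gap` are hypothetical, as is the 1 GB elephant threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flow:
    """One reconstructed TCP flow (hypothetical record layout)."""
    src: str
    dst: str
    start: float   # seconds
    end: float     # seconds
    nbytes: int


@dataclass
class Flowset:
    """A group of parallel flows between the same host pair."""
    src: str
    dst: str
    start: float
    end: float
    nbytes: int
    n_flows: int

    @property
    def duration(self) -> float:
        return self.end - self.start

    @property
    def rate_bps(self) -> float:
        # Average rate over the flowset lifetime, in bits per second.
        return 8 * self.nbytes / self.duration if self.duration > 0 else 0.0


def assemble_flowsets(flows: List[Flow],
                      min_bytes: int = 10**9,      # assumed elephant threshold
                      max_gap: float = 0.0) -> List[Flowset]:
    """Keep elephant flows, then merge time-overlapping flows per host pair."""
    elephants = [f for f in flows if f.nbytes >= min_bytes]
    # Sorting by host pair and start time lets one pass merge overlaps.
    elephants.sort(key=lambda f: (f.src, f.dst, f.start))

    flowsets: List[Flowset] = []
    for f in elephants:
        last = flowsets[-1] if flowsets else None
        if (last is not None and last.src == f.src and last.dst == f.dst
                and f.start <= last.end + max_gap):
            # Same host pair and overlapping in time: extend the flowset.
            last.end = max(last.end, f.end)
            last.nbytes += f.nbytes
            last.n_flows += 1
        else:
            flowsets.append(Flowset(f.src, f.dst, f.start, f.end, f.nbytes, 1))
    return flowsets
```

With per-flowset size, duration, rate, and component-flow count available, percentile statistics of the kind reported in the abstract could then be computed over the resulting list.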