Building a Wide-Area File Transfer Performance Predictor: An Empirical Study

Authors: Zhengchun Liu, Rajkumar Kettimuthu, Prasanna Balaprakash, Nageswara S. V. Rao, Ian Foster

DOI: 10.1007/978-3-030-19945-6_5

Abstract: Wide-area data transfer is central to geographically distributed scientific workflows. Faster delivery of data is important for these workflows. Predictability is equally (or even more) important. With the goal of providing a reasonably accurate estimate of data transfer time to improve resource allocation and scheduling for workflows and enable end-to-end data transfer optimization, we apply machine learning methods to develop predictive models for data transfer times over a variety of wide area networks. To build and evaluate these models, we use 201,388 transfers, involving 759 million files totaling 9 PB transferred, over 115 heavily used source-destination pairs ("edges") between 135 unique endpoints. We evaluate the models for different retraining frequencies and different window sizes of history data. In the best case, the resulting models have a median prediction error of \(\le\)21% for 50% of the edges, and \(\le\)32% for 75% of the edges. We present a detailed analysis of these results that provides insights into the cause of some of the high errors. We envision that the performance predictor will be informative for scheduling geo-distributed workflows. The insights also suggest obvious directions for both further analysis and transfer service optimization.
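The abstract reports accuracy as a median prediction error per edge, then summarizes how that error is distributed across edges (the 50% and 75% figures). The following is a minimal sketch of one way such a summary could be computed. It assumes the error is an absolute percentage error on transfer time; the function names, the data layout, and the toy numbers are hypothetical and are not taken from the paper.

```python
from statistics import median, quantiles


def median_ape(actual, predicted):
    """Median absolute percentage error (in percent) for one edge.

    `actual` and `predicted` are equal-length sequences of transfer
    times; the error definition here is an assumption, not the paper's.
    """
    return median(100.0 * abs(p - a) / a for a, p in zip(actual, predicted))


def summarize_edges(per_edge):
    """Summarize per-edge median errors across all edges.

    `per_edge` maps an edge (source, destination) to a pair of lists
    (actual, predicted). Returns the error bound met by 50% and by 75%
    of the edges, i.e. the 50th and 75th percentiles across edges.
    """
    errors = sorted(median_ape(a, p) for a, p in per_edge.values())
    _, p50, p75 = quantiles(errors, n=4, method="inclusive")
    return {"p50": p50, "p75": p75}


if __name__ == "__main__":
    # Hypothetical toy data: three edges, three transfers each (seconds).
    toy = {
        ("A", "B"): ([100.0, 200.0, 400.0], [110.0, 180.0, 400.0]),
        ("A", "C"): ([50.0, 100.0, 200.0], [60.0, 130.0, 240.0]),
        ("B", "C"): ([10.0, 20.0, 40.0], [14.0, 28.0, 60.0]),
    }
    print(summarize_edges(toy))
```

Read this way, a statement such as "median prediction error of \(\le\)21% for 50% of the edges" corresponds to the `p50` value being at most 21.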


