DeFlaker: automatically detecting flaky tests

Authors: Jonathan Bell, Owolabi Legunsen, Michael Hilton, Lamyaa Eloussi, Tifany Yung

DOI: 10.1145/3180155.3180164

Keywords:

Abstract: Developers often run tests to check that their latest changes to a code repository did not break any previously working functionality. Ideally, new test failures would indicate regressions caused by the changes. However, some failures may be due not to the changes but to non-determinism in the tests, popularly called flaky tests. The typical way to detect flaky tests is to rerun failing tests repeatedly. Unfortunately, rerunning failing tests can be costly and can slow down the development cycle. We present the first extensive evaluation of rerunning failing tests and propose a new technique, DeFlaker, that detects whether a failure is due to a flaky test without rerunning it and with very low runtime overhead. DeFlaker monitors the coverage of the latest code changes and marks as flaky any newly failing test that did not execute any of the changed code. We deployed DeFlaker live, in the build process of 96 Java projects on Travis CI, and found 87 previously unknown flaky tests in 10 of these projects. We also ran experiments on project histories, where DeFlaker detected 1,874 flaky tests from 4,846 failures, with a low false alarm rate (1.5%). DeFlaker had a higher recall (95.5% vs. 23%) of confirmed flaky tests than Maven's default flaky-test detector.
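The core check described in the abstract (mark a newly failing test as flaky if it did not execute any changed code) can be illustrated with a minimal Java sketch. The class and method names here (FailingTest, isLikelyFlaky) and the string encoding of code elements are hypothetical, chosen for illustration; this is not DeFlaker's actual implementation, which instruments coverage during the build.

    import java.util.HashSet;
    import java.util.Set;

    // Minimal sketch of the differential-coverage flakiness check:
    // a newly failing test is marked flaky if it covered none of the
    // code elements changed since the last passing run.
    public class FlakyCheckSketch {

        // A newly failing test plus the code elements it covered,
        // encoded here as "ClassName:lineNumber" strings.
        static final class FailingTest {
            final String name;
            final Set<String> coveredElements;
            FailingTest(String name, Set<String> coveredElements) {
                this.name = name;
                this.coveredElements = coveredElements;
            }
        }

        // Flaky if the test executed none of the changed elements.
        static boolean isLikelyFlaky(FailingTest test, Set<String> changedElements) {
            for (String element : test.coveredElements) {
                if (changedElements.contains(element)) {
                    return false; // failure may be a real regression
                }
            }
            return true; // failure cannot be explained by the latest changes
        }

        public static void main(String[] args) {
            Set<String> changed = new HashSet<>();
            changed.add("Parser:42");           // statement modified in the latest commit

            Set<String> covered = new HashSet<>();
            covered.add("HttpClient:17");       // failing test never reached the change

            FailingTest failing = new FailingTest("testFetchTimesOut", covered);
            System.out.println(failing.name + " flaky? " + isLikelyFlaky(failing, changed));
        }
    }

A test flagged this way would normally still be reported to developers as flaky rather than silently ignored, since the check is an inference from coverage, not proof of non-determinism.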
