Learning to log: helping developers make informed logging decisions

Authors: Pinjia He, Dongmei Zhang, Jieming Zhu, Michael R. Lyu, Qiang Fu

DOI: 10.5555/2818754.2818807

Abstract: Logging is a common programming practice of practical importance to collect system runtime information for postmortem analysis. Strategic logging placement is desired to cover necessary runtime information without incurring unintended consequences (e.g., performance overhead, trivial logs). However, in current practice, there is a lack of rigorous specifications for developers to govern their logging behaviours. Logging has become an important yet tough decision which mostly depends on the domain knowledge of developers. To reduce the effort of making logging decisions, in this paper, we propose a "learning to log" framework, which aims to provide informative guidance on logging during development. As a proof of concept, we provide the design and implementation of a logging suggestion tool, LogAdvisor, which automatically learns the common logging practices on where to log from existing logging instances and further leverages them for actionable suggestions to developers. Specifically, we identify the important factors for determining where to log and extract them as structural features, textual features, and syntactic features. Then, by applying machine learning techniques (e.g., feature selection and classifier learning) and noise handling techniques, we achieve high accuracy of logging suggestions. We evaluate LogAdvisor on two industrial software systems from Microsoft and two open-source software systems from GitHub (totally 19.1M LOC and 100.6K logging statements). The encouraging experimental results, as well as a user study, demonstrate the feasibility and effectiveness of our logging suggestion tool. We believe our work can serve as an important first step towards the goal of "learning to log".
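The pipeline the abstract outlines (extract structural, textual, and syntactic features from code, select the informative ones, then learn a classifier that suggests where to log) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from LogAdvisor: the feature names, the toy catch-block snippets and their labels, the use of information gain for feature selection, and the Bernoulli naive Bayes classifier are stand-ins, since this record does not say which techniques the tool actually uses. Noise handling is omitted.

```python
import math
import re

def extract_features(snippet: str) -> set:
    """Map a code snippet to binary feature names (all names are hypothetical)."""
    feats = set()
    # Structural: the exception type handled by a catch block, if any.
    m = re.search(r"catch\s*\(\s*(\w+)", snippet)
    if m:
        feats.add("struct:exc=" + m.group(1))
    # Textual: identifiers as lower-cased bag-of-words tokens.
    for tok in re.findall(r"[A-Za-z_]\w*", snippet):
        feats.add("text:" + tok.lower())
    # Syntactic: simple flags about what the block contains.
    if re.search(r"\bthrow\b", snippet):
        feats.add("syn:has_throw")
    if re.search(r"\breturn\b", snippet):
        feats.add("syn:has_return")
    return feats

def entropy(pos: int, total: int) -> float:
    """Binary entropy of a group with `pos` positive labels out of `total`."""
    if total == 0 or pos in (0, total):
        return 0.0
    p = pos / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_gain(rows, labels, feat) -> float:
    """Reduction in label entropy from splitting on presence of `feat`."""
    n = len(rows)
    present = [y for x, y in zip(rows, labels) if feat in x]
    absent = [y for x, y in zip(rows, labels) if feat not in x]
    cond = sum(len(g) / n * entropy(sum(g), len(g)) for g in (present, absent))
    return entropy(sum(labels), n) - cond

def select_features(rows, labels, k):
    """Keep the k features with the highest information gain."""
    vocab = sorted(set().union(*rows))
    return sorted(vocab, key=lambda f: -information_gain(rows, labels, f))[:k]

def train(rows, labels, feats):
    """Fit a Bernoulli naive Bayes model with add-one smoothing."""
    model = {}
    for y in (0, 1):
        docs = [x for x, lab in zip(rows, labels) if lab == y]
        prior = (len(docs) + 1) / (len(rows) + 2)
        probs = {f: (sum(f in x for x in docs) + 1) / (len(docs) + 2) for f in feats}
        model[y] = (prior, probs)
    return model

def predict(model, x) -> int:
    """Return 1 to suggest logging the snippet, 0 otherwise."""
    def score(y):
        prior, probs = model[y]
        return math.log(prior) + sum(
            math.log(p if f in x else 1 - p) for f, p in probs.items())
    return max((0, 1), key=score)

# Toy training data (invented): label 1 means the block was logged.
TRAIN = [
    ("try { open(f); } catch (IOException e) { throw e; }", 1),
    ("try { send(m); } catch (IOException e) { retry(); }", 1),
    ("try { connect(); } catch (TimeoutException e) { throw e; }", 1),
    ("try { parse(s); } catch (FormatException e) { return null; }", 0),
    ("try { parse(t); } catch (FormatException e) { return 0; }", 0),
    ("try { find(k); } catch (KeyNotFoundException e) { return null; }", 0),
]
ROWS = [extract_features(code) for code, _ in TRAIN]
LABELS = [label for _, label in TRAIN]
SELECTED = select_features(ROWS, LABELS, k=4)
MODEL = train(ROWS, LABELS, SELECTED)

def suggest(snippet: str) -> int:
    """Classify an unseen snippet with the toy model."""
    return predict(MODEL, extract_features(snippet))
```

Calling `suggest` on an unseen snippet returns 1 when the toy model would suggest a logging statement and 0 otherwise. With six invented examples the result only shows the shape of the pipeline and says nothing about the accuracy reported in the paper.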
