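Abstract: Context. Software defect prediction aims to reduce the large costs associated with faults in a software system. A wide range of traditional software metrics have been evaluated as potential defect indicators. These metrics are derived from the source code or from the development process. Studies have shown that no metric clearly outperforms another, and that identifying defect-prone code using traditional metrics has reached a performance ceiling. Less traditional metrics, derived from the natural language of the source code, have been studied less. These newer, finer-grained metrics have shown promise within defect prediction. Aims. The aim of this dissertation is to study the relationship between short Java constructs and the faultiness of source code. To study this relationship, the dissertation introduces the concepts of a Java sequence and a Java code snippet. Sequences are created by using the Java abstract syntax tree. The ordering of the nodes within the tree creates the sequences, while small subsequences of a sequence are the code snippets. The dissertation tries to find a relationship between code snippets and faulty and non-faulty code. It also looks at the evolution of the code snippets as a system matures, to discover whether the snippets significantly associated with faulty code change over time. Methods. To achieve the aims of the dissertation, two main techniques were developed: finding defective code, and extracting Java sequences and code snippets. Finding defective code is split into two areas: finding the defect fix points and finding the defect insertion points. To find the fix points, an implementation of the bug-linking algorithm was developed, called S + e. Two algorithms were developed to extract the sequences and the code snippets. The code snippets were analysed using the binomial test to find which ones are significantly associated with faulty and non-faulty code. These techniques were performed on five different Java datasets: ArgoUML, AspectJ and three releases of Eclipse.JDT.core. Results. There are significant associations between some code snippets and faulty code. Frequently occurring fault-prone code snippets include those associated with identifiers, method calls and variables. Some code snippets significantly associated with faults are always found in faulty code, and 201 code snippets are significantly associated with faults across all five systems. The technique was unable to find any significant associations between code snippets and non-faulty code. The relationship between code snippets and faults seems to change as a system evolves, with more snippets becoming fault-prone as Eclipse.JDT.core evolved over the three releases analysed. Conclusions. This dissertation has introduced the concept of code snippets into software engineering and defect prediction. The use of code snippets offers a promising approach to identifying potentially defective code. Unlike previous approaches, code snippets are based on a comprehensive analysis of low-level code features and potentially allow the full set of code defects to be identified. Initial research into the relationship between code snippets and faults has shown that some code snippets are significantly associated with faults. The dissertation has provided additional empirical evidence for some already researched bad constructs within defect prediction. Some constructs significantly associated with faults are located in all five systems, although indicators that transfer successfully from one system to another are rare.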
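
The snippet idea described in the Methods can be illustrated with a minimal Python sketch. It is not the dissertation's implementation, and it rests on several assumptions: a sequence is taken to be a plain list of AST node-type names, a snippet is taken to be a contiguous subsequence of up to `max_len` nodes, and the binomial test is taken to be a one-sided exact test against the overall share of snippet occurrences found in faulty code. The function names and the node labels in the usage example are hypothetical.

```python
from collections import Counter
from math import comb


def extract_snippets(sequence, max_len=3):
    """Return every contiguous subsequence of length 1..max_len as a tuple."""
    return [
        tuple(sequence[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(sequence) - n + 1)
    ]


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def fault_associated(faulty_seqs, clean_seqs, max_len=3, alpha=0.05):
    """Map each snippet over-represented in faulty code to its p-value.

    Assumption: the null probability is the overall fraction of snippet
    occurrences that fall in faulty code. No multiple-testing correction.
    """
    faulty = Counter(s for seq in faulty_seqs for s in extract_snippets(seq, max_len))
    clean = Counter(s for seq in clean_seqs for s in extract_snippets(seq, max_len))
    total_faulty = sum(faulty.values())
    total = total_faulty + sum(clean.values())
    if total == 0:
        return {}
    p0 = total_faulty / total
    significant = {}
    for snippet, k in faulty.items():
        n = k + clean[snippet]  # occurrences of this snippet in all code
        p_value = binomial_upper_tail(k, n, p0)
        if p_value < alpha:
            significant[snippet] = p_value
    return significant


if __name__ == "__main__":
    # Hypothetical node-type sequences for faulty and non-faulty code.
    faulty = [["MethodInvocation", "SimpleName"]] * 10
    clean = [["ReturnStatement", "SimpleName"]] * 10
    for snippet, p_value in fault_associated(faulty, clean).items():
        print(snippet, p_value)
```

Because many snippets are tested at once, a real analysis would normally adjust `alpha` for multiple comparisons; the sketch omits that step for brevity.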