Automatic categorization algorithm for evolvable software archive

Authors: S. Kawaguchi, P.K. Garg, M. Matsushita, K. Inoue, Z. Source

DOI: 10.1109/IWPSE.2003.1231227

Keywords: Software engineering, Software system, Software metric, Algorithm, Software evolution, Software sizing, Computer science, Software development, Software construction, Software analytics, Software verification and validation

Abstract: The number of software systems is increasing at a rapid rate. For example, SourceForge currently has about sixty thousand software systems registered, twenty-two thousand of which were added in the past twelve months. It is important for software evolution to search and use existing similar software systems from the software archive. An evolution history of an existing similar software system is useful. We may even evolve a system based on an existing one instead of creating it from scratch. We propose an automatic software categorization algorithm to help find similar software systems in a software archive. At present, we leave open the issue of the nature of the categorization, and explore several known approaches including a code clones-based similarity metric, decision trees, and latent semantic analysis. The results from applying each of the approaches give us some insights into the problem space, and set some directions for further work.

References (11)
Susan T. Dumais, Thomas Landauer, Latent semantic analysis and the measurement of knowledge. (1994)
Robert W. Schwanke, An intelligent tool for re-engineering software modularity. International Conference on Software Engineering, pp. 83-92 (1991), 10.5555/256664.256688
S.C. Choi, W. Scacchi, Extracting and restructuring the design of large systems. IEEE Software, vol. 7, pp. 66-71 (1990), 10.1109/52.43051
Nicolas Anquetil, Timothy C. Lethbridge, Recovering software architecture from the names of source files. Journal of Software Maintenance: Research and Practice, vol. 11, pp. 201-221 (1999), 10.1002/(SICI)1096-908X(199905/06)11:3<201::AID-SMR192>3.0.CO;2-1
Tetsuo Yamamoto, Makoto Matsushita, Toshihiro Kamiya, Katsuro Inoue, Measuring Similarity of Large Software Systems Based on Source Code Correspondence. Product Focused Software Process Improvement, pp. 530-544 (2005), 10.1007/11497455_41
J.I. Maletic, A. Marcus, Using latent semantic analysis to identify similarities in source code to support program understanding. Conference on Tools with Artificial Intelligence, pp. 46-53 (2000), 10.1109/TAI.2000.889845
Jamie Dinkelacker, Dean Nelson, Rob Miller, Pankaj K. Garg, Progressive open source. International Conference on Software Engineering, pp. 177-184 (2002), 10.1145/581339.581363
T. Kamiya, S. Kusumoto, K. Inoue, CCFinder: a multilinguistic token-based code clone detection system for large scale source code. IEEE Transactions on Software Engineering, vol. 28, pp. 654-670 (2002), 10.1109/TSE.2002.1019480
Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, Richard Harshman, Indexing by Latent Semantic Analysis. Journal of the Association for Information Science and Technology, vol. 41, pp. 391-407 (1990), 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
Jonathan I. Maletic, Andrian Marcus, Recovering documentation-to-source-code traceability links using latent semantic indexing. International Conference on Software Engineering, pp. 125-135 (2003), 10.5555/776816.776832