Selecting features in origin analysis

Abstract

When applying a machine-learning approach to develop classifiers in a new domain, an important question is what measurements to take and how they will be used to construct informative features. This paper develops a novel set of machine-learning classifiers for files taken from software projects; the target classifications are based on origin analysis. Our approach adapts the output of four copy-analysis tools, generating a number of different measurements. By combining the measures and the files on which they operate, a large set of features is generated in a semi-automatic manner. Standard attribute selection and classifier training techniques then yield a pool of high-quality classifiers (accuracy of around 90%) and information on the most relevant features. © 2011 Springer-Verlag London Limited.
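The feature-generation and attribute-selection steps described in the abstract might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the measures, the file pairings, or the selection method, so the names in `MEASURES` and `FILE_ROLES` are hypothetical, and information gain at a fixed threshold stands in for whatever attribute selection the authors actually used.

```python
from collections import Counter
from itertools import product
from math import log2

# Hypothetical names: the abstract does not list the measures or file pairings.
MEASURES = ["similarity", "shared_lines", "longest_match"]
FILE_ROLES = ["old_vs_new", "new_vs_others"]


def feature_names(measures, roles):
    """Combine every measure with every file pairing to get candidate features."""
    return [f"{m}:{r}" for m, r in product(measures, roles)]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting a numeric feature at a threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # the split separates nothing
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - remainder


def rank_features(rows, labels, names, threshold=0.5):
    """Rank features by information gain, most informative first.

    Assumes each row holds one value per feature name, scaled to [0, 1].
    """
    gains = {
        name: information_gain([row[i] for row in rows], labels, threshold)
        for i, name in enumerate(names)
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)
```

A top-ranked subset from `rank_features` could then be passed to any standard classifier; the single fixed threshold keeps the sketch short and is not a claim about how the paper discretizes its measurements.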

Cite

Green, P., Lane, P. C. R., Rainer, A., & Scholz, S. B. (2011). Selecting features in origin analysis. In Research and Development in Intelligent Systems XXVII: Incorporating Applications and Innovations in Intelligent Systems XVIII - AI 2010, 30th SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence (pp. 379–392). Springer London. https://doi.org/10.1007/978-0-85729-130-1_29
