Pattern recognition systems are trained using a finite number of samples. Reducing feature dimensionality is essential to improve generalization and to optimally exploit the information content of the feature vector. Dimensionality reduction eliminates redundant dimensions that do not convey reliable statistical information for classification, determines a manifold on which projections of the original high-dimensional feature vector exhibit maximal information about the class label, and reduces the complexity of the classifier to help avoid over-fitting. In this chapter we demonstrate computationally efficient algorithms for estimating and optimizing mutual information, specifically for the purpose of learning optimal feature dimensionality reduction solutions in the context of pattern recognition. These techniques and algorithms are applied to the classification of multichannel EEG signals for brain-computer interface design, as well as to sonar imagery for target detection. Results are compared with widely used benchmark alternatives such as LDA and kernel LDA. © 2008 Springer-Verlag Berlin Heidelberg.
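The core idea of ranking features by their mutual information with the class label can be illustrated with a minimal plug-in (histogram-based) estimator. This sketch is not the chapter's estimator (the chapter develops more efficient techniques); the bin count and the toy data below are illustrative assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in MI estimate (in nats) between a continuous feature x
    and discrete labels y, using an equal-width 2-D histogram."""
    y = np.asarray(y)
    classes = {c: i for i, c in enumerate(np.unique(y))}
    # Discretize the feature into `bins` equal-width bins (indices 0..bins-1).
    edges = np.histogram_bin_edges(x, bins=bins)
    x_binned = np.digitize(x, edges[1:-1])
    # Empirical joint distribution over (feature bin, class).
    joint = np.zeros((bins, len(classes)))
    for xi, yi in zip(x_binned, y):
        joint[xi, classes[yi]] += 1
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    # MI is the KL divergence between the joint and the product of marginals.
    nz = p_xy > 0
    return float((p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])).sum())

# Toy example: one feature correlated with the label, one pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)
informative = y + 0.3 * rng.standard_normal(500)  # carries class information
noise = rng.standard_normal(500)                  # independent of the label
X = np.column_stack([informative, noise])
scores = [mutual_information(X[:, j], y) for j in range(X.shape[1])]
```

Because the plug-in MI is a KL divergence, the scores are nonnegative, and the informative feature receives a much higher score than the noise feature; dimensionality reduction methods of the kind described in the chapter optimize such information measures over projections rather than over individual raw features.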
Erdogmus, D., Ozertem, U., & Lan, T. (2008). Information theoretic feature selection and projection. Studies in Computational Intelligence, 83, 1–22. https://doi.org/10.1007/978-3-540-75398-8_1