On the choice of kernel and labelled data in semi-supervised learning methods

Abstract

Semi-supervised learning methods form a category of machine learning methods that use labelled points together with unlabelled data to tune the classifier. The main idea of semi-supervised methods rests on the assumption that the classification function should change smoothly over a similarity graph, which represents relations among data points. This idea can be expressed using kernels on graphs, such as the graph Laplacian. Different semi-supervised learning methods use different kernels, which reflect how the underlying similarity graph influences the classification results. In the present work, we analyse a general family of semi-supervised methods, provide insights into the differences among the methods, and give recommendations for the choice of the kernel parameters and labelled points. In particular, it appears preferable to choose a kernel based on the properties of the labelled points. We illustrate our general theoretical conclusions with an analytically tractable characteristic example, a clustered preferential attachment model, and the classification of content in P2P networks. © 2013 Springer International Publishing.
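
The abstract gives no formulas, so the following is only a minimal sketch of what a kernel-parameterised family of graph-based semi-supervised classifiers can look like. It assumes the closed form F = (1 - alpha) (I - alpha D^{-sigma} W D^{sigma-1})^{-1} Y, where W is the similarity matrix, D the diagonal degree matrix and Y the label indicator matrix. The function name `graph_ssl_scores`, the parameters `alpha` and `sigma`, and the toy graph are illustrative assumptions, not details taken from this page. Under this assumed parameterisation, sigma = 1, 0.5 and 0 are usually associated with the standard Laplacian, normalized Laplacian and PageRank-based methods respectively.

```python
import numpy as np


def graph_ssl_scores(W, Y, alpha=0.9, sigma=0.5):
    """Classification scores for an assumed kernel family on a graph.

    Solves F = (1 - alpha) * (I - alpha * D^{-sigma} W D^{sigma-1})^{-1} Y.
    W: (n, n) symmetric non-negative similarity matrix.
    Y: (n, k) label matrix, Y[i, c] = 1 if node i is labelled with class c.
    """
    W = np.asarray(W, dtype=float)
    Y = np.asarray(Y, dtype=float)
    d = W.sum(axis=1)
    if np.any(d <= 0):
        raise ValueError("every node needs a positive degree")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Scale rows by d^{-sigma} and columns by d^{sigma-1}.
    S = (d ** -sigma)[:, None] * W * (d ** (sigma - 1.0))[None, :]
    n = W.shape[0]
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * S, Y)


# Hypothetical toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0

# One labelled point per class: node 0 -> class 0, node 5 -> class 1.
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[5, 1] = 1.0

for sigma in (1.0, 0.5, 0.0):
    F = graph_ssl_scores(W, Y, alpha=0.9, sigma=sigma)
    # Each node is assigned the class with the largest score.
    print(sigma, F.argmax(axis=1))
```

On this symmetric toy graph every value of sigma assigns each triangle to the class of its own labelled node. The kernels differ only in how scores are weighted by node degree, so differences in the predicted classes would only show up on graphs where the labelled points have unequal degrees.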

Cite

Avrachenkov, K., Gonçalves, P., & Sokol, M. (2013). On the choice of kernel and labelled data in semi-supervised learning methods. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8305 LNCS, pp. 56–67). https://doi.org/10.1007/978-3-319-03536-9_5
