Finding one's best crowd: Online learning by exploiting source similarity

Abstract

We consider an online learning problem (classification or prediction) involving disparate sources of sequentially arriving data, whereby a user over time learns the best set of data sources to use in constructing the classifier by exploiting their similarity. We first show that, when (1) the similarity information among data sources is known, and (2) data from different sources can be acquired without cost, then a judicious selection of data from different sources can effectively enlarge the training sample size compared to using a single data source, thereby improving the rate and performance of learning; this is achieved by bounding the classification error of the resulting classifier. We then relax assumption (1) and characterize the loss in learning performance when the similarity information must also be acquired through repeated sampling. We further relax both (1) and (2) and present a cost-efficient algorithm that identifies a best crowd from a potentially large set of data sources in terms of both classifier performance and data acquisition cost. This problem has various applications, including online prediction systems with time series data of various forms, such as financial markets, advertisement and network measurement.
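
The first setting in the abstract (known similarity, free data) can be illustrated with a small sketch. This is not the authors' algorithm: the threshold rule, the function names `select_crowd` and `pooled_estimate`, the dissimilarity matrix, and the sample values are all assumptions chosen for illustration. The sketch only shows how pooling data from sufficiently similar sources enlarges the effective sample behind an estimate.

```python
from typing import Dict, List, Tuple


def select_crowd(target: int,
                 dissimilarity: List[List[float]],
                 threshold: float) -> List[int]:
    """Return the sources whose known dissimilarity to `target` is at most
    `threshold` (hypothetical selection rule; the target is always included
    because its dissimilarity to itself is 0)."""
    return [j for j, d in enumerate(dissimilarity[target]) if d <= threshold]


def pooled_estimate(samples: Dict[int, List[float]],
                    crowd: List[int]) -> Tuple[float, int]:
    """Average the labels pooled from every source in `crowd` and report
    how many samples the average is based on."""
    pooled = [x for j in crowd for x in samples[j]]
    return sum(pooled) / len(pooled), len(pooled)


# Illustrative (made-up) pairwise dissimilarities between three sources.
dissimilarity = [
    [0.00, 0.05, 0.40],
    [0.05, 0.00, 0.45],
    [0.40, 0.45, 0.00],
]

# Illustrative (made-up) binary labels observed so far from each source.
samples = {
    0: [1.0, 0.0, 1.0, 1.0],
    1: [1.0, 1.0, 0.0, 1.0],
    2: [0.0, 0.0, 0.0, 1.0],
}

crowd = select_crowd(0, dissimilarity, threshold=0.1)
estimate, n = pooled_estimate(samples, crowd)
print(crowd, estimate, n)
```

With these values, source 0 pools its data with source 1 and leaves out the dissimilar source 2, doubling the number of samples behind the estimate. A larger threshold would admit more data but also more mismatch between sources, which is the kind of trade-off the abstract describes.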

Liu, Y., & Liu, M. (2016). Finding one’s best crowd: Online learning by exploiting source similarity. In 30th AAAI Conference on Artificial Intelligence, AAAI 2016 (pp. 1895–1901). AAAI press. https://doi.org/10.1609/aaai.v30i1.10273
