Efficient visualization of document streams

Abstract

In machine learning and data mining, multidimensional scaling (MDS) and MDS-like methods are extensively used for dimensionality reduction and for gaining insights into overwhelming amounts of data through visualization. With the growth of the Web and the activities of Web users, the amount of data is not only growing exponentially but is also becoming available in the form of streams, where new data instances constantly flow into the system, requiring the algorithm to update the model in near real time. This paper presents an algorithm for document stream visualization through an MDS-like distance-preserving projection onto a 2D canvas. The visualization algorithm is essentially a pipeline employing several methods from machine learning. Experimental verification shows that each stage of the pipeline is able to process a batch of documents in constant time. It is shown that in the experimental setting, with a limited buffer capacity and a constant document batch size, it is possible to process roughly 2.5 documents per second, which corresponds to approximately 25% of the entire blogosphere rate and should be sufficient for most real-life applications. © 2010 Springer-Verlag.
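
To illustrate what a distance-preserving projection onto a 2D canvas involves, the following is a minimal sketch of classical stress majorization (SMACOF) in plain Python. It is not the pipeline described in the paper, which combines several machine learning methods and updates incrementally on a stream. The function names (`pairwise`, `stress`, `smacof_step`, `project_2d`), the use of Euclidean distance, the iteration count, and the random starting layout are all assumptions made for illustration.

```python
import math
import random


def pairwise(points):
    """Euclidean distance matrix for a list of equal-length coordinate tuples."""
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q))) for q in points]
            for p in points]


def stress(layout, target):
    """Raw stress: squared error between layout distances and target distances."""
    current = pairwise(layout)
    n = len(layout)
    return sum((current[i][j] - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def smacof_step(layout, target):
    """One Guttman-transform update; it does not increase the raw stress."""
    n = len(layout)
    current = pairwise(layout)
    updated = []
    for i in range(n):
        x = y = 0.0
        for j in range(n):
            if i == j or current[i][j] == 0.0:
                continue  # coincident points contribute nothing to the update
            ratio = target[i][j] / current[i][j]
            x += ratio * (layout[i][0] - layout[j][0])
            y += ratio * (layout[i][1] - layout[j][1])
        updated.append((x / n, y / n))
    return updated


def project_2d(target, iterations=200, seed=0):
    """Iterate SMACOF steps from a seeded random 2D layout (assumed settings)."""
    rng = random.Random(seed)
    layout = [(rng.random(), rng.random()) for _ in target]
    for _ in range(iterations):
        layout = smacof_step(layout, target)
    return layout
```

Calling `project_2d(pairwise(vectors))` on a small batch of feature vectors returns one (x, y) pair per item. A streaming variant might rerun a few such steps on a fixed-size buffer as each batch arrives, but that is a guess at the general idea rather than the authors' stated design.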

Cite

Grčar, M., Podpečan, V., Juršič, M., & Lavrač, N. (2010). Efficient visualization of document streams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6332 LNAI, pp. 174–188). https://doi.org/10.1007/978-3-642-16184-1_13
