Large-Scale Learning from Data Streams with Apache SAMOA



Apache SAMOA (Scalable Advanced Massive Online Analysis) is an open-source platform for mining big data streams. Big data is defined as datasets whose size is beyond the ability of typical software tools to capture, store, manage, and analyze, because of the time and memory complexity involved. Apache SAMOA provides a collection of distributed streaming algorithms for the most common data mining and machine learning tasks, such as classification, clustering, and regression, as well as programming abstractions for developing new algorithms. It features a pluggable architecture that allows it to run on several distributed stream processing engines, such as Apache Flink, Apache Storm, and Apache Samza. Apache SAMOA is written in Java and is available under the Apache Software License version 2.0.
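The pluggable architecture described in the abstract can be illustrated with a small sketch: the learning logic is written once against an abstract engine interface, and the engine that executes it can be swapped. This is only an illustration under stated assumptions. It is written in Python for brevity, whereas SAMOA itself is Java, and every name in it (`Processor`, `Engine`, `LocalEngine`, `MajorityClassLearner`, `run_task`) is hypothetical and not part of SAMOA's actual API. The learner is a deliberately trivial majority-class classifier that predicts each instance before learning from it.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Callable, Iterable, List


class Processor(ABC):
    """Hypothetical unit of streaming logic: handles one event, may emit results."""

    @abstractmethod
    def process(self, event, emit: Callable[[object], None]) -> None:
        ...


class Engine(ABC):
    """Hypothetical execution back end. A distributed implementation would
    delegate to a stream processing engine instead of looping locally."""

    @abstractmethod
    def run(self, source: Iterable, pipeline: List[Processor]) -> List[object]:
        ...


class LocalEngine(Engine):
    """Single-process engine: pushes each event through the pipeline in order."""

    def run(self, source, pipeline):
        outputs = []
        for event in source:
            batch = [event]
            for processor in pipeline:
                next_batch = []
                for item in batch:
                    processor.process(item, next_batch.append)
                batch = next_batch
            outputs.extend(batch)
        return outputs


class MajorityClassLearner(Processor):
    """Toy online classifier: predicts the most frequent label seen so far,
    then updates its counts with the true label of the current instance."""

    def __init__(self):
        self.counts = Counter()

    def process(self, event, emit):
        _features, label = event
        # Predict first (None until at least one label has been seen) ...
        prediction = self.counts.most_common(1)[0][0] if self.counts else None
        emit((prediction, label))
        # ... then learn from the instance.
        self.counts[label] += 1


def run_task(engine: Engine, stream: Iterable) -> List[object]:
    """The task depends only on the Engine interface, not on a concrete engine."""
    return engine.run(stream, [MajorityClassLearner()])


stream = [({"x": 1.0}, "a"), ({"x": 0.9}, "a"), ({"x": 0.1}, "b"), ({"x": 0.8}, "a")]
results = run_task(LocalEngine(), stream)
```

Because `run_task` only sees the `Engine` interface, replacing `LocalEngine` with another implementation would leave the learner untouched, which is the design idea the abstract attributes to the platform.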




Kourtellis, N., Morales, G. D. F., & Bifet, A. (2018, May 26). Large-Scale Learning from Data Streams with Apache SAMOA. arXiv.
