Software systems for social media analysis provide algorithms and tools for extracting useful knowledge from user-generated social media data. ParSoDA (Parallel Social Data Analytics) is a Java library for developing parallel data analysis applications based on the extraction of useful knowledge from social media data. This library aims at reducing the programming skills necessary to implement scalable social data analysis applications. This work describes how the ParSoDA library has been extended to execute applications on Apache Spark. Using a cluster of 12 workers, the Spark version of the library reduces the execution time of two case study applications exploiting social media data up to 42%, compared to the Hadoop version of the library.
CITATION STYLE
Belcastro, L., Marozzo, F., Talia, D., & Trunfio, P. (2018). Appraising SPARK on large-scale social media analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10659 LNCS, pp. 483–495). Springer Verlag. https://doi.org/10.1007/978-3-319-75178-8_39
Mendeley helps you to discover research relevant for your work.