Tracking topics on social media streams is non-trivial because the number of topics mentioned grows without bound. This complexity is compounded when we want to track such topics against other fast-moving streams. We go beyond traditional small-scale topic tracking and consider a stream of topics against another document stream. We introduce two tracking approaches that are fully applicable to true streaming environments. When tracking 4.4 million topics against 52 million documents in constant time and space, we demonstrate that, counter to expectations, simple single-pass clustering can outperform locality-sensitive hashing for nearest neighbour search on streams.
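To make the idea of single-pass clustering in bounded space concrete, the following is a minimal illustrative sketch and not the authors' implementation. The class name `SinglePassClusterer`, the cosine similarity measure, the `threshold` and `max_clusters` parameters, and the evict-the-oldest policy are all assumptions chosen for illustration. Each incoming document is compared once against the stored centroids and is either merged into the closest one or used to open a new cluster; capping the number of clusters keeps memory and per-document work bounded.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SinglePassClusterer:
    """Hypothetical single-pass clusterer with a fixed cluster budget."""

    def __init__(self, threshold=0.5, max_clusters=1000):
        self.threshold = threshold        # assumed similarity cut-off
        self.max_clusters = max_clusters  # assumed cap that bounds memory
        self.centroids = {}               # cluster id -> summed term counts
        self._next_id = 0

    def add(self, tokens):
        """Assign one document to a cluster and return the cluster id."""
        vec = Counter(tokens)
        best_id, best_sim = None, 0.0
        for cid, centroid in self.centroids.items():
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best_id, best_sim = cid, sim
        if best_id is not None and best_sim >= self.threshold:
            # Close enough: fold the document into the existing centroid.
            self.centroids[best_id].update(vec)
            return best_id
        if len(self.centroids) >= self.max_clusters:
            # Budget exhausted: drop the oldest cluster (dicts keep
            # insertion order), an assumed eviction policy.
            del self.centroids[next(iter(self.centroids))]
        cid = self._next_id
        self._next_id += 1
        self.centroids[cid] = vec
        return cid
```

Because every document is seen exactly once and at most `max_clusters` centroids are scanned, the cost per document does not grow with the length of the stream, which is the property a true streaming setting requires.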
CITATION STYLE
Wurzer, D., Lavrenko, V., & Osborne, M. (2015). Tracking unbounded topic streams. In ACL-IJCNLP 2015 - 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Proceedings of the Conference (Vol. 1, pp. 1765–1773). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/p15-1170