Design Considerations for a Sustainable Scholarly Big Data Service

Abstract

Advances in web programming techniques, such as Ajax and jQuery, and in datastores, such as Apache Solr and Elasticsearch, have made it much easier to deploy small- to medium-scale web-based search engines. However, developing a sustainable search engine that supports scholarly big data services remains challenging, often because of limited human resources and financial support. Such constraints are typical in academic settings and small businesses. Here, we show how four key design decisions were made by trading off competing factors such as performance, cost, and efficiency when developing the Next Generation CiteSeerX (NGX), the successor of CiteSeerX, a pioneering digital library search engine that has served academic communities for more than two decades. This work extends our previous work in Wu et al. (2021) and discusses design considerations for infrastructure, web applications, indexing, and document filtering. These design considerations can be generalized to other web-based search engines of a similar scale that are deployed in small business or academic settings with limited resources.

Cite

Wu, J., Rohatgi, S., Angadi, M. K., Puranik, K. S., & Giles, C. L. (2022). Design Considerations for a Sustainable Scholarly Big Data Service. In ACM International Conference Proceeding Series (pp. 83–87). Association for Computing Machinery. https://doi.org/10.1145/3574318.3574340
