Benchmarking fast-data platforms for the aadhaar biometric database

1Citations
Citations of this article
16Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Aadhaar is the world’s largest biometric database with a billion records, being compiled as an identity platform to deliver social services to residents of India. Aadhaar processes streams of biometric data as residents are enrolled and updated. Besides ∼1 million enrollments and updates per day, up to 100 million daily biometric authentications are expected during delivery of various public services. These form critical Big Data applications, with large volumes and high velocity of data. Here, we propose a stream processing workload, based on the Aadhaar enrollment and Authentication applications, as a Big Data benchmark for distributed stream processing systems. We describe the application composition, and characterize their task latencies and selectivity, and data rate and size distributions, based on real observations. We also validate this benchmark on Apache Storm using synthetic streams and simulated application logic. This paper offers a unique glimpse into an operational national identity infrastructure, and proposes a benchmark for “fast data” platforms to support such eGovernance applications.

Cite

CITATION STYLE

APA

Simmhan, Y., Shukla, A., & Verma, A. (2016). Benchmarking fast-data platforms for the aadhaar biometric database. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10044, pp. 21–39). Springer Verlag. https://doi.org/10.1007/978-3-319-49748-8_2

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free