Evaluation of Relational and NoSQL Approaches for Cohort Identification from Heterogeneous Data Sources in the National Sleep Research Resource

  • Zeng N
  • Zhang G
  • Li X
  • et al.

Abstract

Patient cohort discovery across heterogeneous data sources is a challenging task that may involve a complicated process of data loading, harmonization, and querying. Most existing cohort identification tools store patient data in a relational database queried with SQL. However, SQL databases limit the maximum number of columns in a table, so high-dimensional data must be broken down into multiple tables, which affects query performance. In this paper, we propose two NoSQL-based patient cohort query systems, based on an existing SQL-based system that provides a cross-cohort query interface for the National Sleep Research Resource (NSRR). We used eight NSRR datasets in our experiment to evaluate the performance of the NoSQL-based and SQL-based systems in data loading, harmonization, and querying. Our experiment showed that the NoSQL-based approaches outperformed the SQL-based approach, and that NoSQL-based systems are promising for developing patient cohort query systems across heterogeneous data sources.
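
The column-limit issue described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the variable names, the tiny column limit (MAX_COLS), and the in-memory SQLite tables are all assumptions chosen for illustration, and a plain Python dictionary stands in for a NoSQL document.

```python
import sqlite3

MAX_COLS = 3  # hypothetical per-table column limit, kept tiny for illustration

# One hypothetical patient record with more variables than one table may hold.
record = {"age": 54, "bmi": 31.2, "ahi": 17.5, "sbp": 128, "smoker": 1}


def chunk(names, size):
    """Split a list of variable names into groups of at most `size`."""
    return [names[i:i + size] for i in range(0, len(names), size)]


conn = sqlite3.connect(":memory:")
groups = chunk(list(record), MAX_COLS)

# Relational layout: the record is spread over several tables keyed by pid.
for t, cols in enumerate(groups):
    col_defs = ", ".join(f"{c} REAL" for c in cols)
    conn.execute(f"CREATE TABLE part{t} (pid INTEGER PRIMARY KEY, {col_defs})")
    placeholders = ", ".join("?" for _ in cols)
    conn.execute(
        f"INSERT INTO part{t} (pid, {', '.join(cols)}) VALUES (?, {placeholders})",
        [1] + [record[c] for c in cols],
    )

# Relational query: age and smoker live in different tables, so a JOIN is needed.
sql_hits = conn.execute(
    "SELECT part0.pid FROM part0 JOIN part1 ON part0.pid = part1.pid "
    "WHERE part0.age > 50 AND part1.smoker = 1"
).fetchall()

# Document-style query: the whole record is one object, so no join is required.
documents = [{"pid": 1, **record}]
doc_hits = [d["pid"] for d in documents if d["age"] > 50 and d["smoker"] == 1]
```

Both queries select the same patient, but the relational version needs one join per extra table touched by the criteria, which is the overhead the abstract attributes to splitting high-dimensional data.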

Cite

Zeng, N., Zhang, G. Q., Li, X., & Cui, L. (2017). Evaluation of Relational and NoSQL Approaches for Cohort Identification from Heterogeneous Data Sources in the National Sleep Research Resource. Journal of Health & Medical Informatics, 08(05). https://doi.org/10.4172/2157-7420.1000295