Abstract
Background: Despite numerous past endeavors for the semantic harmonization of Alzheimer’s disease (AD) cohort studies, an automatic tool has yet to be developed.
Objective: As cohort studies form the basis of data-driven analysis, harmonizing them is crucial for cross-cohort analysis. We aimed to accelerate this task by constructing an automatic harmonization tool.
Methods: We created a common data model (CDM) through cross-mapping data from 20 cohorts, three CDMs, and ontology terms, which was then used to fine-tune a BioBERT model. Finally, we evaluated the model using three previously unseen cohorts and compared its performance to a string-matching baseline model.
Results: Here, we present our AD-Mapper interface for automatic harmonization of AD cohort studies, which outperformed a string-matching baseline on previously unseen cohort studies. We showcase our CDM comprising 1218 unique variables.
Conclusion: AD-Mapper leverages semantic similarities in naming conventions across cohorts to improve mapping performance.
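To make the comparison concrete, the following is a minimal sketch of what a string-matching baseline for variable mapping could look like. It is not the baseline used in the study: the variable names in CDM_VARIABLES, the normalize helper, and the choice of Python's difflib similarity ratio are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical CDM variable names, invented for illustration only.
CDM_VARIABLES = ["Age", "Sex", "Mini-Mental State Examination", "APOE genotype"]


def normalize(name):
    """Lowercase a name and treat underscores and hyphens as spaces."""
    return " ".join(name.lower().replace("_", " ").replace("-", " ").split())


def string_match(variable, candidates=CDM_VARIABLES):
    """Return the candidate most similar to `variable` and its similarity score.

    Similarity is difflib's character-level ratio in [0, 1]; this is an
    assumed stand-in for whatever string metric a real baseline would use.
    """
    query = normalize(variable)
    scored = [
        (SequenceMatcher(None, query, normalize(c)).ratio(), c)
        for c in candidates
    ]
    score, best = max(scored)
    return best, score


if __name__ == "__main__":
    for cohort_variable in ["apoe_genotype", "AGE", "mini mental state exam"]:
        print(cohort_variable, "->", string_match(cohort_variable))
```

A baseline of this kind only rewards surface overlap between names, so it cannot link variables that are named differently but mean the same thing. That is the gap a fine-tuned language model is meant to close.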
Wegner, P., Balabin, H., Ay, M. C., Bauermeister, S., Killin, L., Gallacher, J., … Salimi, Y. (2024). Semantic Harmonization of Alzheimer’s Disease Datasets Using AD-Mapper. Journal of Alzheimer’s Disease, 99(4), 1409–1423. https://doi.org/10.3233/JAD-240116