Abstract
Background: Scientific data, the cornerstone of scientific endeavors, face management challenges amid technological advances. Although retractions have been analyzed, a rigorous focus on the data problems that lead to them is missing. Methods: This study collected 49,979 retraction records up to 17 December 2023. After screening, 16,842 records were related to data problems and 19,656 were due to other reasons. Descriptive statistics and hypothesis testing were applied, and BERTopic (Bidirectional Encoder Representations from Transformers Topic Modelling) was used to conduct a topic analysis of article titles. Results: Since 2000, retractions due to data problems have increased significantly (p < 0.001), with the percentage in 2023 exceeding 75%. Among the 16,842 data-related retractions, 59.0% were in Basic Life Sciences and 40.2% in Health Sciences. The data problems involve accuracy, reliability, validity, and integrity. Data-related and other retractions differ significantly (p < 0.001) in subjects, journal quartiles, retraction intervals, and other characteristics. Data-related retractions are more concentrated in high-impact journals (Q1 37.6% and Q2 43.0%). Conclusions: To address these data problems, institutions, publishers, and journals should adopt image-screening tools, enforce data deposition, standardize retraction notices, provide ethics training, and strengthen peer review, thereby guiding better data management and healthier scientific development.
Hu, W., Yan, G., Zhang, J., Chen, Z., Qian, Q., & Wu, S. (2025). Analysis of scientific paper retractions due to data problems: Revealing challenges and countermeasures in data management. Accountability in Research. https://doi.org/10.1080/08989621.2025.2531987