MISSING VALUES AND OUTLIERS IN RESEARCH DATA

  • Rashid J
  • Khalid Waheed

Abstract

Key takeaway: Journals should provide guidelines for reporting and handling missing values and outliers, and researchers should pay close attention to their data collection methods, correct use of equipment, and data entry into software.

Missing values and outliers often come to light when data are analyzed for research purposes. Missing values can arise for several reasons, e.g., several people entered the data into the software, the software malfunctioned, or participants skipped or did not respond to some questions. Outliers may result from errors in data collection or data entry, equipment problems, or participants answering incorrectly, or they may be true outliers. Whatever the reason, the issue is frustrating for the researcher because it can undermine the reliability of the results. Karahalios et al. reported in their review article that only 35 (43%) of the 82 papers included in the study disclosed how missing data were handled. There are three types of missing data in research studies: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). Cases with missing values can be dropped from the data; according to Schafer and Bennet, dropping 5% or 10% of the data, respectively, does not bias the results. Alternatively, missing values can be replaced (imputed) in software by the series mean, the mean of nearby points, the median of nearby points, linear interpolation, or the linear trend at the point. Each type of missing data calls for its own handling, and the researcher should take care to handle missing values so that the results of the study are not compromised. Outliers are usually identified while examining normality plots such as boxplots and histograms. They can be managed by trimming (dropping the outlying values and analyzing the rest of the data) or by winsorization (replacing each outlier with a nearby value).
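The two outlier strategies named above can be illustrated with a minimal Python sketch. It is not taken from the article: the 1.5 × IQR boxplot fences, the function names, and the blood pressure readings are all assumptions chosen for illustration, and other cut-offs or percentile-based winsorization would be equally valid.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) boxplot fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def trim(values, k=1.5):
    """Trimming: drop every value that falls outside the fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if lower <= v <= upper]


def winsorize(values, k=1.5):
    """Winsorization: replace each outlier with the nearest non-outlying value."""
    kept = trim(values, k)
    low, high = min(kept), max(kept)
    return [min(max(v, low), high) for v in values]


# Hypothetical systolic blood pressure readings (mmHg); 410 looks like an entry error.
readings = [118, 122, 125, 119, 130, 127, 121, 410, 124, 126]

print(trim(readings))       # the 410 reading is dropped, 9 values remain
print(winsorize(readings))  # the 410 reading is replaced by 130, 10 values remain
```

Trimming shrinks the sample, whereas winsorization keeps the sample size but pulls the extreme value inward. Whichever is chosen should be reported alongside the results.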
True outliers, which cannot be overlooked, can create serious problems and alter the study's results. Unfortunately, the literature offers little guidance on handling true outliers. Journals, in my view, should provide guidelines for reporting and handling missing values and outliers so that researchers disclose this information. Researchers, for their part, must pay close attention to their data collection methods, correct use of equipment, and data entry into software.
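As a rough illustration of the imputation options listed in the abstract (series mean, mean or median of nearby points, linear interpolation), the following Python sketch fills gaps in a short series. It is not the authors' procedure: the function names, the default window of two points on each side, and the sample series are hypothetical, and statistical packages may define "nearby points" differently.

```python
import statistics


def series_mean(values):
    """Replace each missing value (None) with the mean of all observed values."""
    mean = statistics.fmean([v for v in values if v is not None])
    return [mean if v is None else v for v in values]


def impute_nearby(values, span=2, stat=statistics.fmean):
    """Replace each None with stat() of the observed values within
    `span` positions on either side (mean by default, or pass statistics.median)."""
    out = []
    for i, v in enumerate(values):
        if v is not None:
            out.append(v)
            continue
        window = values[max(0, i - span):i] + values[i + 1:i + 1 + span]
        neighbours = [w for w in window if w is not None]
        # Leave the gap unfilled if no neighbour in the window was observed.
        out.append(stat(neighbours) if neighbours else None)
    return out


def linear_interpolation(values):
    """Fill gaps on a straight line between the nearest observed points.
    Leading and trailing gaps are left as None."""
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = values[left] + step * (i - left)
    return out


# Hypothetical repeated measurements with three missing entries.
series = [10.0, 12.0, None, 16.0, 18.0, None, None, 24.0]

print(series_mean(series))
print(impute_nearby(series))                         # mean of nearby points
print(impute_nearby(series, stat=statistics.median)) # median of nearby points
print(linear_interpolation(series))
```

These single-value replacements are simple to apply, but their suitability depends on whether the data are MCAR, MAR, or MNAR, so the chosen method should be disclosed.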

Cite

Rashid, J., & Khalid Waheed. (2021). MISSING VALUES AND OUTLIERS IN RESEARCH DATA. Pakistan Postgraduate Medical Journal, 31(04), 167. https://doi.org/10.51642/ppmj.v31i04.404
