Synthetic data generation is critical in machine learning and deep learning research for overcoming small sample or dataset sizes. Previous studies have applied various algorithms, including generative adversarial networks and autoencoder models, to generate artificial datasets. In this study, we propose a deep-learning-based synthetic data generation framework for tabular datasets collected from cognitive psychology behavioral experiments. Tabular datasets from the Stroop task were used to develop the framework. Because the dataset used in our study had a relatively small sample size (N=102), we used a pre-trained generative adversarial network model to supplement it. Furthermore, we proposed and applied five evaluation methods with statistical tests (overlapped sample test, constraint reflection test, correlation reflection test, distribution distance test, and feature distance test) to validate generation performance at each internal level of the table structure (instance level, feature level, and whole-set level). The proposed framework, with a fine-tuned generative adversarial network algorithm, was compared with a random generation method to verify generation performance, including how well the statistical characteristics of the original datasets were reproduced. Across the five evaluation methods, the datasets generated by the proposed framework showed statistical characteristics more similar to those of the original dataset than the randomly generated datasets did. The results of this study provide not only generation algorithms for tabular cognitive psychology datasets but also a solution to the sample size issue for researchers.
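The abstract names the five evaluation methods but does not define them, so the following is only a minimal sketch of what three such checks might look like, not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`overlap_rate`, `correlation_gap`, `ks_statistic`), the toy two-column table standing in for reaction-time data, the fitted Gaussian used as a stand-in generator (it is not a GAN), and the choice of exact row matching, Pearson correlation, and the two-sample Kolmogorov-Smirnov statistic as the measures.

```python
import numpy as np


def overlap_rate(original, synthetic):
    """Fraction of synthetic rows that exactly duplicate some original row."""
    original_rows = {tuple(row) for row in original}
    return float(np.mean([tuple(row) in original_rows for row in synthetic]))


def correlation_gap(original, synthetic):
    """Mean absolute difference between the two Pearson correlation matrices."""
    corr_original = np.corrcoef(original, rowvar=False)
    corr_synthetic = np.corrcoef(synthetic, rowvar=False)
    return float(np.mean(np.abs(corr_original - corr_synthetic)))


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the ECDFs."""
    grid = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))


rng = np.random.default_rng(0)

# Toy "original" table (assumption): 102 rows, two correlated columns that
# loosely imitate reaction times in milliseconds. Not the study's data.
mean = np.array([600.0, 700.0])
cov = np.array([[2500.0, 2000.0], [2000.0, 2500.0]])  # true correlation 0.8
original = rng.multivariate_normal(mean, cov, size=102)

# Stand-in generator (assumption, NOT a GAN): a Gaussian fitted to the original.
fitted = rng.multivariate_normal(
    original.mean(axis=0), np.cov(original, rowvar=False), size=1000
)

# Random baseline: each column drawn independently and uniformly
# within the range observed in the original table.
baseline = rng.uniform(original.min(axis=0), original.max(axis=0), size=(1000, 2))

for name, synthetic in [("fitted", fitted), ("baseline", baseline)]:
    print(
        name,
        "overlap:", overlap_rate(original, synthetic),
        "correlation gap:", round(correlation_gap(original, synthetic), 3),
        "KS (column 0):", round(ks_statistic(original[:, 0], synthetic[:, 0]), 3),
    )
```

In this toy setup the independent uniform baseline cannot reproduce the correlation between the two columns, so its correlation gap should come out larger than that of the fitted stand-in, which is the kind of contrast the comparison against random generation is meant to expose.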
CITATION
Choi, J. G., Nah, Y., Ko, I., & Han, S. (2021). Deep Learning Approach to Generate a Synthetic Cognitive Psychology Behavioral Dataset. IEEE Access, 9, 142489–142505. https://doi.org/10.1109/ACCESS.2021.3120083