This paper describes a data perturbation method for minimizing the risk that survey participants' identities are disclosed, in the context of the National Assessment of Educational Progress (NAEP). The method differs from most approaches because the survey includes cognitive tasks: a data edit should therefore have minimal impact both on relations among demographic variables and on relations between demographic and proficiency variables. Furthermore, because only a few students are at risk of disclosure in a sampling setting typical of educational surveys, the proposed data perturbation is governed by a nonuniform probabilistic process. The method is applied to NAEP data, and its impact is assessed using proficiency averages, demographic proportions, statistical inference results, and loglinear models. Results show that the proposed perturbation method has very little impact on NAEP results, even at relatively large editing rates. Some data coarsening results are reported as well. While the univariate results are relatively unaffected by the coarsening, loglinear models from higher order contingency tables are affected. It is therefore recommended that disclosure limitation techniques for NAEP be restricted to perturbation methods.
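To make the idea of a nonuniform probabilistic edit concrete, the following is a minimal Python sketch. It is not the procedure from the report, whose details the abstract does not give. It assumes that a record is "at risk" when its combination of demographic values is rare in the sample, and that an edit replaces one demographic value with a draw from that variable's empirical distribution. The function names, the `threshold`, `base_rate`, and `risk_rate` parameters, and their default values are all hypothetical.

```python
import random
from collections import Counter


def edit_probabilities(records, keys, base_rate=0.02, risk_rate=0.5, threshold=2):
    """Assign each record a probability of being selected for an edit.

    Assumption (not from the report): a record is "at risk" when its
    combination of `keys` occurs at most `threshold` times in the sample.
    At-risk records get `risk_rate`; all others get `base_rate`.
    """
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return [
        risk_rate if counts[tuple(r[k] for k in keys)] <= threshold else base_rate
        for r in records
    ]


def perturb(records, keys, target, rng, **rates):
    """Return a perturbed copy of `records` and the indices selected for editing.

    A selected record has its `target` value replaced by a random draw from
    the empirical distribution of `target`, so the replacement can
    occasionally equal the original value.
    """
    probs = edit_probabilities(records, keys, **rates)
    pool = [r[target] for r in records]  # empirical distribution of target
    perturbed, selected = [], []
    for i, (record, p) in enumerate(zip(records, probs)):
        record = dict(record)  # copy so the input is left unchanged
        if rng.random() < p:
            record[target] = rng.choice(pool)
            selected.append(i)
        perturbed.append(record)
    return perturbed, selected


if __name__ == "__main__":
    # Hypothetical toy sample: two common cells and one unique record.
    sample = (
        [{"group": "A", "locale": "urban"} for _ in range(50)]
        + [{"group": "B", "locale": "rural"} for _ in range(50)]
        + [{"group": "C", "locale": "urban"}]
    )
    edited, selected = perturb(sample, ("group", "locale"), "group", random.Random(0))
    print(len(selected), "of", len(sample), "records selected for editing")
```

Because most records receive the low base rate, marginal proportions change little, while rare records are edited far more often. Checking the impact on proficiency averages or loglinear models, as the paper does, would require the proficiency data and is outside this sketch.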
Oranje, A., Freund, D., Lin, M.-J., & Tang, Y. (2007). Disclosure risk in educational surveys: An application to the National Assessment of Educational Progress. ETS Research Report Series, 2007(2), i–24. https://doi.org/10.1002/j.2333-8504.2007.tb02066.x