Machine learning models may suffer significant performance degradation when applied to data substantially different from the training data, known as out-of-distribution (OOD) data. One natural criterion for unsupervised OOD detection is the reconstruction error (e.g., thresholded with the 3-sigma rule), which has been used extensively for anomaly detection. However, this criterion is problematic for OOD detection because the reconstruction errors of some OOD instances can be similar to those of the training data. To address this problem, we propose a framework that integrates reconstruction errors with the theory of Local Intrinsic Dimensionality (LID). Specifically, we introduce the use of LID to characterize the data subspaces formed by data samples and their corresponding reconstructions by autoencoders (AEs) as a feature for OOD detection, revealing their localized geometrical properties. The learning histories of a model are realizations of the underlying distance distributions of such data subspaces; their pattern can be captured dimensionally by LID, portraying the model's learning behavior on individual samples. The framework incorporates the reconstruction loss in combination with LID for greater robustness, providing a global measure in addition to the localized one. Extensive empirical studies validate the feasibility of using LID to characterize learning histories and demonstrate the effectiveness of the proposed framework.
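As a rough illustration of the two ingredients named in the abstract, the sketch below computes a maximum-likelihood LID estimate (the standard k-nearest-neighbour MLE estimator) alongside an autoencoder's reconstruction error. The function names (`lid_mle`, `ood_scores`), the choice of the training set as the reference neighbourhood, and the generic `autoencoder` callable are assumptions for illustration, not the authors' exact procedure.

```python
# Hedged sketch (not the authors' code): per-sample reconstruction error plus a
# k-NN maximum-likelihood LID estimate, the two features the abstract combines.
import numpy as np

def lid_mle(query, reference, k=20):
    """MLE (Hill-type) LID estimate of each query point w.r.t. a reference set.

    Assumes query points are not contained in `reference`; otherwise the zero
    self-distance should be dropped before taking the k nearest neighbours.
    """
    # Pairwise Euclidean distances, shape (n_query, n_reference).
    d = np.linalg.norm(query[:, None, :] - reference[None, :, :], axis=-1)
    d = np.sort(d, axis=1)[:, :k]      # distances to the k nearest neighbours
    r_k = d[:, -1:]                    # distance to the k-th neighbour
    # LID_hat = -( (1/k) * sum_i log(r_i / r_k) )^(-1)
    return -1.0 / np.mean(np.log(d / r_k + 1e-12), axis=1)

def ood_scores(autoencoder, x, x_train, k=20):
    """Return (reconstruction_error, lid) per sample in x.

    `autoencoder` is assumed to be any callable mapping a NumPy batch to its
    reconstructions (e.g., a trained AE's forward pass).
    """
    recon = autoencoder(x)
    rec_err = np.mean((x - recon) ** 2, axis=tuple(range(1, x.ndim)))
    # One possible choice of reference set: the flattened training data.
    lid = lid_mle(x.reshape(len(x), -1), x_train.reshape(len(x_train), -1), k=k)
    return rec_err, lid
```

Note that the paper characterizes LID over the learning histories of the model (i.e., across training), whereas this sketch shows only a single-snapshot estimate of the two features.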
CITATION
Wang, Q., Erfani, S. M., Leckie, C., & Houle, M. E. (2021). A dimensionality-driven approach for unsupervised out-of-distribution detection. In SIAM International Conference on Data Mining, SDM 2021 (pp. 118–126). Society for Industrial and Applied Mathematics (SIAM). https://doi.org/10.1137/1.9781611976700.14