Gradient Regularization with Multivariate Distribution of Previous Knowledge for Continual Learning

Abstract

Continual learning is a novel learning setup for environments where data are introduced sequentially and a model continually learns new tasks. However, the model forgets previously learned knowledge as it learns new classes. One approach keeps a few samples of previous data, but this causes other problems such as overfitting and class imbalance. In this paper, we propose a method that retrains a network with representations generated from an estimated multivariate Gaussian distribution. The representations are feature vectors from a CNN trained with gradient regularization to prevent distribution shift, so that the stored means and covariances can produce realistic representations. The generated vectors cover every class seen so far, which helps prevent forgetting. Our 6-fold cross-validation experiments show that the proposed method outperforms existing continual learning methods by 1.14%p and 4.60%p on CIFAR-10 and CIFAR-100, respectively. Moreover, we visualize the generated vectors with t-SNE to confirm the validity of a multivariate Gaussian mixture for estimating the distribution of the data representations.
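To make the replay idea concrete, below is a minimal NumPy sketch (not the authors' implementation) of class-conditional Gaussian replay: estimate a mean and covariance for each class from CNN feature vectors, then sample synthetic representations for all previously seen classes when retraining the classifier. The function names and the diagonal ridge term are illustrative assumptions.

```python
import numpy as np

# Sketch of class-conditional Gaussian replay (illustrative, not the paper's code).
# After finishing a task, store the mean and covariance of each class's CNN
# feature vectors; later, sample synthetic representations from those Gaussians
# and mix them with new-task features when retraining the classifier head.

def estimate_class_statistics(features, labels):
    """Estimate a multivariate Gaussian (mean, covariance) per class.

    features: (N, D) array of CNN representations
    labels:   (N,) array of integer class labels
    """
    stats = {}
    for c in np.unique(labels):
        class_feats = features[labels == c]
        mean = class_feats.mean(axis=0)
        # A small ridge on the diagonal keeps the covariance well-conditioned.
        cov = np.cov(class_feats, rowvar=False) + 1e-4 * np.eye(class_feats.shape[1])
        stats[int(c)] = (mean, cov)
    return stats

def sample_replay_representations(stats, samples_per_class, rng=None):
    """Draw synthetic feature vectors for every class seen so far."""
    rng = np.random.default_rng() if rng is None else rng
    feats, labels = [], []
    for c, (mean, cov) in stats.items():
        feats.append(rng.multivariate_normal(mean, cov, size=samples_per_class))
        labels.append(np.full(samples_per_class, c))
    return np.concatenate(feats), np.concatenate(labels)

# Example usage (hypothetical arrays of 512-dim features from old tasks):
# old_stats = estimate_class_statistics(old_features, old_labels)
# replay_x, replay_y = sample_replay_representations(old_stats, samples_per_class=100)
```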

Citation (APA)

Kim, T. H., Moon, H. J., & Cho, S. B. (2022). Gradient Regularization with Multivariate Distribution of Previous Knowledge for Continual Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13756 LNCS, pp. 359–368). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-21753-1_35
