Gradient Regularization with Multivariate Distribution of Previous Knowledge for Continual Learning

Abstract

Continual learning is a novel learning setup for environments where data are introduced sequentially and a model continually learns new tasks. However, the model forgets previously learned knowledge as it learns new classes. One approach keeps a few samples of previous data, but this causes other problems such as overfitting and class imbalance. In this paper, we propose a method that retrains a network with representations generated from an estimated multivariate Gaussian distribution. The representations are feature vectors from a CNN trained with gradient regularization to prevent distribution shift, so that the stored means and covariances can produce realistic representations. The generated vectors cover every class seen so far, which helps prevent forgetting. Our 6-fold cross-validation experiments show that the proposed method outperforms existing continual learning methods by 1.14%p and 4.60%p on CIFAR-10 and CIFAR-100, respectively. Moreover, we visualize the generated vectors with t-SNE to confirm the validity of a multivariate Gaussian mixture for estimating the distribution of the data representations.
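To make the replay idea concrete, below is a minimal NumPy sketch (not the authors' implementation) of class-conditional Gaussian replay: estimate a mean and covariance for each class from CNN feature vectors, then sample synthetic representations for all previously seen classes when retraining the classifier. The function names and the diagonal ridge term are illustrative assumptions.

```python
import numpy as np

# Sketch of class-conditional Gaussian replay (illustrative, not the paper's code).
# After finishing a task, store the mean and covariance of each class's CNN
# feature vectors; later, sample synthetic representations from those Gaussians
# and mix them with new-task features when retraining the classifier head.

def estimate_class_statistics(features, labels):
    """Estimate a multivariate Gaussian (mean, covariance) per class.

    features: (N, D) array of CNN representations
    labels:   (N,) array of integer class labels
    """
    stats = {}
    for c in np.unique(labels):
        class_feats = features[labels == c]
        mean = class_feats.mean(axis=0)
        # A small ridge on the diagonal keeps the covariance well-conditioned.
        cov = np.cov(class_feats, rowvar=False) + 1e-4 * np.eye(class_feats.shape[1])
        stats[int(c)] = (mean, cov)
    return stats

def sample_replay_representations(stats, samples_per_class, rng=None):
    """Draw synthetic feature vectors for every class seen so far."""
    rng = np.random.default_rng() if rng is None else rng
    feats, labels = [], []
    for c, (mean, cov) in stats.items():
        feats.append(rng.multivariate_normal(mean, cov, size=samples_per_class))
        labels.append(np.full(samples_per_class, c))
    return np.concatenate(feats), np.concatenate(labels)

# Example usage (hypothetical arrays of 512-dim features from old tasks):
# old_stats = estimate_class_statistics(old_features, old_labels)
# replay_x, replay_y = sample_replay_representations(old_stats, samples_per_class=100)
```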

Citation (APA)

Kim, T. H., Moon, H. J., & Cho, S. B. (2022). Gradient Regularization with Multivariate Distribution of Previous Knowledge for Continual Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13756 LNCS, pp. 359–368). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-21753-1_35
