GRASP: A Rehearsal Policy for Efficient Online Continual Learning

Rochester Institute of Technology, University of Rochester


Motivation

Continual learning (CL) in deep neural networks (DNNs) involves incrementally accumulating knowledge in a DNN from a growing data stream. A major challenge in CL is that non-stationary data streams cause catastrophic forgetting of previously learned abilities. A popular solution is rehearsal: storing past observations in a buffer and then sampling the buffer to update the DNN. Uniform sampling in a class-balanced manner is highly effective, and better sample selection policies have been elusive. Here, we propose a new sample selection or rehearsal policy called GRASP (GRAdually Select less Prototypical) that selects the most prototypical (easy) samples first and then gradually selects less prototypical (harder) samples. GRASP has little additional compute or memory overhead compared to uniform selection, enabling it to scale to large datasets. Compared to 17 other rehearsal policies, GRASP achieves higher accuracy in CL experiments on ImageNet. Compared to uniform balanced sampling, GRASP achieves the same performance with 40% fewer updates. We also show that GRASP is effective for CL on five text classification datasets. GRASP has potential to supplant expensive periodic retraining and make on-device CL more efficient.

What Is A Rehearsal Policy?


A rehearsal policy governs construction of mini-batches for updating a deep neural network (DNN) during rehearsal. Goal: Efficiently train a DNN with fewer network updates to reach maximum performance.
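
To make the baseline concrete, below is a minimal sketch of the uniform class-balanced policy that GRASP is compared against throughout. The dict-of-lists buffer layout and the function name are illustrative assumptions, not the exact implementation used in the paper.

import random

def uniform_balanced_batch(buffer, batch_size):
    # Uniform class-balanced policy: pick a class uniformly at random,
    # then pick a stored sample from that class, until the batch is full.
    # `buffer` is assumed to map class label -> list of stored samples
    # (raw images or cached latent features).
    classes = list(buffer.keys())
    batch = []
    while len(batch) < batch_size:
        c = random.choice(classes)
        batch.append(random.choice(buffer[c]))
    return batch

GRASP replaces this uniform draw with an easy-to-hard ordering over each class's stored samples (see "How Does GRASP Work?" below).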

Compute Comparison

GRASP matches the best accuracy of the uniform balanced policy while requiring 40% fewer gradient descent updates for class incremental learning with SIESTA on ImageNet-1K.


Time Comparison

GRASP matches the best accuracy of the uniform balanced policy while requiring 36% less training time for class incremental learning with SIESTA on ImageNet-1K.


How Efficient Is GRASP?


Existing rehearsal policies are computationally expensive and difficult to scale, whereas GRASP is far more efficient and scalable. Unlike GRASP, existing SOTA policies require substantially longer training time than uniform sampling.

GRASP Outperforms Existing SOTA


GRASP outperforms 17 other rehearsal policies in class incremental learning on ImageNet-300 using SIESTA with latent rehearsal. Here, we compare GRASP against the other high-performing policies.

Class Incremental Learning

GRASP outperforms uniform balanced rehearsal in class incremental learning on ImageNet-1K using SIESTA for both veridical and latent rehearsal settings.


Continual IID Learning

GRASP outperforms uniform balanced in continual IID (independent and identically distributed) experiments on ImageNet-1K using SIESTA and latent rehearsal.


Storage Constraints Analysis

GRASP outperforms other rehearsal policies under varied storage constraints in class incremental learning on ImageNet-300 using SIESTA and latent rehearsal.


Compute Constraints Analysis

GRASP surpasses compared methods under varied compute constraints in class incremental learning on ImageNet-300 using SIESTA and latent rehearsal.


GRASP Works Well With Various CL Methods

GRASP outperforms the uniform balanced rehearsal policy when integrated with various rehearsal-based continual learning methods.


GRASP Shows Efficacy For Various Settings

GRASP surpasses the uniform balanced policy for both latent and veridical rehearsal, with or without buffer constraints, when integrated with the SIESTA algorithm.


How Does GRASP Work?


GRASP is based on the hypothesis that choosing only easy or only hard samples is suboptimal and that the DNN benefits from a curriculum that combines both. GRASP first selects the most prototypical (easy) samples from the rehearsal buffer and then gradually selects less prototypical (harder) samples, where easy samples are closest to the class mean and hard samples are farthest from it. We illustrate how GRASP works compared to the uniform random policy. The class mean is denoted by ⭐. Selected samples are indicated by 🔴.
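
The selection rule can be sketched in a few lines. The snippet below assumes that each buffered sample has an embedding (e.g., from the DNN's penultimate layer) and uses Euclidean distance to the class mean, following the description above; the function name and the exact per-class interleaving schedule are illustrative assumptions rather than the paper's precise implementation.

import numpy as np

def grasp_order(features, labels):
    # features: (N, D) embeddings of buffered samples; labels: (N,) class ids.
    # Within each class, sort samples by distance to the class mean (prototype),
    # so the most prototypical (easy) samples come first.
    order_per_class = {}
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        prototype = features[idx].mean(axis=0)
        dists = np.linalg.norm(features[idx] - prototype, axis=1)
        order_per_class[c] = idx[np.argsort(dists)]  # easy -> hard

    # Interleave classes so each pass stays class-balanced while the overall
    # curriculum moves gradually from prototypical to less prototypical samples.
    order = []
    longest = max(len(v) for v in order_per_class.values())
    for i in range(longest):
        for idx_sorted in order_per_class.values():
            if i < len(idx_sorted):
                order.append(idx_sorted[i])
    return np.array(order)

Rehearsal mini-batches are then drawn by walking this ordering, so early updates use easy samples and later updates gradually mix in harder ones.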

Why Is GRASP More Effective?


While learning new classes, the representations of old classes change abruptly and drift over time. This abrupt change in old representations causes catastrophic forgetting of old knowledge and is difficult to correct without longer training. As shown in the figure, existing methods exhibit higher representation drift. These methods mostly prioritize difficult samples; for instance, MIR and ASER select samples with maximum interference. Consequently, the old representations are excessively perturbed, especially in the early stage of rehearsal, as indicated by the sharp rise in MSE (mean squared error). In contrast, GRASP reduces representation drift by learning from subsets of increasing difficulty. We measure representation drift as the MSE between penultimate-layer embedding vectors across consecutive training iterations.
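
The drift metric itself is simple. The sketch below assumes a fixed probe set of old-class samples whose penultimate-layer embeddings are recomputed at consecutive training iterations; the function name and tensor shapes are assumptions for illustration.

import torch

def representation_drift(prev_embeddings, curr_embeddings):
    # prev_embeddings, curr_embeddings: (num_probe_samples, dim) tensors of
    # penultimate-layer features for the same probe samples at iterations
    # t-1 and t. A larger value means old representations moved more.
    return torch.mean((curr_embeddings - prev_embeddings) ** 2).item()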

Acknowledgements

This work was supported in part by NSF awards #1909696, #2326491, and #2125362.

News

GRASP has been accepted at the Conference on Lifelong Learning Agents (CoLLAs), 2024 🎉

BibTeX

@article{harun2023grasp,
  title     = {GRASP: A Rehearsal Policy for Efficient Online Continual Learning},
  author    = {Harun, Md Yousuf and Gallardo, Jhair and Chen, Junyu and Kanan, Christopher},
  journal   = {arXiv preprint arXiv:2308.13646},
  year      = {2023}
}