About Me

Hello! I am a fifth-year PhD student in the Chester F. Carlson Center for Imaging Science at the Rochester Institute of Technology (RIT) in Rochester, NY. I currently work in the kLab under the supervision of Dr. Christopher Kanan. My research focuses on deep learning, with an emphasis on continual/lifelong machine learning. My work, published at NeurIPS, TMLR, CVPRW, and CoLLAs, advances the state of the art in continual learning and deep learning.

I received an MS in Electrical Engineering from the University of Hawaii and a BS in Electrical Engineering from Khulna University of Engineering and Technology. During my MS, I worked on deep learning applied to medical imaging. My prior work has been published in IEEE NanoMed, ASRM conferences, and Reproductive BioMedicine Journal.

🔮 You can find my CV here.

🔮 Here is my

Latest News

Sep 2024: Our paper "What Variables Affect Out-Of-Distribution Generalization in Pretrained Models?" got accepted at NeurIPS 2024! 🔥
Sep 2024: Our paper "Overcoming the Stability Gap in Continual Learning" got accepted at Transactions on Machine Learning Research (TMLR) 2024! 🎉
Apr 2024: Our paper "GRASP: A Rehearsal Policy for Efficient Online Continual Learning" got accepted at the Conference on Lifelong Learning Agents (CoLLAs) 2024! 🎉
Nov 2023: Our paper "SIESTA: Efficient Online Continual Learning with Sleep" got accepted at Transactions on Machine Learning Research (TMLR) 2023! 🎉
Nov 2023: Won the "Best Student Abstract Award" at IEEE Western New York Image & Signal Processing Workshop 2023. 🎉
Oct 2023: Gave an invited talk on "Towards Efficient Continual Learning in Deep Neural Networks" at RIT Center for Human-aware Artificial Intelligence (CHAI) Seminar Series.
Apr 2023: Our paper "How Efficient Are Today's Continual Learning Algorithms?" got accepted at the CLVision Workshop at CVPR 2023! 🎉
Aug 2020: Got admitted to the Rochester Institute of Technology Imaging Science Ph.D. program! 😊
May 2020: Successfully obtained my MS in Electrical Engineering from the University of Hawaii. 🎓
May 2016: Successfully obtained my BS in Electrical & Electronic Engineering from Khulna University of Engineering & Technology in Khulna, Bangladesh. 🎓


What Variables Affect Out-Of-Distribution Generalization in Pretrained Models?

Md Yousuf Harun, Kyungbok Lee, Jhair Gallardo, Giri Krishnan, Christopher Kanan

Embeddings produced by pre-trained deep neural networks (DNNs) are widely used; however, their efficacy for downstream tasks can vary widely. We study the factors influencing out-of-distribution (OOD) generalization of pre-trained DNN embeddings through the lens of the tunnel effect hypothesis, which suggests deeper DNN layers compress representations and hinder OOD performance. Contrary to earlier work, we find the tunnel effect is not universal. Our results emphasize the danger of generalizing findings from toy datasets to broader contexts.

NeurIPS 2024

GRASP: A Rehearsal Policy for Efficient Online Continual Learning

Md Yousuf Harun, Jhair Gallardo, Junyu Chen, Christopher Kanan

In this work, we propose a new sample selection or rehearsal policy called GRASP (GRAdually Select less Prototypical) for efficient continual learning (CL). GRASP is a dynamic rehearsal policy that progressively selects harder samples over time to efficiently update deep neural networks on large-scale data streams in CL settings. GRASP is the first method to outperform uniform balanced sampling on both large-scale vision and NLP datasets. GRASP has the potential to supplant expensive periodic retraining and make on-device CL more efficient.

CoLLAs 2024

Overcoming the Stability Gap in Continual Learning

Md Yousuf Harun, Christopher Kanan

Pre-trained deep neural networks (DNNs) are being widely deployed by industry to make business decisions and serve users; however, a major problem is model decay. To mitigate model decay, DNNs are retrained from scratch, which is computationally expensive. In this work, we study how continual learning could overcome model decay and reduce computational costs. We identify the stability gap as a major obstacle in our setting. We study how to mitigate the stability gap and test a variety of hypotheses. This leads us to discover a method that vastly reduces the stability gap and greatly increases computational efficiency.

TMLR 2024

SIESTA: Efficient Online Continual Learning with Sleep

Md Yousuf Harun, Jhair Gallardo, Tyler L. Hayes, Ronald Kemker, Christopher Kanan

For continual learning (CL) to make a real-world impact, CL systems need to provide computational efficiency and rival traditional offline learning systems retrained from scratch. Towards that goal, we propose a novel online CL algorithm named SIESTA. SIESTA uses a wake/sleep framework for training, which is well aligned with the needs of on-device learning. SIESTA is far more computationally efficient than existing methods, enabling CL on ImageNet-1K in under 2 hours; moreover, it achieves "zero forgetting" by matching the performance of the joint model (upper bound), a milestone critical to driving adoption of CL in real-world applications.

TMLR 2023

How Efficient Are Today's Continual Learning Algorithms?

Md Yousuf Harun, Jhair Gallardo, Tyler L. Hayes, Christopher Kanan

Continual learning (CL) has focused on catastrophic forgetting, but a major motivation for CL is efficiently updating deep neural networks (DNNs) with new data, rather than retraining from scratch when the dataset grows over time. Our study of the computational efficiency of existing CL methods reveals that many are as expensive as training offline models from scratch. This defeats the efficiency aspect of CL.

CVPR-W 2023



  • S. Srivastava, M.Y. Harun, R. Shrestha, C. Kanan. Improving Multimodal Large Language Models Using Continual Learning.

Peer-Reviewed Papers

