ABOUT ME

Hello! I am a PhD candidate at Rochester Institute of Technology (RIT), where I conduct research under the supervision of Dr. Christopher Kanan. My research focuses on deep learning, with an emphasis on continual/lifelong machine learning. My work has been published at NeurIPS, ICML, TMLR, CVPRW, and CoLLAs, advancing the state of the art in deep learning and continual learning. Broadly, my work is driven by the goal of building efficient, reliable AI systems for real-world applications, ranging from lightweight, on-device solutions to large-scale foundation models and LLMs.

🔮 You can find my CV here.
🔮 Here is my

NEWS

Aug 2025: Our ICML paper "Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning" was accepted as an oral presentation at the ICCV 2025 Workshop on Building Foundation Models You Can Trust (T2FM)! 🔥
Jun 2025: Our paper "A Good Start Matters: Enhancing Continual Learning with Data-Driven Weight Initialization" has been accepted to CoLLAs 2025! ๐ŸŽ‰
Jun 2025: Our paper "Improving Multimodal Large Language Models Using Continual Learning" has been accepted to CoLLAs 2025! ๐ŸŽ‰
May 2025: Our paper "Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning" was accepted at ICML 2025! 🔥
Apr 2025: Successfully defended my dissertation proposal and advanced to candidacy. 😊
Sep 2024: Our paper "What Variables Affect Out-Of-Distribution Generalization in Pretrained Models?" was accepted at NeurIPS 2024! 🔥

RESEARCH

ICML 2025: Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning

Md Yousuf Harun, Jhair Gallardo, Christopher Kanan
[Oral at T2FM Workshop @ ICCV 2025]

Out-of-distribution (OOD) detection and OOD generalization are widely studied in deep learning, yet their relationship remains poorly understood. We empirically show that the degree of Neural Collapse (NC) in a network layer affects these two objectives in opposite ways: stronger NC improves OOD detection but hurts OOD generalization, while weaker NC does the reverse. This trade-off suggests that a single feature space cannot serve both tasks well. To address this, we develop a theoretical framework linking NC to these objectives and propose a method that controls NC across layers, using entropy regularization for OOD generalization and a fixed Simplex ETF projector for OOD detection.
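The two components above can be sketched as follows. This is a minimal, illustrative sketch, not the paper's implementation: the feature dimensions, the 0.1 loss weight, and the exact form of the entropy term (here applied to a softmax over backbone features) are assumptions.

```python
# Illustrative sketch: a frozen Simplex ETF projector (to strengthen NC where it
# helps OOD detection) plus an entropy regularizer (to weaken NC where it hurts
# OOD generalization). Dimensions and weights are assumptions, not the paper's.
import torch
import torch.nn.functional as F

def simplex_etf(num_classes: int, feat_dim: int) -> torch.Tensor:
    """Build a (feat_dim x num_classes) Simplex ETF matrix with equiangular columns."""
    assert feat_dim >= num_classes
    U, _ = torch.linalg.qr(torch.randn(feat_dim, num_classes))  # orthonormal columns
    M = U @ (torch.eye(num_classes) - torch.full((num_classes, num_classes), 1.0 / num_classes))
    return (num_classes / (num_classes - 1)) ** 0.5 * M

def entropy_regularizer(features: torch.Tensor) -> torch.Tensor:
    """Negative entropy of a softmax over features; adding it to the loss raises entropy."""
    probs = F.softmax(features, dim=1)
    return (probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()

feats = torch.randn(32, 512)                       # backbone features (batch x dim)
etf = simplex_etf(num_classes=100, feat_dim=512)   # fixed projector, never trained
logits = feats @ etf                               # ETF-projected logits for OOD detection
labels = torch.randint(0, 100, (32,))
loss = F.cross_entropy(logits, labels) + 0.1 * entropy_regularizer(feats)
```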

CoLLAs 2025: A Good Start Matters: Enhancing Continual Learning with Data-Driven Weight Initialization

Md Yousuf Harun, Christopher Kanan

Continual learning systems must efficiently learn new concepts while preserving prior knowledge. However, randomly initializing classifier weights for new categories causes instability and high initial loss, requiring prolonged training. Inspired by Neural Collapse, we propose a data-driven weight-initialization strategy based on an analytical least-squares solution that aligns the new weights with learned features. This reduces loss spikes and accelerates adaptation.
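A minimal sketch of the idea, assuming a frozen feature extractor and a standard linear classifier head (the exact solver and any regularization used in the paper are not reproduced here):

```python
# Sketch: initialize the weights of newly added classes from a least-squares fit
# between backbone features and one-hot targets, instead of random initialization.
import torch

def data_driven_init(features: torch.Tensor, labels: torch.Tensor, num_new: int) -> torch.Tensor:
    """features: (N, d) embeddings of new-class samples; labels: (N,) with values in [0, num_new)."""
    targets = torch.nn.functional.one_hot(labels, num_new).float()  # (N, num_new)
    W = torch.linalg.lstsq(features, targets).solution              # closed-form solution, (d, num_new)
    return W.T                                                      # (num_new, d), matches nn.Linear.weight

# Hypothetical usage when the classifier is expanded with 10 new categories.
feats = torch.randn(1024, 512)             # embeddings of new-class samples
labels = torch.randint(0, 10, (1024,))
new_head = torch.nn.Linear(512, 10, bias=False)
with torch.no_grad():
    new_head.weight.copy_(data_driven_init(feats, labels, num_new=10))
```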

CoLLAs 2025: Improving Multimodal Large Language Models Using Continual Learning

Shikhar Srivastava, Md Yousuf Harun, Robik Shrestha, Christopher Kanan

Generative LLMs gain multimodal capabilities by integrating pre-trained vision models, but this often leads to linguistic forgetting. We analyze this issue in the LLaVA MLLM through a continual learning lens, evaluating five methods for mitigating forgetting. Our approach reduces linguistic forgetting by up to 15% while preserving multimodal accuracy. We also demonstrate its robustness on sequential vision-language tasks, maintaining linguistic skills while acquiring new multimodal abilities.

NeurIPS 2024: What Variables Affect Out-of-Distribution Generalization in Pretrained Models?

Md Yousuf Harun*, Kyungbok Lee*, Jhair Gallardo, Giri Krishnan, Christopher Kanan
[* denotes equal contribution]

Embeddings produced by pre-trained deep neural networks (DNNs) are widely used; however, their efficacy for downstream tasks can vary greatly. We study the factors influencing out-of-distribution (OOD) generalization of pre-trained DNN embeddings through the lens of the tunnel effect hypothesis, which suggests that deeper DNN layers compress representations and hinder OOD performance. Contrary to earlier work, we find the tunnel effect is not universal. Our results emphasize the danger of generalizing findings from toy datasets to broader contexts.
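A minimal sketch of the kind of layer-wise probing used to look for a tunnel: linear probes are fit on embeddings taken at different depths of a pretrained network, so that falling probe accuracy on a downstream (OOD) dataset at deeper layers would signal a tunnel. The model choice, layer names, and closed-form probe are assumptions, not the paper's exact protocol.

```python
# Sketch: probe embeddings from successive residual stages of a pretrained ResNet-18.
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()

def embed(x: torch.Tensor, upto: str) -> torch.Tensor:
    """Global-average-pooled features after the named residual stage."""
    h = model.maxpool(model.relu(model.bn1(model.conv1(x))))
    for name in ["layer1", "layer2", "layer3", "layer4"]:
        h = getattr(model, name)(h)
        if name == upto:
            break
    return h.mean(dim=(2, 3))  # (N, channels)

@torch.no_grad()
def probe_accuracy(x: torch.Tensor, y: torch.Tensor, upto: str) -> float:
    """Fit a closed-form least-squares linear probe; a real study would use a held-out split."""
    feats = embed(x, upto)
    targets = nn.functional.one_hot(y, int(y.max()) + 1).float()
    W = torch.linalg.lstsq(feats, targets).solution
    return (feats @ W).argmax(dim=1).eq(y).float().mean().item()

# Tiny random stand-in for a downstream OOD dataset.
x, y = torch.randn(64, 3, 224, 224), torch.randint(0, 5, (64,))
for layer in ["layer1", "layer2", "layer3", "layer4"]:
    print(layer, probe_accuracy(x, y, layer))
```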

CoLLAs 2024: GRASP: A Rehearsal Policy for Efficient Online Continual Learning

Md Yousuf Harun, Jhair Gallardo, Junyu Chen, Christopher Kanan

In this work, we propose a new rehearsal (sample selection) policy called GRASP (GRAdually Select less Prototypical) for efficient continual learning (CL). GRASP is a dynamic rehearsal policy that progressively selects harder samples over time to efficiently update deep neural networks on large-scale data streams in CL settings. GRASP is the first method to outperform uniform balanced sampling on both large-scale vision and NLP datasets. GRASP has the potential to supplant expensive periodic retraining and make on-device CL more efficient.
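A minimal sketch of the core idea, assuming cached embeddings for the buffered samples (the official implementation's per-class scheduling is not reproduced here):

```python
# Sketch: rank buffered samples by distance to their class prototype (mean embedding)
# and rehearse the most prototypical samples first, gradually moving to harder ones.
import torch

def grasp_order(embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Return sample indices ordered from most to least prototypical."""
    dists = torch.empty(len(labels))
    for c in labels.unique():
        mask = labels == c
        proto = embeddings[mask].mean(dim=0)                  # class prototype
        dists[mask] = (embeddings[mask] - proto).norm(dim=1)  # distance to prototype
    return dists.argsort()                                    # prototypical (easy) -> hard

emb = torch.randn(1000, 128)                  # cached embeddings of buffered samples
lab = torch.randint(0, 10, (1000,))
for step, idx in enumerate(grasp_order(emb, lab).split(64)):
    rehearsal_batch = emb[idx], lab[idx]      # ...update the model on this minibatch
```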

TMLR 2024: Overcoming the Stability Gap in Continual Learning

Md Yousuf Harun, Christopher Kanan

Pre-trained deep neural networks (DNNs) are widely deployed by industry to make business decisions and serve users; however, a major problem is model decay. To mitigate model decay, DNNs are retrained from scratch, which is computationally expensive. In this work, we study how continual learning could overcome model decay and reduce computational costs. We identify the stability gap as a major obstacle in our setting, study how to mitigate it, and test a variety of hypotheses. This leads us to discover a method that vastly reduces the stability gap and greatly increases computational efficiency.
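For context, a minimal sketch of how the stability gap is usually quantified (an assumption about the evaluation protocol, not the paper's mitigation method): track accuracy on previously learned data at every update step while new data is being learned, and measure the worst transient drop relative to the accuracy just before the update.

```python
# Sketch: depth of the transient accuracy dip on old data during new-task training.
import torch

def stability_gap(old_task_acc_per_step: torch.Tensor) -> float:
    """old_task_acc_per_step[0] is accuracy just before new-task training begins."""
    return float(old_task_acc_per_step[0] - old_task_acc_per_step.min())

trace = torch.tensor([0.80, 0.55, 0.60, 0.70, 0.78, 0.81])  # hypothetical accuracy trace
print(f"stability gap: {stability_gap(trace):.2f}")          # 0.25
```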

TMLR 2023: SIESTA: Efficient Online Continual Learning with Sleep

Md Yousuf Harun*, Jhair Gallardo*, Tyler L. Hayes, Ronald Kemker, Christopher Kanan
[* denotes equal contribution, also presented at the Journal Track of CoLLAs 2024]

For continual learning (CL) to make a real-world impact, CL systems need to be computationally efficient and rival traditional offline learning systems retrained from scratch. Towards that goal, we propose a novel online CL algorithm named SIESTA. SIESTA uses a wake/sleep framework for training, which is well aligned with the needs of on-device learning. SIESTA is far more computationally efficient than existing methods, enabling CL on ImageNet-1K in under 2 hours; moreover, it achieves "zero forgetting" by matching the performance of the joint model, a milestone critical to driving adoption of CL in real-world applications.
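A minimal sketch of the wake/sleep structure, with assumed components (frozen backbone, embedding buffer, linear head); it is not the released SIESTA code:

```python
# Sketch: cheap online updates during the wake phase, rehearsal-based consolidation
# of buffered embeddings during the sleep phase.
import torch
import torch.nn as nn

class WakeSleepLearner:
    def __init__(self, backbone: nn.Module, head: nn.Linear, buffer_size: int = 10_000):
        self.backbone, self.head = backbone.eval(), head   # frozen backbone, trainable head
        self.buffer: list[tuple[torch.Tensor, int]] = []   # stored (embedding, label) pairs
        self.buffer_size = buffer_size
        self.opt = torch.optim.SGD(self.head.parameters(), lr=0.01)

    @torch.no_grad()
    def wake(self, x: torch.Tensor, y: torch.Tensor) -> None:
        """Online step: cache embeddings and nudge class weights toward them."""
        for zi, yi in zip(self.backbone(x), y.tolist()):
            self.buffer.append((zi, yi))
            self.head.weight[yi] += 0.01 * (zi - self.head.weight[yi])  # prototype-style update
        self.buffer = self.buffer[-self.buffer_size:]

    def sleep(self, steps: int = 100, batch: int = 64) -> None:
        """Offline consolidation: SGD on the head over rehearsed buffered embeddings."""
        for _ in range(steps):
            idx = torch.randint(0, len(self.buffer), (batch,))
            z = torch.stack([self.buffer[i][0] for i in idx])
            y = torch.tensor([self.buffer[i][1] for i in idx])
            loss = nn.functional.cross_entropy(self.head(z), y)
            self.opt.zero_grad(); loss.backward(); self.opt.step()
```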

CVPRW 2023: How Efficient Are Today's Continual Learning Algorithms?

Md Yousuf Harun, Jhair Gallardo, Tyler L. Hayes, Christopher Kanan

Continual learning (CL) research has focused on catastrophic forgetting, but a major motivation for CL is efficiently updating deep neural networks (DNNs) with new data rather than retraining from scratch as the dataset grows over time. We study the computational efficiency of existing CL methods and find that many are as expensive as training offline models from scratch, which defeats this efficiency motivation.

PUBLICATIONS

Peer-Reviewed Papers

Dissertation