Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning

Rochester Institute of Technology, University of Rochester

Accepted to ICML 2025

Motivation

Out-of-distribution (OOD) detection and OOD generalization are widely studied in Deep Neural Networks (DNNs), yet their relationship remains poorly understood. We empirically show that the degree of Neural Collapse (NC) in a network layer affects these two objectives in opposite directions: stronger NC improves OOD detection but degrades generalization, while weaker NC enhances generalization at the cost of detection. This trade-off suggests that a single feature space cannot simultaneously achieve both tasks. To address this, we develop a theoretical framework linking NC to OOD detection and generalization. We show that entropy regularization mitigates NC to improve generalization, while a fixed Simplex Equiangular Tight Frame (ETF) projector enforces NC for better detection. Based on these insights, we propose a method to control NC at different DNN layers. In experiments, our method excels at both tasks across OOD datasets and DNN architectures.

Relationship between NC and OOD Detection/Generalization


In this paper, we show a close inverse relationship between OOD detection and OOD generalization with respect to the degree of representation collapse in DNN layers. The plot illustrates this relationship for a VGG17 pre-trained on ImageNet-100, using four OOD datasets, with collapse and OOD performance measured at various layers. OOD detection has a strong positive Pearson correlation (R = 0.77) with the degree of neural collapse (NC1) in a DNN layer, whereas OOD generalization has a strong negative correlation (R = −0.60). We rigorously examine this inverse relationship and propose a method to control NC at different layers.
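For reference, NC1 is commonly measured as the trace of the within-class covariance times the pseudo-inverse of the between-class covariance, averaged over classes; smaller values indicate stronger collapse. Below is a minimal sketch of this standard estimator, assuming features is an (N, d) array of layer embeddings and labels the corresponding class indices; the names are illustrative and not taken from the paper's released code.

import numpy as np

def nc1_metric(features: np.ndarray, labels: np.ndarray) -> float:
    """Common NC1 estimator: tr(Sigma_W @ pinv(Sigma_B)) / C (smaller = stronger collapse)."""
    classes = np.unique(labels)
    n, d = features.shape
    global_mean = features.mean(axis=0)
    sigma_w = np.zeros((d, d))
    sigma_b = np.zeros((d, d))
    for c in classes:
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        centered = class_feats - class_mean
        sigma_w += centered.T @ centered / n        # within-class covariance
        diff = (class_mean - global_mean)[:, None]
        sigma_b += diff @ diff.T / len(classes)     # between-class covariance
    return float(np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / len(classes))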

Controlling Neural Collapse


Mitigating neural collapse (NC) in the encoder improves OOD generalization, while promoting NC in the projector enhances OOD detection. To jointly optimize both objectives, our method integrates entropy regularization and a fixed simplex ETF projector. Entropy regularization reduces NC in the encoder, thereby improving OOD generalization, whereas the ETF projector induces NC in the final layer, leading to better OOD detection.
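As a rough illustration of the fixed-projector idea, the sketch below builds a simplex-ETF prototype matrix and uses it as a non-trainable classification head in PyTorch. The class and argument names are assumptions, and the paper's exact head (dimensions, normalization, loss) may differ.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FixedETFHead(nn.Module):
    """Non-trainable simplex-ETF head: class prototypes fixed at maximal equiangular separation."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        assert feat_dim >= num_classes
        # Random orthonormal basis U of shape (feat_dim, num_classes).
        u, _ = torch.linalg.qr(torch.randn(feat_dim, num_classes))
        # Simplex ETF: M = sqrt(C / (C - 1)) * U @ (I - 11^T / C); columns are unit-norm prototypes.
        c = num_classes
        etf = (c / (c - 1)) ** 0.5 * u @ (torch.eye(c) - torch.ones(c, c) / c)
        self.register_buffer("prototypes", etf)  # fixed buffer; excluded from gradient updates

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Cosine-similarity logits between projector features and the fixed prototypes.
        return F.normalize(z, dim=-1) @ self.prototypes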

Neural Collapse Correlates with Entropy


The stronger the neural collapse (NC1), the lower the entropy, and vice versa. We analyze different layers of VGG17 networks pre-trained on the ImageNet-100 (ID) dataset. R denotes the Pearson correlation coefficient.
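As a hedged illustration of how embedding entropy can be estimated, the sketch below computes the Shannon entropy of the normalized eigenvalue spectrum of the feature covariance, a common spectral proxy whose exponential equals the effective rank discussed later; the paper's exact estimator may differ.

import numpy as np

def spectral_entropy(features: np.ndarray) -> float:
    """Shannon entropy of the normalized eigenvalues of the embedding covariance."""
    z = features - features.mean(axis=0)
    cov = z.T @ z / (len(z) - 1)
    eig = np.clip(np.linalg.eigvalsh(cov), 0, None)
    p = eig / eig.sum()        # normalize spectrum to a probability distribution
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())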

UMAP Visualization of Embedding


The projector embeddings exhibit much stronger NC (NC1 = 0.393) than the encoder embeddings (NC1 = 2.175), as indicated by the formation of compact clusters around class means. For clarity, we highlight 10 ImageNet classes with distinct colors. Here, we use a VGG17 pre-trained on ImageNet-100.
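A minimal sketch of producing such a UMAP plot, assuming feats is an (N, d) NumPy array of embeddings and labels holds the indices of the 10 highlighted classes; it uses the umap-learn package with illustrative hyperparameters that may not match the paper's settings.

import matplotlib.pyplot as plt
import umap  # pip install umap-learn

def plot_umap(feats, labels, title):
    # Project embeddings to 2D and color points by class index.
    emb = umap.UMAP(n_neighbors=15, min_dist=0.1, random_state=0).fit_transform(feats)
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, cmap="tab10", s=3)
    plt.title(title)
    plt.axis("off")
    plt.show()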

OOD Detection & Generalization Performance

We train various DNNs on ImageNet-100 (ID) and evaluate OOD generalization and detection using eight OOD datasets. Our method effectively optimizes both objectives. The encoder mitigates neural collapse (NC) and enhances OOD generalization, while the ETF projector amplifies NC and improves OOD detection. Compared to baselines, our method consistently improves both OOD detection and generalization across diverse DNN architectures.
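As a hedged sketch of one common way to measure OOD generalization (transfer), the snippet below fits a linear probe on frozen features of an OOD dataset and reports held-out accuracy; the paper's exact transfer protocol may differ.

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def linear_probe_accuracy(train_feats, train_labels, test_feats, test_labels):
    """Fit a linear classifier on frozen encoder features and report test accuracy."""
    probe = LogisticRegression(max_iter=1000).fit(train_feats, train_labels)
    return accuracy_score(test_labels, probe.predict(test_feats))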

OOD Detection


OOD Generalization


UMAP Visualization of ID & OOD Data

The projector exhibits a greater separation between ID and OOD embeddings than the encoder. For clarity, we show ImageNet-10 as ID data and NINCO-64 as OOD data.


Energy Score Distribution of ID & OOD Data

The projector exhibits a greater separation between ID and OOD energy scores than the encoder. For ID and OOD datasets, we show ImageNet-100 and NINCO-64, respectively.
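For context, the energy score is the standard energy-based OOD score, E(x) = -T · logsumexp(logits / T), computed from class logits; ID samples typically receive lower (more negative) energy than OOD samples. A minimal sketch:

import torch

def energy_score(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Energy-based OOD score; lower values are typically assigned to ID data."""
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)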


Energy Score Distribution - Flowers-102 OOD Dataset

The projector creates a greater separation between ID and OOD data and achieves a lower FPR95 than the encoder.
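FPR95 is the standard metric: the false-positive rate on OOD data at the threshold where 95% of ID data is correctly accepted. A minimal sketch, assuming higher scores indicate ID (e.g., negative energy):

import numpy as np

def fpr_at_95_tpr(id_scores: np.ndarray, ood_scores: np.ndarray) -> float:
    """Fraction of OOD samples accepted as ID when 95% of ID samples are accepted."""
    threshold = np.percentile(id_scores, 5)          # 95% of ID scores lie above this
    return float((ood_scores >= threshold).mean())   # OOD samples wrongly accepted as ID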


Energy Score Distribution - STL-10 OOD Dataset

The projector better separates ID and OOD data, achieving a lower FPR95 than the encoder.


Analyzing Entropy Regularization & L2 Normalization

(a) Entropy regularization reduces neural collapse (indicated by higher NC1 values) in the encoder.

(b) Entropy regularization increases the entropy of encoder embeddings; without it, the entropy remains unchanged (see the sketch after this list).

(c) Entropy regularization increases the effective rank of encoder embeddings; without it, the effective rank stays as low as the number of classes (here, 10 ImageNet classes).

(d) L2 normalization increases neural collapse (indicated by lower NC1 values) in the projector.
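The exact regularizer is not reproduced here; as a hedged illustration, one common differentiable proxy for raising embedding entropy is the log-determinant of the batch feature covariance, which penalizes features that collapse onto a low-dimensional subspace. The loss form and weighting below are assumptions, not the paper's implementation.

import torch
import torch.nn.functional as F

def entropy_regularized_loss(logits, labels, encoder_feats, weight=0.1, eps=1e-4):
    """Cross-entropy plus a log-det covariance term as a differentiable entropy proxy."""
    ce = F.cross_entropy(logits, labels)
    z = encoder_feats - encoder_feats.mean(dim=0, keepdim=True)
    cov = z.T @ z / (z.shape[0] - 1)
    cov = cov + eps * torch.eye(cov.shape[0], device=cov.device)  # keep log-det finite
    return ce - weight * torch.logdet(cov)  # maximizing log-det discourages collapse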

Acknowledgements

This work was partly supported by NSF awards #2326491, #2125362, and #2317706.

BibTeX

@article{harun2025controlling,
  title     = {Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning},
  author    = {Harun, Md Yousuf and Gallardo, Jhair and Kanan, Christopher},
  journal   = {arXiv preprint arXiv:2502.10691},
  year      = {2025},
}