MIRA: Memory-Integrated Reconfigurable Adapters – A Unified Framework for Settings with Multiple Tasks

1IIT Hyderabad 2University of Tübingen 3Tübingen AI Center 4Microsoft Research India
Equal contribution. Majority of the work was done at IIT Hyderabad.
MIRA overview diagram (associative memories + adapters)

MIRA augments a frozen backbone with Hopfield-style associative memories that store adapter weight updates and retrieve an input-conditional affine combination of those updates at inference, enabling rapid task switching and strong retention across DG, DIL, and CIL.

Abstract

We propose Memory-Integrated Reconfigurable Adapters (MIRA), a unified framework that layers Hopfield-style associative memories over a shared backbone. MIRA stores adapter weight deltas as memory values and learns keys post hoc to retrieve sample-wise affine combinations of adapters for any task or domain. By changing only the training objective across settings, MIRA jointly addresses domain generalization (DG), domain-incremental (DIL), and class-incremental (CIL) scenarios, yielding state-of-the-art or competitive results on standard benchmarks while mitigating catastrophic forgetting.
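To make the retrieval concrete, a minimal formalization is sketched below under assumed notation (not the paper's exact symbols): q(·) is the query module, k_t and ΔW_t are the stored key and adapter delta for task/domain t, β is an inverse temperature, and W_0 is the frozen layer weight. Softmax attention is used as one natural Hopfield-style read-out; it yields a convex, hence affine, combination of the stored deltas.

```latex
% Hopfield-style read-out of an input-conditional adapter update (notation assumed).
\alpha_t(x) \;=\; \frac{\exp\!\big(\beta\, q(x)^{\top} k_t\big)}{\sum_{s}\exp\!\big(\beta\, q(x)^{\top} k_s\big)},
\qquad
W(x) \;=\; W_0 \;+\; \sum_{t} \alpha_t(x)\, \Delta W_t .
```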

Beyond accuracy, ablations show both the necessity of explicit associative memory storage and the benefit of key refinement. The design mirrors neuro-inspired task modulation: a single shared substrate is dynamically reconfigured by memory-guided adapters.

Highlights

  • Unified architecture for DG, DIL, and CIL with a single backbone + memories.
  • Associative memory retrieves per-sample adapter ensembles via learned keys.
  • SoTA or competitive results on PACS, VLCS, OfficeHome, DomainNet, iDigits, CORe50, ImageNet-R, DN4IL, and CDDB-hard.
  • Simple two-stage recipe: Adaptation (train adapters) → Consolidation (learn keys & query modules).

Method at a Glance

A memory unit is attached to each transformer layer and stores that layer's task adapters. A lightweight query module maps the layer's activations to memory queries; the memory returns an affine combination of the stored adapters, which modulates the frozen layer weights on the fly. Consolidation then fine-tunes the keys (and query modules) to align retrieval with task performance, so no gradient updates are needed at inference.
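A minimal PyTorch-style sketch of such a memory-augmented layer is shown below. It is illustrative only: the class and attribute names (MemoryAugmentedLinear, query_proj, active_slot) are assumptions rather than the released implementation, adapters are stored as LoRA-style low-rank factors, and retrieval uses softmax attention over learned keys as one plausible Hopfield-style read-out.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryAugmentedLinear(nn.Module):
    """Frozen linear layer modulated by adapter deltas retrieved from a small
    associative memory of (key, low-rank delta) slots. Illustrative sketch only."""

    def __init__(self, base: nn.Linear, num_slots: int, rank: int = 8,
                 key_dim: int = 64, beta: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():        # the backbone layer stays frozen
            p.requires_grad_(False)
        d_out, d_in = base.weight.shape
        # Memory values: LoRA-style low-rank deltas, one slot per task/domain.
        self.A = nn.Parameter(torch.zeros(num_slots, d_out, rank))
        self.B = nn.Parameter(torch.randn(num_slots, rank, d_in) * 0.01)
        # Memory keys and a lightweight query module (trained at consolidation).
        self.keys = nn.Parameter(torch.randn(num_slots, key_dim) * 0.02)
        self.query_proj = nn.Linear(d_in, key_dim)
        self.beta = beta
        self.num_slots = num_slots
        self.active_slot = None                 # hard routing during adaptation

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.active_slot is not None:
            # Adaptation: route every sample to the current task's slot.
            idx = torch.full((x.size(0),), self.active_slot,
                             dtype=torch.long, device=x.device)
            attn = F.one_hot(idx, self.num_slots).to(x.dtype)
        else:
            # Consolidation/inference: soft retrieval over learned keys.
            q = self.query_proj(x)                                  # (batch, key_dim)
            attn = F.softmax(self.beta * q @ self.keys.T, dim=-1)   # (batch, num_slots)
        # Apply a per-sample combination of the stored low-rank deltas.
        h = torch.einsum("sri,bi->bsr", self.B, x)                  # low-rank projection
        adapter_out = torch.einsum("bs,sor,bsr->bo", attn, self.A, h)
        return self.base(x) + adapter_out
```

Hard routing via active_slot corresponds to the Adaptation stage described next, where a single task's adapter is trained and written into its slot; the soft path is used during consolidation and at inference.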

Two Stages

  1. Adaptation: Train LoRA-style adapters per task/domain and write them into the layer-wise memories with placeholder keys.
  2. Consolidation: Freeze the adapters; train the keys and query modules with a cross-entropy objective on seen data so that retrieval selects useful adapter mixtures (see the sketch after this list).
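A rough sketch of this two-stage recipe, built on the hypothetical MemoryAugmentedLinear module from the previous section, is given below; the function names, optimizers, and schedules are placeholders, classifier heads and other bookkeeping are omitted, and the paper's exact objectives may differ.

```python
import torch
import torch.nn.functional as F

# MemoryAugmentedLinear is the sketch class from the "Method at a Glance" section.

def adaptation(model, task_id, loader, epochs=5, lr=1e-3):
    """Stage 1: train one task's adapter slot with hard routing;
    keys and query modules keep their placeholder values."""
    mems = [m for m in model.modules() if isinstance(m, MemoryAugmentedLinear)]
    for m in mems:
        m.active_slot = task_id
        m.A.requires_grad_(True)
        m.B.requires_grad_(True)
    # weight_decay=0 so other tasks' slots are not shrunk (their grads are zero).
    opt = torch.optim.AdamW([p for m in mems for p in (m.A, m.B)],
                            lr=lr, weight_decay=0.0)
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    for m in mems:
        m.active_slot = None

def consolidation(model, loader, epochs=3, lr=1e-4):
    """Stage 2: freeze the adapters; train keys and query modules so that
    soft retrieval (cross-entropy on seen data) picks useful adapter mixtures."""
    mems = [m for m in model.modules() if isinstance(m, MemoryAugmentedLinear)]
    params = []
    for m in mems:
        m.A.requires_grad_(False)
        m.B.requires_grad_(False)
        params += [m.keys, *m.query_proj.parameters()]
    opt = torch.optim.AdamW(params, lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
```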

Related Links

Parameter-efficient fine-tuning and memory mechanisms underpin MIRA. See LoRA/adapter methods, Hopfield-based memories, and work on unified continual learning.

Representative resources: AdapterHub, LoRA, Modern Hopfield Networks, Continual Learning.

BibTeX

@inproceedings{agrawal2025mira,
  title     = {MIRA: Memory-Integrated Reconfigurable Adapters: A Unified Framework for Settings with Multiple Tasks},
  author    = {Agrawal, Susmit and Kher, Krishn Vishwas and Mittal, Saksham and Maheshwari, Swarnim and Balasubramanian, Vineeth N.},
  booktitle = {NeurIPS},
  year      = {2025}
}