Frames2LoRA SmolVLM Checkpoints

Frames2LoRA: Parametric Video Internalization for Vision-Language Models

Official implementation of Frames2LoRA

Manan Suri  ·  Sarvesh Baskar  ·  Dinesh Manocha

University of Maryland, College Park

Project page  ·  arXiv paper  ·  Code

This repository contains two Frames2LoRA Stage 1 checkpoint files:

  • frames2lora-smolvlm2-500m-best-ce.pt for HuggingFaceTB/SmolVLM2-500M-Video-Instruct
  • frames2lora-smolvlm2-2.2b-best-ce.pt for HuggingFaceTB/SmolVLM2-2.2B-Instruct
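
Each checkpoint is tied to one base model, so it helps to resolve that pairing before loading anything. The following is a minimal sketch, not the authors' loading code. The helper names `base_model_for` and `inspect_checkpoint` are hypothetical, the repo id `MananSuri27/Frames2LoRA-SmolVLM-ckpts` is assumed from the hosting page, and the internal layout of the `.pt` files is not documented here, so the sketch only lists top-level keys rather than assuming a structure.

```python
# Checkpoint file -> base model, exactly as listed above.
CHECKPOINTS = {
    "frames2lora-smolvlm2-500m-best-ce.pt": "HuggingFaceTB/SmolVLM2-500M-Video-Instruct",
    "frames2lora-smolvlm2-2.2b-best-ce.pt": "HuggingFaceTB/SmolVLM2-2.2B-Instruct",
}

# Assumption: repo id taken from the hosting page, not confirmed by the authors.
REPO_ID = "MananSuri27/Frames2LoRA-SmolVLM-ckpts"


def base_model_for(checkpoint_file: str) -> str:
    """Return the base model id that a checkpoint file was trained for."""
    try:
        return CHECKPOINTS[checkpoint_file]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(
            f"Unknown checkpoint {checkpoint_file!r}; expected one of: {known}"
        ) from None


def inspect_checkpoint(checkpoint_file: str) -> list:
    """Download a checkpoint and list its top-level keys (hypothetical helper).

    Imports are deferred so the mapping above works without torch or
    huggingface_hub installed.
    """
    from huggingface_hub import hf_hub_download
    import torch

    base_model_for(checkpoint_file)  # fail early on an unknown file name
    path = hf_hub_download(repo_id=REPO_ID, filename=checkpoint_file)
    # weights_only=True refuses to unpickle arbitrary Python objects; loading
    # may fail if the file stores more than tensors and plain containers.
    state = torch.load(path, map_location="cpu", weights_only=True)
    return list(state.keys()) if isinstance(state, dict) else []
```

Calling `base_model_for("frames2lora-smolvlm2-2.2b-best-ce.pt")` gives the model id to pass to whatever loads the base SmolVLM2 weights; how the checkpoint is then applied to that model is defined by the official code, not by this sketch.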

Cite us

@misc{suri2026frames2loraparametricvideointernalization,
      title={Frames2LoRA: Parametric Video Internalization for Vision-Language Models},
      author={Manan Suri and Sarvesh Baskar and Dinesh Manocha},
      year={2026},
      eprint={2606.04351},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2606.04351},
}