GMT: Goal-Conditioned Multimodal Transformer for 6-DOF Object Trajectory Synthesis in 3D Scenes

Pretrained checkpoint for GMT, trained on the Aria Digital Twin (ADT) dataset.


Paper	arXiv 2603.17993
Project page	https://huajian-zeng.github.io/projects/gmt/
Code	https://github.com/huajian-zeng/gmt
Video	YouTube
Venue	3DV 2026

Files

File	Description
`adt.pth`	PyTorch checkpoint (`{'model_state_dict': ...}`) trained on ADT
`val_sequences.txt`	List of ADT sequences used for validation
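
As an illustration of how `val_sequences.txt` might be consumed, here is a minimal sketch that reads the list and separates validation sequences from the rest. The file layout (one sequence name per line) is an assumption, not something this card documents, and the helper names `read_sequence_list` and `split_sequences` are hypothetical rather than part of the GMT codebase.

```python
from pathlib import Path


def read_sequence_list(path):
    """Return the sequence names listed in a text file.

    Assumption: each non-empty line holds one sequence name. Blank lines and
    lines starting with '#' are skipped. Adjust if the real file differs.
    """
    names = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        name = line.strip()
        if name and not name.startswith("#"):
            names.append(name)
    return names


def split_sequences(all_sequences, val_sequences):
    """Split sequence names into (train, val), preserving the input order."""
    val_set = set(val_sequences)
    train = [s for s in all_sequences if s not in val_set]
    val = [s for s in all_sequences if s in val_set]
    return train, val
```

The path passed to `read_sequence_list` can be the local file returned by `hf_hub_download` with `filename="val_sequences.txt"`. Keeping the validation names in a set makes the membership check cheap even for long sequence lists.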

Usage

Clone the GMT repo and follow its installation instructions, then download and load the checkpoint:

from huggingface_hub import hf_hub_download
import torch

ckpt_path = hf_hub_download(
    repo_id="huajian-zeng/gmt-adt",
    filename="adt.pth",
)
# Load onto the CPU first; move the model to a GPU once the weights are in place.
checkpoint = torch.load(ckpt_path, map_location="cpu")
# Build the model exactly as in scripts/eval.py, then restore the weights:
# model.load_state_dict(checkpoint["model_state_dict"])
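
Before calling `load_state_dict`, it can help to check what the checkpoint contains and whether its keys match the model. The following is a minimal sketch, not part of the GMT codebase: `summarize_state_dict` and `compare_keys` are hypothetical helper names, and the code relies only on each stored value exposing a `.shape` attribute, as PyTorch tensors do.

```python
import math


def summarize_state_dict(state_dict):
    """Return (number of tensors, total number of elements) for a state dict.

    Only the `.shape` attribute of each value is used, so this works for
    PyTorch tensors and for any array-like object that exposes a shape.
    """
    num_tensors = len(state_dict)
    num_elements = sum(math.prod(t.shape) for t in state_dict.values())
    return num_tensors, num_elements


def compare_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, each sorted.

    missing:    keys the model expects but the checkpoint lacks.
    unexpected: keys in the checkpoint that the model does not define.
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected
```

With the checkpoint loaded as above, `summarize_state_dict(checkpoint["model_state_dict"])` gives a quick size check, and `compare_keys(model.state_dict().keys(), checkpoint["model_state_dict"].keys())` lists any mismatch. Both lists should be empty when the model is built with the same configuration as the checkpoint; otherwise `load_state_dict` raises an error under its default strict mode.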

Or for an end-to-end demo:

git clone https://github.com/huajian-zeng/gmt
cd gmt
bash scripts/download_pretrained.sh   # downloads the checkpoint from this Hugging Face repo
bash demo.sh

The processed ADT trajectory cache used by `demo.sh` is hosted separately on Hugging Face at huajian-zeng/gmt-adt-cache.

Citation

@inproceedings{zeng2026gmt,
  title     = {{GMT}: Goal-Conditioned Multimodal Transformer for 6-DOF Object Trajectory Synthesis in 3D Scenes},
  author    = {Zeng, Huajian and Saroha, Abhishek and Cremers, Daniel and Wang, Xi},
  booktitle = {International Conference on 3D Vision (3DV)},
  year      = {2026},
}

