GMT: Goal-Conditioned Multimodal Transformer for 6-DOF Object Trajectory Synthesis in 3D Scenes

Pretrained checkpoint for GMT trained on the ADT dataset.

Files

File               Description
adt.pth            PyTorch checkpoint ({'model_state_dict': ...}) trained on ADT
val_sequences.txt  List of ADT sequences used for validation
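
To reproduce the validation split, the sequence list can be read into Python. This is a minimal sketch that assumes val_sequences.txt holds one sequence name per line; the file's format is not documented here, and read_sequence_list is a hypothetical helper, not part of the GMT repo.

```python
from pathlib import Path


def read_sequence_list(path):
    """Return the non-empty lines of a text file as a list of names.

    Assumption: one sequence name per line (not confirmed by the model card).
    """
    names = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:  # skip blank lines
            names.append(line)
    return names
```

The path can come from hf_hub_download with filename="val_sequences.txt", in the same way as the checkpoint download under Usage.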

Usage

Clone the GMT repo and follow its installation instructions, then download and load the checkpoint:

from huggingface_hub import hf_hub_download
import torch

ckpt_path = hf_hub_download(
    repo_id="huajian-zeng/gmt-adt",
    filename="adt.pth",
)
checkpoint = torch.load(ckpt_path, map_location="cpu")
# Build the model exactly as in scripts/eval.py and then:
# model.load_state_dict(checkpoint["model_state_dict"])
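
Before calling load_state_dict, it can help to confirm that the loaded object has the layout listed under Files. This is a minimal sketch, assuming the checkpoint is a dict with a model_state_dict entry. The helper name get_state_dict is hypothetical, and the "module." prefix handling is an assumption covering weights saved from a DataParallel-style wrapper; neither comes from the GMT repo.

```python
def get_state_dict(checkpoint, key="model_state_dict"):
    """Return the weights mapping from a loaded checkpoint (hypothetical helper)."""
    if not isinstance(checkpoint, dict):
        raise TypeError(
            f"expected a dict checkpoint, got {type(checkpoint).__name__}"
        )
    if key not in checkpoint:
        raise KeyError(f"{key!r} not found; available keys: {list(checkpoint)}")

    state_dict = checkpoint[key]

    # Assumption: strip a leading "module." in case the weights were saved
    # from a DataParallel / DistributedDataParallel wrapper.
    prefix = "module."
    return {
        (name[len(prefix):] if name.startswith(prefix) else name): value
        for name, value in state_dict.items()
    }
```

With the model built as in scripts/eval.py, the result can be passed on as model.load_state_dict(get_state_dict(checkpoint)).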

Or for an end-to-end demo:

git clone https://github.com/huajian-zeng/gmt
cd gmt
bash scripts/download_pretrained.sh   # now pulls from this HF repo
bash demo.sh

The processed ADT trajectory cache used by demo.sh lives at huajian-zeng/gmt-adt-cache.

Citation

@inproceedings{zeng2026gmt,
  title     = {{GMT}: Goal-Conditioned Multimodal Transformer for 6-DOF Object Trajectory Synthesis in 3D Scenes},
  author    = {Zeng, Huajian and Saroha, Abhishek and Cremers, Daniel and Wang, Xi},
  booktitle = {International Conference on 3D Vision (3DV)},
  year      = {2026},
}