SocialJax Harvest 512-code tokenizer

Tokenizer used to produce the ParoleLM/socialjax-harvest-frame-aligned-512 dataset.

  • Repository: ParoleLM/socialjax-harvest-tokenizer-512
  • Training step: 5,000
  • Input: categorical SocialJax grids [B,T,16,22]
  • Window: 16 frames
  • Patch size: 2x2
  • Codes per frame: 88
  • Codebook size: 512
  • Latent dimension: 32
  • Exported inference checkpoint size: 270.7 MB
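The codes-per-frame figure follows from the grid and patch sizes listed above. A minimal sketch of the arithmetic, assuming non-overlapping 2x2 patches that tile the 16x22 grid exactly and no temporal downsampling inside a window (the variable names are illustrative and not part of the tokenizer's API):

```python
import math

grid_h, grid_w = 16, 22   # spatial size of one SocialJax grid
patch = 2                 # 2x2 patches, assumed non-overlapping
window = 16               # frames per window
codebook_size = 512

# One code per patch: 8 rows of patches times 11 columns of patches.
codes_per_frame = (grid_h // patch) * (grid_w // patch)

# Assumes every frame in the window keeps its own set of codes.
codes_per_window = codes_per_frame * window

# Information carried by one code index, in bits.
bits_per_code = math.log2(codebook_size)

print(codes_per_frame)   # 88
print(codes_per_window)  # 1408
print(bits_per_code)     # 9.0
```

Under these assumptions, a full 16-frame window corresponds to 1408 code indices of 9 bits each.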

tokenizer.pt contains the exact model_state_dict used to produce the published dataset, without optimizer state. config.json and the relevant source files are included so the model can be loaded reproducibly. To load the tokenizer:

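import json
import torch

# Assumes the source files shipped with this repository are on the import path.
from src.architectures.genie.tokenizers.video_tokeniser import VideoTokenizer

# Read the model and data settings shipped with the checkpoint.
with open("config.json") as f:
    config = json.load(f)

# Build the tokenizer with the stored hyperparameters and window length.
model = VideoTokenizer(**config["model"], window=config["data"]["window"])

# Load the weights on CPU; they are stored under "model_state_dict".
checkpoint = torch.load("tokenizer.pt", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()  # inference mode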
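To sanity-check a downloaded checkpoint, it can help to list its top-level keys and estimate the size of the stored tensors. The helper below is a hypothetical sketch, not part of the repository: `summarize_state_dict` is an illustrative name, and it relies only on the `numel()` and `element_size()` methods that PyTorch tensors provide. The result is only an estimate of the file size on disk, since serialization overhead is not counted.

```python
def summarize_state_dict(state_dict):
    """Return (total tensor elements, size in MB) for a name -> tensor mapping."""
    n_elements = sum(t.numel() for t in state_dict.values())
    n_bytes = sum(t.numel() * t.element_size() for t in state_dict.values())
    return n_elements, n_bytes / 1e6


# Usage with the checkpoint loaded above (left commented out because it
# needs tokenizer.pt on disk):
# print(sorted(checkpoint))  # should include "model_state_dict"
# n_elements, size_mb = summarize_state_dict(checkpoint["model_state_dict"])
# print(f"{n_elements:,} elements, about {size_mb:.1f} MB of tensor data")
```

The count covers every tensor in the state dict, including any non-trainable buffers, so it may differ slightly from a parameter count taken from the model itself.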