---
license: mit
library_name: pytorch
tags:
- coconut
- latent-chain-of-thought
- vq-vae
- qwen3
---

# VQ-VAE on Coconut latent thoughts (Qwen3-0.6B)

A VQ-VAE bottleneck trained on the Stage-3 Coconut latent thoughts of Qwen3-0.6B,
from the *VQ-CoT: Discretising Latent Chain-of-Thought* project (team RateLimit Achieved, EPFL CS-552).
The language model stays **frozen** throughout; only this bottleneck is trained, on latents dumped
from GSM8K (385,620 × 6 latent thought vectors).
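To give a sense of scale, the short calculation below works out the size of the training set and the information rate of the discrete bottleneck. It assumes that "385,620 × 6" means 385,620 examples with 6 latent thoughts each, and it uses the K = 4096 codebook size listed for this checkpoint; the variable names are illustrative only.

```python
import math

# Assumption: 385,620 examples, each with 6 latent thought vectors.
num_examples = 385_620
thoughts_per_example = 6
num_codes = 4096  # codebook size K of this checkpoint

# Total number of latent vectors the bottleneck is trained on.
total_vectors = num_examples * thoughts_per_example

# One code index out of K carries log2(K) bits.
bits_per_thought = math.log2(num_codes)
bits_per_example = thoughts_per_example * bits_per_thought

print(total_vectors)     # 2313720
print(bits_per_thought)  # 12.0
print(bits_per_example)  # 72.0
```

Under these assumptions, each quantised latent thought is a single 12-bit index, so the full latent chain for one example fits in 72 bits.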

- Checkpoint: `vq.pt`
- K = 4096 codes, code dimension 64, EMA codebook updates (decay 0.999), k-means initialisation plus autoencoder (AE) warmup
- Result: inserting this bottleneck into the frozen latent loop leaves GSM8K test accuracy essentially unchanged (a cost of ~0).
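For readers unfamiliar with this kind of bottleneck, here is a minimal sketch of a vector-quantisation layer with EMA codebook updates, sized to the hyperparameters above. It is not the project's implementation: the class name `VQBottleneck`, the single linear encoder and decoder, and the hidden size of 1024 (assumed for Qwen3-0.6B) are all assumptions, and the k-means initialisation and AE warmup are omitted. The layout of `vq.pt` is not documented here, so the sketch does not attempt to load it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VQBottleneck(nn.Module):
    """Illustrative sketch: project down, snap to the nearest code, project back up."""

    def __init__(self, hidden_dim=1024, code_dim=64, num_codes=4096, decay=0.999):
        super().__init__()
        self.decay = decay
        self.encoder = nn.Linear(hidden_dim, code_dim)  # hypothetical encoder
        self.decoder = nn.Linear(code_dim, hidden_dim)  # hypothetical decoder
        # Codebook and EMA statistics are buffers: updated by EMA, not by gradients.
        self.register_buffer("codebook", torch.randn(num_codes, code_dim))
        self.register_buffer("ema_count", torch.ones(num_codes))
        self.register_buffer("ema_sum", self.codebook.clone())

    def forward(self, h):
        z = self.encoder(h)                                 # (N, code_dim)
        idx = torch.cdist(z, self.codebook).argmin(dim=1)   # nearest code per vector
        q = self.codebook[idx]                              # (N, code_dim)
        if self.training:
            with torch.no_grad():
                onehot = F.one_hot(idx, self.codebook.shape[0]).type_as(z)
                # Exponential moving averages of code usage and assigned vectors.
                self.ema_count.mul_(self.decay).add_(onehot.sum(0), alpha=1 - self.decay)
                self.ema_sum.mul_(self.decay).add_(onehot.t() @ z, alpha=1 - self.decay)
                self.codebook.copy_(
                    self.ema_sum / self.ema_count.clamp_min(1e-5).unsqueeze(1)
                )
        commit_loss = F.mse_loss(z, q.detach())             # pulls encoder toward codes
        q_st = z + (q - z).detach()                         # straight-through estimator
        return self.decoder(q_st), idx, commit_loss


# Usage with random stand-ins for the 6 latent thoughts of one example.
vq = VQBottleneck().eval()
thoughts = torch.randn(6, 1024)
recon, codes, _ = vq(thoughts)
print(recon.shape, codes.shape)
```

In a latent loop of the Coconut kind, `recon` would stand in for the continuous thought fed back to the frozen model, while `codes` gives the discrete index of each thought.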