---
license: mit
library_name: pytorch
tags:
- coconut
- latent-chain-of-thought
- sparse-autoencoder
- qwen3
---
# SAE on Coconut latent thoughts (Qwen3-1.7B)
A sparse autoencoder (SAE) bottleneck trained on the **frozen** Stage-3 Coconut latent thoughts of Qwen3-1.7B,
from the *VQ-CoT: Discretising Latent Chain-of-Thought* project (team RateLimit Achieved, EPFL CS-552).
The language model is frozen; only this bottleneck is trained on the dumped latents
(385,620 x 6 latent thought vectors, GSM8K).
- Checkpoint: `sae.pt`
- TopK SAE, dictionary 8x hidden, k=32 active features
- Result: inserting this bottleneck into the frozen latent loop costs ~0 GSM8K test accuracy.
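
For readers unfamiliar with the TopK design listed above, here is a minimal sketch of such a bottleneck in PyTorch. It is an illustration under assumptions, not the project's code: the class name `TopKSAE`, the plain linear encoder/decoder, and the toy hidden size of 64 are all hypothetical, and the layout of `sae.pt` is not documented here, so the sketch does not attempt to load it. Only the 8x dictionary expansion and k=32 come from this card.

```python
import torch
import torch.nn as nn


class TopKSAE(nn.Module):
    """Hypothetical TopK sparse autoencoder (illustrative, not the released model)."""

    def __init__(self, d_hidden: int, expansion: int = 8, k: int = 32):
        super().__init__()
        self.k = k
        # Dictionary is `expansion` times wider than the latent thought vector.
        self.encoder = nn.Linear(d_hidden, expansion * d_hidden)
        self.decoder = nn.Linear(expansion * d_hidden, d_hidden)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        pre = self.encoder(x)
        # Keep only the k largest pre-activations per vector; zero the rest.
        topk = torch.topk(pre, self.k, dim=-1)
        codes = torch.zeros_like(pre)
        codes.scatter_(-1, topk.indices, torch.relu(topk.values))
        return codes

    def forward(self, x: torch.Tensor):
        codes = self.encode(x)
        # The reconstruction is what would be fed back into the latent loop.
        return self.decoder(codes), codes


torch.manual_seed(0)
sae = TopKSAE(d_hidden=64)          # toy size, assumed for the example
latents = torch.randn(4, 6, 64)     # (batch, latent thoughts, hidden)
recon, codes = sae(latents)
```

In this sketch each latent thought vector is reconstructed from at most 32 active dictionary features, and the reconstruction has the same shape as the input, so it can stand in for the original latent.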