Chengheng committed on
Commit
a0d501e
·
verified ·
1 Parent(s): 31868c9

Upload README.md with huggingface_hub

  1. README.md +20 -0
README.md ADDED
@@ -0,0 +1,20 @@
+ ---
+ license: mit
+ library_name: pytorch
+ tags:
+ - coconut
+ - latent-chain-of-thought
+ - sparse-autoencoder
+ - qwen3
+ ---
+
+ # SAE on Coconut latent thoughts (Qwen3-1.7B)
+
+ SAE bottleneck trained on the **frozen** Stage-3 Coconut latent thoughts of Qwen3-1.7B,
+ from the *VQ-CoT: Discretising Latent Chain-of-Thought* project (team RateLimit Achieved, EPFL CS-552).
+ The language model is frozen; only this bottleneck is trained on the dumped latents
+ (385,620 x 6 latent thought vectors, GSM8K).
+
+ - Checkpoint: `sae.pt`
+ - TopK SAE, dictionary 8x hidden, k=32 active features
+ - Result: inserting this bottleneck into the frozen latent loop costs ~0 GSM8K test accuracy.
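
The README does not show the SAE code, so the following is only a minimal sketch of what a TopK SAE with an 8x dictionary and k=32 could look like in PyTorch. The class name `TopKSAE`, the helper `bottleneck`, the layer names, and the ReLU on the kept activations are all assumptions made for illustration; they are not taken from the project, and the layout of `sae.pt` may differ. `d_model` should be set to the language model's hidden size (check the Qwen3-1.7B config); the demo uses a small value to stay lightweight.

```python
import torch
import torch.nn as nn


class TopKSAE(nn.Module):
    """Hypothetical TopK sparse autoencoder (illustrative, not the project's code)."""

    def __init__(self, d_model: int, expansion: int = 8, k: int = 32):
        super().__init__()
        self.k = k
        # Dictionary size is `expansion` times the hidden size (8x per the README).
        self.encoder = nn.Linear(d_model, expansion * d_model)
        self.decoder = nn.Linear(expansion * d_model, d_model)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        pre = self.encoder(x)
        # Keep the k largest pre-activations per vector and zero out the rest.
        top = torch.topk(pre, self.k, dim=-1)
        codes = torch.zeros_like(pre)
        # ReLU on the kept values is an assumption; it makes the codes non-negative.
        codes.scatter_(-1, top.indices, torch.relu(top.values))
        return codes

    def forward(self, x: torch.Tensor):
        codes = self.encode(x)
        return self.decoder(codes), codes


def bottleneck(sae: TopKSAE, latent: torch.Tensor) -> torch.Tensor:
    """Replace a latent thought with its SAE reconstruction, without gradients."""
    with torch.no_grad():
        recon, _ = sae(latent)
    return recon


# Demo with a small hidden size; 6 latent thoughts per example as in the README.
torch.manual_seed(0)
demo_sae = TopKSAE(d_model=64, expansion=8, k=32)
demo_latents = torch.randn(4, 6, 64)  # (batch, latent steps, hidden)
demo_recon, demo_codes = demo_sae(demo_latents)
demo_swapped = bottleneck(demo_sae, demo_latents)
```

In a frozen latent loop, a function like `bottleneck` would be applied to each latent thought before it is fed back to the language model, which is the setting in which the README reports roughly no loss in GSM8K test accuracy.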