dervig committed on
Commit be72489 · verified · 1 Parent(s): 0548ddb

Add deployment pointer for 96 GB Mac

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -79,6 +79,10 @@ Final HumanEval and sanity numbers will be added when the current benchmark run
  **5 / 5 pass**. The model is fully usable in production.
 
+ ### Deploying on 96 GB Apple Silicon
+
+ The GGUF variants in the [companion repo](https://huggingface.co/dervig/m51Lab-MiniMax-M2.7-REAP-139B-A10B-GGUF) are the practical choice for 96 GB Mac Studio / M4 Max. That card contains an explicit memory & context sizing guide — **note that at long context, KV cache quantization (`--cache-type-k q8_0`) is essential for this architecture** (~0.25 GB of FP16 KV cache per 1K tokens across 62 layers).
+
  ## Known minor imperfection
 
  During integrity audit of the 62-layer bias-correction tensor fix, one layer (`layer 0`) had expert keep-indices that differed slightly from the REAP-retained set (86 of 154 positions). The magnitude of the resulting bias mismatch is bounded by the layer-0 bias natural variance (`max |Δ| = 0.75` on values in `[8.06, 8.88]`), so the impact on routing is negligible — confirmed by the 5/5 smoke test above. All other 61 layers are bit-perfect. Full analysis in the [reproducibility log](https://github.com/m51ai/m51Lab-MiniMax-M2.7-REAP/blob/main/docs/research_log.md).
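
Review note on the added "Deploying on 96 GB Apple Silicon" paragraph: the quoted figure (~0.25 GB of FP16 KV cache per 1K tokens) can be turned into a rough sizing estimate. The sketch below is illustrative only and is not taken from the companion repo's sizing guide. It assumes that "1K tokens" means 1000 tokens, that the K and V halves of the cache are equal in size, and that `q8_0` stores 32 values in 34 bytes (32 int8 values plus one FP16 scale). The function name `kv_cache_gb` and its parameters are hypothetical.

```python
# Rough KV cache sizing from the figure quoted in the README change.
# Assumptions (not from the source): 1K = 1000 tokens, K and V are equal
# in size, and q8_0 uses 34 bytes per block of 32 values.

FP16_GB_PER_1K_TOKENS = 0.25  # quoted in the README text, total for all 62 layers

# q8_0 block: 34 bytes for 32 values, versus 64 bytes for 32 FP16 values.
Q8_0_RATIO = 34 / 64


def kv_cache_gb(context_tokens: int,
                quantize_k: bool = False,
                quantize_v: bool = False) -> float:
    """Estimate KV cache size in GB for a given context length."""
    fp16_total = FP16_GB_PER_1K_TOKENS * context_tokens / 1000
    k_part = fp16_total / 2  # assumed equal K / V split
    v_part = fp16_total / 2
    if quantize_k:
        k_part *= Q8_0_RATIO
    if quantize_v:
        v_part *= Q8_0_RATIO
    return k_part + v_part


if __name__ == "__main__":
    for ctx in (8_000, 32_000, 128_000):
        fp16 = kv_cache_gb(ctx)
        k_only = kv_cache_gb(ctx, quantize_k=True)
        print(f"{ctx:>7} tokens: FP16 {fp16:.2f} GB, K as q8_0 {k_only:.2f} GB")
```

Under these assumptions, quantizing only K (which is what `--cache-type-k q8_0` covers) removes a little under a quarter of the cache. Quantizing V as well would require a separate setting, so the README's "essential" claim is worth checking against the sizing guide in the companion card.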