ARCHON Inference Turbo
12 modules intended to make ARCHON 282M perform like a 7B+ model at inference time.
Full plan: ../docs/ARCHON_INFERENCE_TURBO_PLAN.md
Phase A modules (drop-in, deployable immediately after SFT v2)
m7_dola.py: Contrastive decoding between layers L6 and L18
m9_xgrammar.py: Logit-processor wrapper for structured output
m11_react_tools.py: ReAct loop + registry of 70 tools (ported from NEXUS)
m6_snapkv.py: KV cache compression during the prefill phase
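As a rough illustration of the contrastive decoding idea behind m7_dola.py, here is a minimal pure-Python sketch in the style of DoLa. It is not the module's actual code: the function names, the `alpha` plausibility threshold, and the assumption that per-layer logits are obtained by applying the LM head to the hidden states of an early layer (L6) and a late layer (L18) are all hypothetical.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def dola_contrast(early_logits, late_logits, alpha=0.1):
    """Score each token by log p_late - log p_early.

    Tokens whose late-layer probability is below alpha * max(p_late)
    are masked out (plausibility constraint), so that unlikely tokens
    cannot win only because the early layer dislikes them even more.
    alpha=0.1 is an illustrative default, not a tuned value.
    """
    late_lp = log_softmax(late_logits)
    early_lp = log_softmax(early_logits)
    cutoff = max(late_lp) + math.log(alpha)
    return [
        (l - e) if l >= cutoff else float("-inf")
        for l, e in zip(late_lp, early_lp)
    ]

def pick_token(early_logits, late_logits, alpha=0.1):
    """Greedy choice over the contrastive scores."""
    scores = dola_contrast(early_logits, late_logits, alpha)
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch a token that the late layer rates as highly as another, but that the early layer rated lower, is preferred: the contrast rewards what the deeper layers added.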
Phase B modules (engine v1, GPU instance required)
m1_mtp_self_spec.py: MTP-5 self-speculative decoding
m10_nvfp4_quant.py: NVFP4 PTQ (Blackwell)
m8_graphrag.py: HippoRAG2 federated retrieval
m5_prm_bestof_n.py: PRM 50M + best-of-N
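The best-of-N selection named for m5_prm_bestof_n.py can be sketched as follows. This is only an illustration under stated assumptions: `score_steps` stands in for the process reward model (PRM) and is a hypothetical callable returning one score per reasoning step, and aggregating by the minimum step score is one common choice, not necessarily the one the module uses.

```python
from typing import Callable, List, Sequence, Tuple

Steps = List[str]

def aggregate_min(step_scores: Sequence[float]) -> float:
    """Rate a candidate by its weakest step (assumed aggregation rule)."""
    return min(step_scores) if step_scores else float("-inf")

def best_of_n(
    candidates: Sequence[Steps],
    score_steps: Callable[[Steps], Sequence[float]],
    aggregate: Callable[[Sequence[float]], float] = aggregate_min,
) -> Tuple[Steps, float]:
    """Return the highest-scoring candidate and its aggregated score.

    candidates  : N sampled solutions, each a list of reasoning steps
    score_steps : hypothetical PRM interface, one score per step
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    totals = [aggregate(score_steps(c)) for c in candidates]
    # On ties, max() keeps the first candidate with the top score.
    best = max(range(len(totals)), key=totals.__getitem__)
    return candidates[best], totals[best]
```

Taking the minimum means a single poorly rated step sinks the whole candidate, which matches the intuition that one faulty step invalidates a chain of reasoning.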
Phase C modules (training required)
m2_extended_thinking.py: Thinking budget loop
m3_coconut.py: Latent continuous thought
m12_ttt_e2e.py: Test-time training LoRA
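A thinking budget loop such as the one named for m2_extended_thinking.py might look like the sketch below. Everything here is assumed for illustration: `next_token` is a hypothetical callable standing in for one decoding step of the model, and `"</think>"` is a placeholder end-of-thinking marker, not a confirmed ARCHON token.

```python
from typing import Callable, List

def think_with_budget(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    budget: int,
    end_think: str = "</think>",
) -> List[str]:
    """Generate thinking tokens until the model closes the thinking
    span itself or the budget is spent, then force the closing marker.

    next_token : hypothetical single-step decoder (context -> token)
    budget     : maximum number of thinking tokens to generate
    """
    out = list(prompt)
    for _ in range(budget):
        tok = next_token(out)
        out.append(tok)
        if tok == end_think:
            return out  # the model ended its reasoning on its own
    out.append(end_think)  # budget exhausted: close the span by force
    return out
```

Forcing the marker when the budget runs out keeps the output well-formed, so the answer phase can start from a closed thinking span either way.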
Phase D module (architecture change)
m4_mor.py: Mixture of Recursions router
Engine
engine.py: Full orchestrated pipeline
bench.py: Benchmark vs Qwen2.5-7B, Phi-4, Llama 3.1 8B
Code status
Phase A skeletons coded 2026-06-05. Remaining work: wire them into the loaded ARCHON model once SFT v2 finishes.