ARCHON Inference Turbo

12 modules designed to make ARCHON 282M perform like a 7B+ model at inference time.

Full plan: ../docs/ARCHON_INFERENCE_TURBO_PLAN.md

Phase A modules (drop-in, deployable immediately after SFT v2)

  • m7_dola.py — Contrastive decoding between layers L6 and L18
  • m9_xgrammar.py — Logit-processor wrapper for structured output
  • m11_react_tools.py — ReAct loop + 70-tool registry (ported from NEXUS)
  • m6_snapkv.py — KV cache compression during the prefill phase
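
As an illustration of the contrastive decoding idea behind m7_dola.py, here is a minimal pure-Python sketch. It is not the module's actual code: it assumes a fixed pair of layers (an early one and a late one, standing in for L6 and L18), assumes both layers' hidden states have already been projected to vocabulary logits, and uses a hypothetical `alpha` plausibility threshold. The function names are made up for this example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def dola_scores(early_logits, late_logits, alpha=0.1):
    """Score tokens by log p_late - log p_early.

    Tokens whose late-layer probability is below alpha * (max late-layer
    probability) are masked to -inf so that implausible tokens cannot win
    just because the early layer disliked them even more.
    alpha is an assumed hyperparameter, not a value taken from ARCHON.
    """
    lp_early = log_softmax(early_logits)
    lp_late = log_softmax(late_logits)
    cutoff = max(lp_late) + math.log(alpha)
    return [
        (late - early) if late >= cutoff else float("-inf")
        for early, late in zip(lp_early, lp_late)
    ]


def pick_token(early_logits, late_logits, alpha=0.1):
    """Greedy choice over the contrastive scores."""
    scores = dola_scores(early_logits, late_logits, alpha)
    return max(range(len(scores)), key=scores.__getitem__)
```

With `early_logits = [2.0, 1.0, 0.0]` and `late_logits = [2.0, 1.9, -5.0]`, plain greedy decoding on the late layer picks token 0, while the contrastive score favours token 1, whose probability grew the most between the two layers. Token 2 is masked by the plausibility cutoff.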

Phase B modules (engine v1, GPU instance required)

  • m1_mtp_self_spec.py — MTP-5 self-speculative decoding
  • m10_nvfp4_quant.py — NVFP4 PTQ Blackwell
  • m8_graphrag.py — HippoRAG2 federated retrieval
  • m5_prm_bestof_n.py — PRM 50M + best-of-N
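
The verification step of self-speculative decoding (m1_mtp_self_spec.py) can be sketched as follows. This is a simplified greedy variant under stated assumptions, not the module's implementation: the draft tokens are assumed to come from cheap multi-token-prediction heads, the full model is assumed to have scored all draft positions in one forward pass, and `verify_draft` is a hypothetical name.

```python
def verify_draft(draft_tokens, verifier_tokens):
    """Greedy speculative verification.

    draft_tokens: k tokens proposed cheaply (for example by MTP heads).
    verifier_tokens: k + 1 greedy tokens from the full model, where
        verifier_tokens[i] is the model's choice given the context
        followed by draft_tokens[:i].

    Returns the tokens to commit: the longest prefix on which draft and
    verifier agree, plus one verifier token (the correction at the first
    mismatch, or a bonus token if the whole draft was accepted).
    """
    if len(verifier_tokens) != len(draft_tokens) + 1:
        raise ValueError("verifier must score one more position than the draft")

    accepted = []
    for i, tok in enumerate(draft_tokens):
        if tok != verifier_tokens[i]:
            break
        accepted.append(tok)

    # Verifier tokens past the first mismatch were conditioned on rejected
    # draft tokens, so only the one at the mismatch position is kept.
    accepted.append(verifier_tokens[len(accepted)])
    return accepted
```

Each call commits at least one token, and the committed sequence matches what greedy decoding with the full model alone would have produced. The speed-up comes from committing several tokens per full-model pass whenever the draft is good.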

Phase C modules (training required)

  • m2_extended_thinking.py — Thinking budget loop
  • m3_coconut.py — Latent continuous thought
  • m12_ttt_e2e.py — Test-time training LoRA
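
A thinking budget loop such as the one named for m2_extended_thinking.py could look roughly like the sketch below. Everything here is an assumption made for illustration: the `<think>`, `</think>` and `<eos>` markers, the `next_token` callable standing in for one model decoding step, and the `toy_model` used to exercise the loop.

```python
def generate_with_thinking_budget(next_token, prompt, budget,
                                  max_answer=32,
                                  open_think="<think>",
                                  end_think="</think>",
                                  eos="<eos>"):
    """Let the model reason for at most `budget` tokens, then answer.

    next_token: callable mapping the token list so far to the next token
        (a stand-in for one decoding step of the real model).
    """
    tokens = list(prompt) + [open_think]

    used = 0
    while used < budget:
        tok = next_token(tokens)
        tokens.append(tok)
        if tok == end_think:
            break  # the model closed its reasoning on its own
        used += 1
    else:
        # Budget exhausted without a closing marker: force it.
        tokens.append(end_think)

    for _ in range(max_answer):
        tok = next_token(tokens)
        tokens.append(tok)
        if tok == eos:
            break
    return tokens


def toy_model(tokens):
    """Hypothetical model: reasons forever, then answers "42"."""
    if "</think>" not in tokens:
        return "step"
    return "<eos>" if tokens[-1] == "42" else "42"
```

Calling `generate_with_thinking_budget(toy_model, ["Q"], budget=3)` returns `["Q", "<think>", "step", "step", "step", "</think>", "42", "<eos>"]`: the toy model never stops reasoning by itself, so the loop closes the thinking block once the budget is spent and then lets the answer phase run.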

Phase D module (architecture modification)

  • m4_mor.py — Mixture of Recursions router

Engine

  • engine.py — Full orchestrated pipeline
  • bench.py — Benchmark vs Qwen2.5-7B, Phi-4, Llama 3.1 8B

Code status

Phase A skeletons were coded on 2026-06-05. They still need to be wired to the loaded ARCHON model once SFT v2 finishes.
