SAGE Topology Policy - Local (Qwen3-4B + LoRA)

A LoRA adapter for Qwen3-4B, trained locally on an RTX 3500 Ada (12 GB) for the YGN-SAGE multi-agent orchestration framework.

What it does

Generates multi-agent DAG topologies in <tool_call> JSON format. Given a coding task, the model decides:

  • How many agent nodes (1-7)
  • What role per node (coder, reviewer, planner, synthesizer...)
  • What model tier (budget, fast, balanced, reasoner, codex)
  • How to connect them (message, control, state edges)
  • Where to place adaptation checkpoints

Format

The model outputs <tool_call> JSON (Qwen3 native format):

<tool_call>
{"name": "create_topology", "arguments": {
  "difficulty": "moderate",
  "reasoning": "Multi-step code task needs coder + reviewer",
  "nodes": [
    {"role": "coder", "model_tier": "codex", "prompt": "..."},
    {"role": "reviewer", "model_tier": "fast", "prompt": "..."}
  ],
  "edges": [{"from_idx": 0, "to_idx": 1, "flow_type": "message"}]
}}
</tool_call>
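Before executing a generated topology, a consumer will probably want to sanity-check it. The sketch below is one way to do that, assuming the constraints described above (1 to 7 nodes, the five model tiers, the three edge flow types) and assuming that all edges together must form a DAG. `validate_topology` is a hypothetical helper written for illustration, not part of YGN-SAGE.

```python
from collections import deque

# Values taken from the lists in this card; the real schema may allow more.
ALLOWED_TIERS = {"budget", "fast", "balanced", "reasoner", "codex"}
ALLOWED_FLOWS = {"message", "control", "state"}


def validate_topology(args: dict) -> list[str]:
    """Return a list of problems; an empty list means no problem was found."""
    errors = []
    nodes = args.get("nodes", [])
    edges = args.get("edges", [])
    n = len(nodes)

    if not 1 <= n <= 7:
        errors.append(f"expected 1-7 nodes, got {n}")
    for i, node in enumerate(nodes):
        tier = node.get("model_tier")
        if tier not in ALLOWED_TIERS:
            errors.append(f"node {i}: unknown model_tier {tier!r}")

    # Build adjacency lists while checking every edge.
    indegree = [0] * n
    children = [[] for _ in range(n)]
    for edge in edges:
        src, dst = edge.get("from_idx"), edge.get("to_idx")
        in_range = (
            isinstance(src, int) and isinstance(dst, int)
            and 0 <= src < n and 0 <= dst < n
        )
        if not in_range:
            errors.append(f"edge {src}->{dst}: index out of range")
            continue
        if edge.get("flow_type") not in ALLOWED_FLOWS:
            errors.append(f"edge {src}->{dst}: unknown flow_type")
        children[src].append(dst)
        indegree[dst] += 1

    # Kahn's algorithm: if some node is never reached, there is a cycle.
    queue = deque(i for i in range(n) if indegree[i] == 0)
    visited = 0
    while queue:
        current = queue.popleft()
        visited += 1
        for child in children[current]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if visited != n:
        errors.append("edges contain a cycle")
    return errors
```

Calling `validate_topology(call["arguments"])` on the example above should return an empty list; adding an edge from node 1 back to node 0 would be reported as a cycle.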

2 SAGE Tools

  1. create_topology: Design a multi-agent DAG (Phase A/B)
  2. adapt_topology: Runtime adaptation decisions: continue/upgrade/reroute (Phase C)
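A caller has to route each parsed tool call to the right handler. The following is a minimal sketch of such a dispatcher. The tool names and the three adaptation decisions come from the list above; the `decision` argument key, the `AdaptDecision` enum, and `dispatch_tool_call` are assumptions made for illustration, since the actual argument schema lives in sage_tool_schemas.py.

```python
from enum import Enum


class AdaptDecision(Enum):
    """The three Phase C decisions named in this card."""
    CONTINUE = "continue"
    UPGRADE = "upgrade"
    REROUTE = "reroute"


def dispatch_tool_call(call: dict) -> str:
    """Route a parsed tool call by name and return a short description."""
    name = call.get("name")
    args = call.get("arguments", {})
    if name == "create_topology":
        node_count = len(args.get("nodes", []))
        return f"create: {node_count} node(s)"
    if name == "adapt_topology":
        # "decision" is a hypothetical key; check the real schema before use.
        decision = AdaptDecision(args["decision"])
        return f"adapt: {decision.value}"
    raise ValueError(f"unknown SAGE tool: {name!r}")
```

Raising on unknown names keeps a malformed generation from silently falling through to a default action.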

Results

Metric             YAML Baseline   Tool-Call SFT
N1 avg reward      0.391           0.865 (+121%)
N1 max reward      0.987           1.024
P(reward > 0.3)    26%             90%
Simple             0.567           0.780
Moderate           0.441           0.949
Complex            0.148           0.837 (+466%)
SFT loss           0.92            0.225 (4.1x better)
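For readers who want to check how the parenthesized figures relate to the raw numbers, here is a small worked calculation. It assumes the percentages are relative gains over the YAML baseline and that "4.1x better" is the ratio of the two losses; `relative_gain` is a helper name introduced here.

```python
def relative_gain(baseline: float, new: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# N1 average reward: YAML baseline vs. tool-call SFT.
avg_gain = relative_gain(0.391, 0.865)

# SFT loss is lower-is-better, so the card reports a ratio instead.
loss_ratio = 0.92 / 0.225

print(f"avg reward gain: {avg_gain:+.0f}%")   # +121%
print(f"loss ratio: {loss_ratio:.1f}x")       # 4.1x
```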

Training Details

  • Base model: Qwen/Qwen3-4B (4-bit NF4 quantization)
  • Method: LoRA (rank 32, alpha 64) via TRL SFTTrainer
  • Data: 1880 tool-call topologies from GPT-5.4 distillation
  • Format: <tool_call> JSON with 2 SAGE tool definitions in system prompt
  • Hardware: RTX 3500 Ada 12GB, Windows, ~65 min training
  • Epochs: 2, lr=2e-5, cosine schedule
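The hyperparameters above map onto PEFT and TRL configuration objects roughly as follows. This is a sketch under stated assumptions, not the training script used for this adapter: only the values listed above are filled in, everything else (target modules, batch size, dropout, sequence length) is left at library defaults because the card does not state it, and `output_dir` is a placeholder.

```python
# Values stated in this card.
LORA_KWARGS = {"r": 32, "lora_alpha": 64, "task_type": "CAUSAL_LM"}
SFT_KWARGS = {
    "num_train_epochs": 2,
    "learning_rate": 2e-5,
    "lr_scheduler_type": "cosine",
}


def lora_scaling(r: int, lora_alpha: int) -> float:
    """Standard LoRA scales the low-rank update by alpha / r."""
    return lora_alpha / r


def build_configs(output_dir: str = "sage-topology-sft"):
    """Build (LoraConfig, SFTConfig); requires `peft` and `trl` installed.

    `output_dir` is a placeholder path, not the one used by the author.
    """
    from peft import LoraConfig
    from trl import SFTConfig

    lora_config = LoraConfig(**LORA_KWARGS)
    sft_config = SFTConfig(output_dir=output_dir, **SFT_KWARGS)
    return lora_config, sft_config


print(lora_scaling(LORA_KWARGS["r"], LORA_KWARGS["lora_alpha"]))  # 2.0
```

The two objects would then be passed to `SFTTrainer` as `peft_config` and `args`, together with the 4-bit base model and the training dataset.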

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit NF4, matching the quantization used for training.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B",
    quantization_config=bnb, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B")
# Attach the LoRA adapter from this repo on top of the base model.
model = PeftModel.from_pretrained(model, "yannabadie/sage-topology-policy-local")

# Replace the system placeholder with the system prompt from sage_tool_schemas.py.
messages = [
    {"role": "system", "content": "<system prompt with tool definitions>"},
    {"role": "user", "content": "Write a function that sorts a list using merge sort."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=1024, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:]))
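The decoded string still has to be turned into a Python object before it can be used. The sketch below shows one way to pull `<tool_call>` blocks out of generated text, assuming each block wraps a single JSON object as in the Format section; `extract_tool_calls` is a hypothetical helper, and blocks that fail to parse are skipped rather than repaired.

```python
import json
import re

# Non-greedy match so several tool calls in one generation are kept separate.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed JSON tool call found in `text`."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            # Malformed or truncated generation: skip this block.
            continue
    return calls
```

With sampling enabled (temperature 0.7), an occasional malformed block is plausible, so checking for an empty result and regenerating is a reasonable fallback.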

Training Data

Included in this repo:

  • topology_sft_v2_toolcall.jsonl: 1880 SFT examples
  • adapt_decisions_toolcall.jsonl: 5139 adaptation decisions for Phase C
  • expert_topologies.jsonl: 8 Claude Opus 4.6 distilled examples
  • sage_tool_schemas.py: Tool definitions + system prompt
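The .jsonl files can be read with the standard library once downloaded. The sketch below is a generic JSON Lines reader; `read_jsonl` is a name introduced here, and since the card does not document the record schema, it makes no assumption about field names.

```python
import json
from pathlib import Path


def read_jsonl(path) -> list:
    """Parse a JSON Lines file into a list, one record per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                records.append(json.loads(line))
    return records
```

A local copy of a file can be obtained with `huggingface_hub.hf_hub_download(repo_id="yannabadie/sage-topology-policy-local", filename="topology_sft_v2_toolcall.jsonl")`, which returns the cached file path. Inspecting the first record before writing any processing code is a sensible first step.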

Part of YGN-SAGE

GitHub | PyPI | MIT License
