SAGE Topology Policy — Local (Qwen3-4B + LoRA)

Trained locally on an RTX 3500 Ada (12 GB) for the YGN-SAGE multi-agent orchestration framework.

What it does

Generates multi-agent DAG topologies in <tool_call> JSON format. Given a coding task, the model decides:

How many agent nodes (1-7)
What role per node (coder, reviewer, planner, synthesizer...)
What model tier (budget, fast, balanced, reasoner, codex)
How to connect them (message, control, state edges)
Where to place adaptation checkpoints

Format

The model outputs <tool_call> JSON (Qwen3 native format):

<tool_call>
{"name": "create_topology", "arguments": {
  "difficulty": "moderate",
  "reasoning": "Multi-step code task needs coder + reviewer",
  "nodes": [
    {"role": "coder", "model_tier": "codex", "prompt": "..."},
    {"role": "reviewer", "model_tier": "fast", "prompt": "..."}
  ],
  "edges": [{"from_idx": 0, "to_idx": 1, "flow_type": "message"}]
}}
</tool_call>
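Because the output is plain JSON, the arguments can be sanity-checked before a topology is instantiated. The following is a minimal sketch, not part of YGN-SAGE: `validate_topology` is a hypothetical helper, and the allowed tiers and flow types are simply the values listed in this card (the real framework may accept others).

```python
from collections import deque

# Values copied from the lists in this card; treating them as exhaustive
# is an assumption.
KNOWN_TIERS = {"budget", "fast", "balanced", "reasoner", "codex"}
KNOWN_FLOWS = {"message", "control", "state"}


def validate_topology(args: dict) -> list[str]:
    """Return the problems found in create_topology arguments (empty if none)."""
    problems = []
    nodes = args.get("nodes", [])
    edges = args.get("edges", [])

    if not 1 <= len(nodes) <= 7:
        problems.append(f"expected 1-7 nodes, got {len(nodes)}")
    for i, node in enumerate(nodes):
        if node.get("model_tier") not in KNOWN_TIERS:
            problems.append(f"node {i}: unknown model_tier {node.get('model_tier')!r}")

    indegree = [0] * len(nodes)
    successors = [[] for _ in nodes]
    for edge in edges:
        src, dst = edge.get("from_idx"), edge.get("to_idx")
        in_range = all(isinstance(x, int) and 0 <= x < len(nodes) for x in (src, dst))
        if not in_range:
            problems.append(f"edge {src}->{dst}: index out of range")
            continue
        if edge.get("flow_type") not in KNOWN_FLOWS:
            problems.append(f"edge {src}->{dst}: unknown flow_type {edge.get('flow_type')!r}")
        successors[src].append(dst)
        indegree[dst] += 1

    # Kahn's algorithm: if some node can never be removed, the edges form a cycle.
    queue = deque(i for i, degree in enumerate(indegree) if degree == 0)
    visited = 0
    while queue:
        current = queue.popleft()
        visited += 1
        for nxt in successors[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if visited != len(nodes):
        problems.append("edges contain a cycle, so the graph is not a DAG")
    return problems
```

An empty return value means the structure passed these checks; it says nothing about whether the topology is a good one for the task.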

2 SAGE Tools

create_topology — Design multi-agent DAG (Phase A/B)
adapt_topology — Runtime adaptation decisions: continue/upgrade/reroute (Phase C)
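With two tools, a caller needs to route each parsed call by its `name`. A minimal sketch of such a dispatcher follows; the handler functions and the `decision` key are hypothetical placeholders, since the argument schema of `adapt_topology` is not shown in this card.

```python
from typing import Any, Callable


def handle_create(arguments: dict) -> str:
    # Hypothetical handler: a real one would build the agent graph.
    return f"create: {len(arguments.get('nodes', []))} node(s)"


def handle_adapt(arguments: dict) -> str:
    # Hypothetical handler; "decision" is an assumed field name.
    return f"adapt: {arguments.get('decision', 'unknown')}"


HANDLERS: dict[str, Callable[[dict], Any]] = {
    "create_topology": handle_create,
    "adapt_topology": handle_adapt,
}


def dispatch(tool_call: dict) -> Any:
    """Route a parsed tool call to the handler registered for its name."""
    name = tool_call.get("name")
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"unknown tool: {name!r}")
    return handler(tool_call.get("arguments", {}))
```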

Results

Metric	YAML Baseline	Tool-Call SFT
N1 avg reward	0.391	0.865 (+121%)
N1 max reward	0.987	1.024
P(reward > 0.3)	26%	90%
Simple	0.567	0.780
Moderate	0.441	0.949
Complex	0.148	0.837 (+466%)
SFT loss	0.92	0.225 (4.1x lower)
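The relative changes can be recomputed from the two columns as (new - baseline) / baseline. The short sketch below does this for the reward rows and prints the loss ratio; `relative_change` is a helper name introduced here for illustration.

```python
def relative_change(baseline: float, new: float) -> float:
    """Percentage change going from baseline to new."""
    return (new - baseline) / baseline * 100.0


# (YAML baseline, tool-call SFT) pairs copied from the table.
rows = {
    "N1 avg reward": (0.391, 0.865),
    "Simple": (0.567, 0.780),
    "Moderate": (0.441, 0.949),
    "Complex": (0.148, 0.837),
}

for name, (base, sft) in rows.items():
    print(f"{name}: {relative_change(base, sft):+.0f}%")

# Loss is reported as a ratio rather than a percentage change.
print(f"SFT loss ratio: {0.92 / 0.225:.1f}x")
```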

Training Details

Base model: Qwen/Qwen3-4B (4-bit NF4 quantization)
Method: LoRA (rank 32, alpha 64) via TRL SFTTrainer
Data: 1880 tool-call topologies distilled from GPT-5.4
Format: <tool_call> JSON with 2 SAGE tool definitions in system prompt
Hardware: RTX 3500 Ada 12GB, Windows, ~65 min training
Epochs: 2, lr=2e-5, cosine schedule
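To make two of these hyperparameters concrete, the sketch below computes the usual LoRA scaling factor (alpha / rank) and a cosine learning-rate curve starting at 2e-5. It assumes plain LoRA scaling (not the rank-stabilized variant), no warmup, and decay to zero; the step count is a made-up value, since batch size and warmup are not stated here.

```python
import math


def lora_scale(rank: int, alpha: int) -> float:
    """Conventional LoRA scaling applied to the low-rank update: alpha / rank."""
    return alpha / rank


def cosine_lr(step: int, total_steps: int, peak_lr: float = 2e-5) -> float:
    """Cosine decay from peak_lr to 0, assuming no warmup phase."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total = 100  # hypothetical step count; the real value depends on batch size
for step in (0, 50, 100):
    print(step, cosine_lr(step, total))

print("LoRA scale:", lora_scale(32, 64))
```

With rank 32 and alpha 64 the adapter update is scaled by 2, and under these assumptions the learning rate is half its peak at the midpoint of training.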

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit NF4, matching the quantization used for training.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B",
    quantization_config=bnb, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B")
# Attach the LoRA adapter from this repo on top of the base model.
model = PeftModel.from_pretrained(model, "yannabadie/sage-topology-policy-local")

# The system prompt must carry the two SAGE tool definitions (see sage_tool_schemas.py).
messages = [
    {"role": "system", "content": "<system prompt with tool definitions>"},
    {"role": "user", "content": "Write a function that sorts a list using merge sort."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=1024, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:]))
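The decoded text still has to be turned into a Python object before it can be used. A minimal sketch using only the standard library is shown below; `extract_tool_calls` is a helper introduced here, and it assumes each call is wrapped in literal `<tool_call>` and `</tool_call>` tags as in the Format section.

```python
import json
import re

# Non-greedy match so that several tool calls in one response stay separate.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed tool call found in generated text."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            obj = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip malformed JSON rather than failing the whole response
        if isinstance(obj, dict) and "name" in obj:
            calls.append(obj)
    return calls
```

Calling `extract_tool_calls` on the decoded string yields a list of dicts with `name` and `arguments` keys; an empty list means the model produced no parseable call and the request may need a retry.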

Training Data

Included in this repo:

topology_sft_v2_toolcall.jsonl — 1880 SFT examples
adapt_decisions_toolcall.jsonl — 5139 adaptation decisions for Phase C
expert_topologies.jsonl — 8 Claude Opus 4.6 distilled examples
sage_tool_schemas.py — Tool definitions + system prompt
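The `.jsonl` files can be inspected with the standard library alone. The record schema is not documented in this card, so the sketch below only iterates over records; `read_jsonl` is a helper name introduced here, and it assumes one JSON object per line.

```python
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc
```

For example, `sum(1 for _ in read_jsonl("topology_sft_v2_toolcall.jsonl"))` counts the records in a local copy of the SFT file.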

Part of YGN-SAGE

GitHub | PyPI | MIT License
