---
license: apache-2.0
language:
- en
tags:
- tool-calling
- function-calling
- agent
- edge-ai
- mobile
- GRPO
- agentic
- json-generation
- sub-1B
- qwen
- ondemand
base_model: Qwen/Qwen3.5-0.8B
pipeline_tag: text-generation
datasets:
- airev-ae/qwen-agentic-json-cream
---

# AIREV Qwen-0.8B-AgentJSON

**World's first sub-1B parameter model with functional tool calling capability.**

Built by [AIREV](https://airev.ae) for the [OnDemand](https://on-demand.io) Agentic AI Platform.

## Key Results

| Metric | Score |
|--------|:-----:|
| JSON Validity (Easy queries) | **92%** |
| Correct Plugin Selection | **94%** |
| Exact Plugin ID Match (Easy) | **81%** |
| Production Composite Score | **75.6%** |
| Parameters | **752M** |
| Quantized Size | **~400MB** |
| Edge Inference Speed | **~30 tok/s** |

## What This Model Does

Generates structured JSON execution plans for tool/plugin orchestration. Given a user request and available tools, it produces a valid JSON object specifying which tools to call, with what parameters, and in what order.

## Evaluation Results

### Production Plugin Eval (Real OnDemand Plugins, 50 samples)

| Metric | This Model | Base Qwen3.5-0.8B | Improvement (percentage points) |
|--------|:----------:|:-----------------:|:-------------------------------:|
| Valid JSON | **94.0%** | 18.0% | +76.0 |
| Correct Plugin IDs | **44.0%** | 0.0% | +44.0 |
| Params Correct Type | **94.0%** | 0.0% | +94.0 |
| Param Keys Match | **66.0%** | 0.0% | +66.0 |
| Real Production IDs | **94.0%** | 0.0% | +94.0 |
| Dependencies Present | **88.0%** | 0.0% | +88.0 |
| **Composite** | **75.6%** | **4.8%** | **+70.8** |
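
The improvement figures are absolute differences between the two score columns, in percentage points. The short check below recomputes them from the table values. It also shows that the composite row is not the unweighted mean of the six metrics, so it presumably uses a weighting that this card does not state; that reading is an assumption, not something documented here.

```python
# Scores copied from the table above: (fine-tuned model, base model), in percent.
scores = {
    "Valid JSON": (94.0, 18.0),
    "Correct Plugin IDs": (44.0, 0.0),
    "Params Correct Type": (94.0, 0.0),
    "Param Keys Match": (66.0, 0.0),
    "Real Production IDs": (94.0, 0.0),
    "Dependencies Present": (88.0, 0.0),
}
composite = (75.6, 4.8)

# Improvement = model score minus base score, in percentage points.
deltas = {name: round(model - base, 1) for name, (model, base) in scores.items()}
composite_delta = round(composite[0] - composite[1], 1)

# Unweighted means of the six metrics, for comparison with the reported composite.
unweighted_model = sum(model for model, _ in scores.values()) / len(scores)
unweighted_base = sum(base for _, base in scores.values()) / len(scores)

for name, delta in deltas.items():
    print(f"{name}: +{delta} points")
print(f"Composite: +{composite_delta} points")
print(f"Unweighted means: model={unweighted_model}, base={unweighted_base}")
```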

### By Query Complexity

| Difficulty | JSON Valid | Real Plugin IDs | Exact Match |
|-----------|:---------:|:---------------:|:-----------:|
| Easy (1 tool) | **92%** | **92%** | **81%** |
| Medium (2-3 tools) | **96%** | **96%** | 4% |

## Training Pipeline

This model was trained using a novel multi-stage approach developed by AIREV:

### Stage 1: Supervised Fine-Tuning (SFT)
- Base model: Qwen3.5-0.8B
- 47K+ curated samples with reasoning traces and structured JSON outputs
- Real production plugin schemas from the OnDemand platform (2,176 plugins)
- LLM-evaluated data quality filtering (score >= 8/10)
- Full fine-tuning rather than LoRA, on the basis that full fine-tuning is reported to outperform LoRA at sub-1B scale
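
The quality-filtering step can be pictured as a simple threshold on a judge score. The sketch below is illustrative only: the `Sample` class, the `quality_score` field, and the `filter_by_quality` helper are hypothetical names, and how the LLM judge produces the score is not described in this card.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    prompt: str
    completion: str
    quality_score: int  # hypothetical 1-10 score assigned by an LLM judge


def filter_by_quality(samples: list[Sample], min_score: int = 8) -> list[Sample]:
    """Keep only samples whose judge score meets the threshold (>= 8/10 here)."""
    return [s for s in samples if s.quality_score >= min_score]


candidates = [
    Sample("Search for AI news", '{"steps": []}', 9),
    Sample("Book a flight", "not json", 3),
    Sample("Summarize this page", '{"steps": []}', 8),
]
kept = filter_by_quality(candidates)
print(f"kept {len(kept)} of {len(candidates)} samples")
```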

### Stage 2: Progressive Curriculum GRPO
- Group Relative Policy Optimization with a novel 4-phase progressive reward curriculum
- The reward criteria become progressively stricter across training:
  - Phase 1: JSON structural validity
  - Phase 2: Required field presence
  - Phase 3: Correct tool selection (prompt-grounded)
  - Phase 4: Parameter quality and completeness
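- Innovation: noise injection for zero-variance groups. When every completion in a group receives the same reward, the group-relative advantages are all zero and the group contributes no learning signal, a common GRPO failure mode that the injected noise avoids
- The reward function is prompt-aware: it checks each selected tool against the tools listed as available in the input prompt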

### Why This Matters

1. **Sub-1B tool calling is novel**: to the authors' knowledge, no previously published sub-1B model demonstrates functional tool calling
2. **Progressive Curriculum GRPO**: a new approach enabling skill stacking in small models
3. **Edge deployment viable**: ~400MB quantized, ~30 tok/s on Snapdragon 8 Elite, <3s response time

## Model Specs

| Parameter | Value |
|-----------|-------|
| Base Model | Qwen3.5-0.8B (752M params) |
| Training Data | 47,400 samples (real production plugins) |
| SFT Epochs | 3 |
| GRPO Steps | 1,250 |
| Precision | bf16 |
| Hardware | NVIDIA H100 80GB |
| Attention | SDPA |

## Intended Use

- **Primary:** On-device agentic AI for tool orchestration on edge devices (AR glasses, mobile, IoT)
- **Platform:** Built for the [OnDemand](https://on-demand.io) Agentic AI Platform (3,000+ tools)
- **Best for:** 1-2 step tool calling queries
- **Not recommended for:** Complex multi-step workflows (5+ tools)

## Usage

```python
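from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the fine-tuned model in bf16 (the training precision) together with its tokenizer.
model = AutoModelForCausalLM.from_pretrained("airev-ae/Qwen-0.8B-AgentJSON", torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained("airev-ae/Qwen-0.8B-AgentJSON")

# The system prompt sets the orchestrator role; the user turn carries the request.
# The model selects among tools offered in the input, so a real prompt should also
# include the available tool/plugin definitions.
messages = [
    {"role": "system", "content": "You are an AI agent orchestrator. Generate a JSON execution plan."},
    {"role": "user", "content": "Search for the latest AI news"}
]

# Render the chat template as text, then tokenize and move the tensors to the model's device.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sample at temperature 0.8, within the recommended 0.7-0.8 range.
output = model.generate(**inputs, max_new_tokens=512, temperature=0.8, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))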
```

## Limitations

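- Best accuracy on 1-2 tool queries; degrades on 3+ tool orchestration (in the evaluation above, exact plugin ID match falls from 81% on easy single-tool queries to 4% on medium 2-3 tool queries, while JSON validity stays at 96%)
- Trained only on the OnDemand plugin schema format; results for other tool schema formats are not reported here
- Recommended: validate the generated JSON at inference time before executing it in production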
- Temperature 0.7-0.8 produces best results
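
One way to apply the JSON validation recommended above is to extract the first JSON object from the generated text and check it before any tool is executed. The sketch below uses only the Python standard library. The `steps`, `tool`, and `params` field names and the `web_search` tool are hypothetical placeholders, since the OnDemand plan schema is not reproduced in this card; substitute the real schema in practice.

```python
import json


def extract_plan(raw: str):
    """Return the first parseable JSON object in the model output, or None."""
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(raw, start)
            return obj
        except json.JSONDecodeError:
            start = raw.find("{", start + 1)
    return None


def validate_plan(raw: str, available_tools: set[str]):
    """Return (plan, errors); an empty error list means the plan passed all checks."""
    plan = extract_plan(raw)
    if plan is None:
        return None, ["no JSON object found"]
    steps = plan.get("steps")  # hypothetical field name
    if not isinstance(steps, list) or not steps:
        return plan, ["missing or empty 'steps' list"]
    errors = []
    for i, step in enumerate(steps):
        if not isinstance(step, dict):
            errors.append(f"step {i}: not an object")
            continue
        tool = step.get("tool")
        if not isinstance(tool, str) or tool not in available_tools:
            errors.append(f"step {i}: unknown tool {tool!r}")
        if not isinstance(step.get("params"), dict):
            errors.append(f"step {i}: 'params' must be an object")
    return plan, errors


raw_output = 'Here is the plan: {"steps": [{"tool": "web_search", "params": {"query": "latest AI news"}}]}'
plan, errors = validate_plan(raw_output, {"web_search"})
print(plan, errors)
```

When `errors` is non-empty, a caller could resample at the recommended temperature or fall back to a larger model; which fallback suits a given deployment is a design choice left open here.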

## Citation

```bibtex
@misc{airev2026agentjson,
  title={AIREV Qwen-0.8B-AgentJSON: Sub-1B Tool Calling via Progressive Curriculum GRPO},
  author={AIREV FZ-LLC},
  year={2026},
  url={https://huggingface.co/airev-ae/Qwen-0.8B-AgentJSON}
}
```

---

**Built by [AIREV](https://airev.ae) | [OnDemand Platform](https://on-demand.io) | Abu Dhabi, UAE**

*Trained with Progressive Curriculum GRPO — a novel approach for sub-1B structured output generation.*