--- license: apache-2.0 language: - en tags: - tool-calling - function-calling - agent - edge-ai - mobile - GRPO - agentic - json-generation - sub-1B - qwen - ondemand base_model: Qwen/Qwen3.5-0.8B pipeline_tag: text-generation datasets: - airev-ae/qwen-agentic-json-cream --- # AIREV Qwen-0.8B-AgentJSON **World's first sub-1B parameter model with functional tool calling capability.** Built by [AIREV](https://on-demand.io) for the [OnDemand](https://on-demand.io) Agentic AI Platform. ## Key Results | Metric | Score | |--------|:-----:| | JSON Validity (Easy queries) | **92%** | | Correct Plugin Selection | **94%** | | Exact Plugin ID Match (Easy) | **81%** | | Production Composite Score | **75.6%** | | Parameters | **752M** | | Quantized Size | **~400MB** | | Edge Inference Speed | **~30 tok/s** | ## What This Model Does Generates structured JSON execution plans for tool/plugin orchestration. Given a user request and available tools, it produces a valid JSON object specifying which tools to call, with what parameters, and in what order. ## Evaluation Results ### Production Plugin Eval (Real OnDemand Plugins, 50 samples) | Metric | Model | Base Qwen 0.8B | Improvement | |--------|:-----:|:--------------:|:-----------:| | Valid JSON | **94.0%** | 18.0% | +76% | | Correct Plugin IDs | **44.0%** | 0.0% | +44% | | Params Correct Type | **94.0%** | 0.0% | +94% | | Param Keys Match | **66.0%** | 0.0% | +66% | | Real Production IDs | **94.0%** | 0.0% | +94% | | Dependencies Present | **88.0%** | 0.0% | +88% | | **Composite** | **75.6%** | **4.8%** | **+70.8%** | ### By Query Complexity | Difficulty | JSON Valid | Real Plugin IDs | Exact Match | |-----------|:---------:|:---------------:|:-----------:| | Easy (1 tool) | **92%** | **92%** | **81%** | | Medium (2-3 tools) | **96%** | **96%** | 4% | ## Training Pipeline This model was trained using a novel multi-stage approach developed by AIREV: ### Stage 1: Supervised Fine-Tuning (SFT) - Base model: Qwen 3.5-0.8B - 47K+ curated samples with reasoning traces and structured JSON outputs - Real production plugin schemas from the OnDemand platform (2,176 plugins) - LLM-evaluated data quality filtering (score >= 8/10) - Full fine-tune (not LoRA) — research shows full FT outperforms at sub-1B scale ### Stage 2: Progressive Curriculum GRPO - Group Relative Policy Optimization with a novel 4-phase progressive reward curriculum - The reward progressively increases in difficulty across training: - Phase 1: JSON structural validity - Phase 2: Required field presence - Phase 3: Correct tool selection (prompt-grounded) - Phase 4: Parameter quality and completeness - Innovation: Noise injection for zero-variance groups prevents common GRPO failure modes - Prompt-aware reward function verifies selected tools against available options in the input ### Why This Matters 1. **Sub-1B tool calling is novel** — no published model under 3B demonstrates functional tool calling 2. **Progressive Curriculum GRPO** — a new approach enabling skill stacking in small models 3. **Edge deployment viable** — 400MB quantized, 30 tok/s on Snapdragon 8 Elite, <3s response time ## Model Specs | Parameter | Value | |-----------|-------| | Base Model | Qwen 3.5-0.8B (752M params) | | Training Data | 47,400 samples (real production plugins) | | SFT Epochs | 3 | | GRPO Steps | 1,250 | | Precision | bf16 | | Hardware | NVIDIA H100 80GB | | Attention | SDPA | ## Intended Use - **Primary:** On-device agentic AI for tool orchestration on edge devices (AR glasses, mobile, IoT) - **Platform:** Built for the [OnDemand](https://on-demand.io) Agentic AI Platform (3,000+ tools) - **Best for:** 1-2 step tool calling queries - **Not recommended for:** Complex multi-step workflows (5+ tools) ## Usage ```python from transformers import AutoModelForCausalLM, AutoTokenizer import torch model = AutoModelForCausalLM.from_pretrained("airev-ae/Qwen-0.8B-AgentJSON", torch_dtype=torch.bfloat16) tokenizer = AutoTokenizer.from_pretrained("airev-ae/Qwen-0.8B-AgentJSON") messages = [ {"role": "system", "content": "You are an AI agent orchestrator. Generate a JSON execution plan."}, {"role": "user", "content": "Search for the latest AI news"} ] text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) inputs = tokenizer(text, return_tensors="pt").to(model.device) output = model.generate(**inputs, max_new_tokens=512, temperature=0.8, do_sample=True) print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)) ``` ## Limitations - Best accuracy on 1-2 tool queries; degrades on 3+ tool orchestration - Trained on OnDemand plugin schema format - Recommended: use with JSON validation at inference for production - Temperature 0.7-0.8 produces best results ## Citation ```bibtex @misc{airev2026agentjson, title={AIREV Qwen-0.8B-AgentJSON: Sub-1B Tool Calling via Progressive Curriculum GRPO}, author={AIREV FZ-LLC}, year={2026}, url={https://huggingface.co/airev-ae/Qwen-0.8B-AgentJSON} } ``` --- **Built by [AIREV](https://airev.ae) | [OnDemand Platform](https://on-demand.io) | Abu Dhabi, UAE** *Trained with Progressive Curriculum GRPO — a novel approach for sub-1B structured output generation.*