Text Generation
Transformers
GGUF
qwen3_5_moe
qwen3_5
reasoning
agentic-coding
mtp
apex
quantization
multimodal
Instructions to use SC117/Ornith-1.0-35B-MTP-APEX-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use SC117/Ornith-1.0-35B-MTP-APEX-GGUF with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="SC117/Ornith-1.0-35B-MTP-APEX-GGUF")# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("SC117/Ornith-1.0-35B-MTP-APEX-GGUF", dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use SC117/Ornith-1.0-35B-MTP-APEX-GGUF with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "SC117/Ornith-1.0-35B-MTP-APEX-GGUF" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "SC117/Ornith-1.0-35B-MTP-APEX-GGUF", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/SC117/Ornith-1.0-35B-MTP-APEX-GGUF
- SGLang
How to use SC117/Ornith-1.0-35B-MTP-APEX-GGUF with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "SC117/Ornith-1.0-35B-MTP-APEX-GGUF" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "SC117/Ornith-1.0-35B-MTP-APEX-GGUF", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "SC117/Ornith-1.0-35B-MTP-APEX-GGUF" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "SC117/Ornith-1.0-35B-MTP-APEX-GGUF", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use SC117/Ornith-1.0-35B-MTP-APEX-GGUF with Docker Model Runner:
docker model run hf.co/SC117/Ornith-1.0-35B-MTP-APEX-GGUF
File size: 16,276 Bytes
a5acd7e 67078d8 a5acd7e 228ce5f aa79a3f 67078d8 a5acd7e c05822e 67078d8 c05822e 67078d8 a5acd7e c05822e a5acd7e 67078d8 a5acd7e c05822e a5acd7e c05822e a5acd7e 67078d8 a5acd7e c05822e a5acd7e 67078d8 aa79a3f 67078d8 a5acd7e c05822e a5acd7e c05822e a5acd7e 67078d8 a5acd7e 67078d8 a5acd7e 67078d8 a5acd7e | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 | ---
library_name: transformers
license: mit
license_link: https://huggingface.co/deepreinforce-ai/Ornith-1.0-35B/blob/main/LICENSE
pipeline_tag: text-generation
tags:
- qwen3_5_moe
- qwen3_5
- reasoning
- agentic-coding
- mtp
- apex
- quantization
- gguf
- multimodal
base_model:
- deepreinforce-ai/Ornith-1.0-35B
---
<div style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; margin-bottom: 24px;">
<div style="background: #f5f5f7; border-radius: 20px; padding: 36px 32px; margin-bottom: 20px; text-align: center; position: relative; overflow: hidden;">
<div style="position: absolute; top: -30px; right: -30px; width: 120px; height: 120px; background: #dbeafe; border-radius: 50%;"></div>
<div style="position: absolute; bottom: -20px; left: 40px; width: 80px; height: 80px; background: #bfdbfe; border-radius: 50%;"></div>
<div style="position: absolute; top: 50%; left: -15px; width: 60px; height: 60px; background: #dbeafe; border-radius: 50%;"></div>
<div style="display: inline-flex; gap: 8px; margin-bottom: 16px; position: relative; z-index: 1;">
<span style="background: #007aff; color: white; font-size: 11px; font-weight: 600; padding: 5px 14px; border-radius: 20px;">APEX</span>
<span style="background: #af52de; color: white; font-size: 11px; font-weight: 600; padding: 5px 14px; border-radius: 20px;">MTP</span>
<span style="background: #30d158; color: white; font-size: 11px; font-weight: 600; padding: 5px 14px; border-radius: 20px;">多模态</span>
<span style="background: #ff9500; color: white; font-size: 11px; font-weight: 600; padding: 5px 14px; border-radius: 20px;">MIT</span>
</div>
<h1 style="margin: 0 0 8px 0; font-size: 32px; font-weight: 700; color: #1d1d1f; letter-spacing: -0.5px; border: none; position: relative; z-index: 1;">Ornith-1.0-35B-MTP-APEX</h1>
<p style="margin: 8px 0 0 0; font-size: 14px; position: relative; z-index: 1;"><a href="https://huggingface.co/SC117/Ornith-1.0-35B-MTP-APEX-GGUF/blob/main/README.md" style="color: #007aff; text-decoration: none;">📖 English</a> | <span style="color: #86868b;">中文文档</span></p>
<p style="margin: 0; font-size: 15px; color: #86868b; position: relative; z-index: 1;">自改进 agentic coding 推理模型 · APEX 量化 GGUF + BF16 + mmproj</p>
</div>
</div>
<div style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif; display: flex; flex-direction: column; gap: 20px; margin-bottom: 30px;">
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>🐦</span> 关于 Ornith</div>
<div style="padding: 16px; font-size: 13px; color: #334155; line-height: 1.7;">
<p style="margin: 0 0 12px 0;"><a href="https://huggingface.co/deepreinforce-ai/Ornith-1.0-35B" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">Ornith-1.0-35B</a> 是 <a href="https://deep-reinforce.com/ornith.html" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">DeepReinforce AI</a> 推出的自改进 agentic coding 推理模型,基于 Qwen3.5 后训练,采用 RL 联合优化 scaffold 生成和解决方案 rollout。</p>
<p style="margin: 0 0 12px 0;">在 Terminal-Bench 2.1、SWE-Bench Verified/Pro/Multilingual、NL2Repo、OpenClaw 等编码基准上达到同参数量开源模型 SOTA。</p>
<p style="margin: 0;">本 GGUF 包含 <b>mmproj-F16.gguf</b> 视觉投影器,可配合 llama.cpp 实现多模态(图像+文本)功能。MTP 层来源于 Qwen3.5-35B-A3B(同架构,权重兼容)。<b>许可证:MIT。</b></p>
</div>
</div>
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>🧠</span> 模型详情</div>
<div style="padding: 16px;">
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><tbody>
<tr><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); font-weight: bold; color: #334155; background: white;">架构</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #4b5563; background: white;">Qwen3.5 MoE(混合专家)</td></tr>
<tr><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); font-weight: bold; color: #334155; background: white;">参数量</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #4b5563; background: white;">总计 35B,每 token 激活 3B</td></tr>
<tr><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); font-weight: bold; color: #334155; background: white;">专家</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #4b5563; background: white;">256 路由专家,每 token 激活 8 个</td></tr>
<tr><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); font-weight: bold; color: #334155; background: white;">层数</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #4b5563; background: white;">40 transformer 层 + 1 MTP 层</td></tr>
<tr><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); font-weight: bold; color: #334155; background: white;">上下文</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #4b5563; background: white;">262,144 tokens</td></tr>
<tr><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); font-weight: bold; color: #334155; background: white;">MTP</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #4b5563; background: white;">1 个 MTP 层(785 tensors),来自 Qwen3.5-35B-A3B</td></tr>
<tr><td style="padding: 6px 10px; font-weight: bold; color: #334155; background: white;">许可证</td><td style="padding: 6px 10px; color: #4b5563; background: white;">MIT</td></tr>
</tbody></table>
</div>
</div>
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>📊</span> BenchLocal 测试成绩(APEX-I-Compact, 15.85 GB)</div>
<div style="padding: 16px;">
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);"><th style="padding: 7px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">模式</th><th style="padding: 7px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">ToolCall-15</th><th style="padding: 7px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">BugFind-15</th><th style="padding: 7px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">HermesAgent-20</th><th style="padding: 7px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">能力上限</th><th style="padding: 7px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold;">实用得分</th></tr></thead><tbody>
<tr><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); font-weight: bold; color: #334155; background: white;">思考</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;">100</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;">93</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;">89</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); font-weight: bold; color: #c2410c; background: white;">93.5</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;">75.5</td></tr>
<tr><td style="padding: 6px 10px; font-weight: bold; color: #334155; background: white;">无思考</td><td style="padding: 6px 10px; color: #334155; background: white;">100</td><td style="padding: 6px 10px; color: #334155; background: white;">92</td><td style="padding: 6px 10px; color: #334155; background: white;">89</td><td style="padding: 6px 10px; font-weight: bold; color: #c2410c; background: white;">93.2</td><td style="padding: 6px 10px; font-weight: bold; color: #30d158; background: white;">85.2</td></tr>
</tbody></table>
<p style="margin: 12px 0 0 0; font-size: 12px; color: #64748b; font-style: italic;">RTX 5070 Ti · 无思考模式实际可靠性更优(重试更少)。</p>
</div>
</div>
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>🚀</span> 使用方法</div>
<div style="padding: 16px; font-size: 13px; color: #334155; line-height: 1.7;">
<p style="margin: 0 0 8px 0; font-weight: bold; color: #1e293b;">llama.cpp(纯文本)</p>
<p style="margin: 0; font-family: monospace; background: #f8fafc; padding: 10px 14px; border-radius: 6px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b; white-space: pre-wrap;">hf download SC117/Ornith-1.0-35B-MTP-APEX-GGUF --include "*.gguf" --local-dir ./models
./llama-server -m ./models/Ornith-1.0-35B-MTP-APEX-I-Compact.gguf -ngl 99 -c 131072</p>
<p style="margin: 12px 0 8px 0; font-weight: bold; color: #1e293b;">llama.cpp(视觉 + 文本)</p>
<p style="margin: 0; font-family: monospace; background: #f8fafc; padding: 10px 14px; border-radius: 6px; border: 1px solid #e2e8f0; font-size: 12px; color: #1e293b; white-space: pre-wrap;">./llama-server -m ./models/Ornith-1.0-35B-MTP-APEX-I-Compact.gguf --mmproj ./models/mmproj-F16.gguf -ngl 99 -c 131072</p>
</div>
</div>
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>🎛️</span> 推荐参数</div>
<div style="padding: 16px;">
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);"><th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold; width: 30%;">模式</th><th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold; width: 70%;">参数</th></tr></thead><tbody>
<tr><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); font-weight: bold; color: #334155; background: white;">通用</td><td style="padding: 6px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #4b5563; background: white; font-family: monospace;">temperature=0.6, top_p=0.95, top_k=20</td></tr>
<tr><td style="padding: 6px 10px; font-weight: bold; color: #334155; background: white;">编程</td><td style="padding: 6px 10px; color: #4b5563; font-family: monospace; background: white;">temperature=0.6, top_p=0.95, top_k=20</td></tr>
</tbody></table>
</div>
</div>
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>💡</span> 什么是 APEX?</div>
<div style="padding: 16px;">
<p style="margin: 0 0 12px 0; font-size: 13px; color: #334155; line-height: 1.7;">这些 GGUF 文件使用 <a href="https://github.com/mudler/apex-quant" target="_blank" style="color: #c2410c; text-decoration: none; font-weight: 700;">APEX</a> 量化,这是一种 MoE 感知的混合精度量化技术。APEX 按 tensor 角色分类——路由专家、共享专家、注意力层——并应用逐层精度梯度,对最敏感的边缘层赋予更高精度,对冗余的中间层进行更激进的压缩。</p>
<p style="margin: 0; font-size: 13px; color: #334155; line-height: 1.7; font-weight: 700;">APEX 以一半的体积达到 Q8_0 的困惑度——甚至超越 F16。</p>
</div>
</div>
<div style="border: 1px solid #cbd5e1; border-radius: 12px; overflow: hidden; background: #ffffff; box-shadow: 0 2px 4px rgba(0,0,0,0.02);">
<div style="background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); padding: 12px 16px; color: white; font-weight: 700; font-size: 14px; display: flex; align-items: center; gap: 8px;"><span>📦</span> APEX 量化档位</div>
<div style="padding: 16px;">
<table style="width: 100%; border-collapse: collapse; font-size: 13px;"><thead><tr style="background: rgba(255,107,53,0.05);"><th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold; width: 35%;">文件</th><th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold; width: 15%;">大小</th><th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold; width: 15%;">档位</th><th style="padding: 8px 10px; border-bottom: 2px solid #ff6b35; text-align: left; color: #c2410c; font-weight: bold; width: 35%;">适用场景</th></tr></thead><tbody>
<tr><td style="padding: 8px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;"><code>*-APEX-I-Quality.gguf</code></td><td style="padding: 8px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;">21.90 GB</td><td style="padding: 8px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;">I-Quality</td><td style="padding: 8px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;">最高质量,最佳精度</td></tr>
<tr><td style="padding: 8px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;"><code>*-APEX-I-Balanced.gguf</code></td><td style="padding: 8px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;">24.18 GB</td><td style="padding: 8px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;">I-Balanced</td><td style="padding: 8px 10px; box-shadow: 0 1px 0 0 rgba(128,128,128,0.15); color: #334155; background: white;">均衡之选,推荐使用</td></tr>
<tr><td style="padding: 8px 10px; color: #334155;"><code>*-APEX-I-Compact.gguf</code></td><td style="padding: 8px 10px; color: #334155;">15.85 GB</td><td style="padding: 8px 10px; color: #334155;">I-Compact</td><td style="padding: 8px 10px; color: #334155;">最佳质量/体积比</td></tr>
</tbody></table>
</div>
</div>
</div>
## 链接
- **原始模型**: https://huggingface.co/deepreinforce-ai/Ornith-1.0-35B
- **Ornith 博客**: https://deep-reinforce.com/ornith.html
- **APEX 量化**: https://github.com/mudler/apex-quant
- **BenchLocal 测试结果**: https://scorp1o117.github.io/benchlocal-results/
## 引用
```bibtex
@misc{ornith-35b,
title = {{Ornith-1.0-35B}: Agentic Coding, Open to All},
url = {https://deep-reinforce.com/ornith_1_0.html},
author = {{DeepReinforce Team}},
year = {2026}
}
```
|