Update README.md
README.md CHANGED
@@ -311,47 +311,122 @@ Ornith-1.0-35B excels in tool-calling and agentic coding capabilities.
Because Ornith-1.0-35B exposes an OpenAI-compatible endpoint with tool calling, it works out of the box with standard agent frameworks. Below is a minimal example that offers Ornith-1.0-35B a single function tool (`run_shell`) through the OpenAI Python client.

```python
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.getenv("OPENAI_BASE_URL", "http://localhost:8000/v1"),
    api_key=os.getenv("OPENAI_API_KEY", "EMPTY"),
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "run_shell",
            "description": "Run a shell command and return its output.",
            "parameters": {
                "type": "object",
                "properties": {
                    "command": {"type": "string", "description": "The command to run"}
                },
                "required": ["command"],
            },
        },
    }
]

messages = [{"role": "user", "content": "List the Python files in the current directory."}]

response = client.chat.completions.create(
    model="deepreinforce-ai/Ornith-1.0-35B",
    messages=messages,
    tools=tools,
    temperature=0.6,
    top_p=0.95,
)
print(response.choices[0].message)
```
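
The example above stops once the model has answered, so a requested tool call is printed but never executed. The following is a minimal sketch of the follow-up step, assuming the response follows the usual OpenAI chat-completions shape (`message.tool_calls`, each with `id`, `function.name`, and `function.arguments` as a JSON string). The names `TOOL_REGISTRY`, `tool_result_message`, and `answer_tool_calls` are hypothetical helpers introduced here for illustration; they are not part of Ornith or the OpenAI SDK.

```python
import json
import shlex
import subprocess


def run_shell(command: str, timeout: float = 10.0) -> str:
    """Run a command without a shell and return stdout followed by stderr."""
    completed = subprocess.run(
        shlex.split(command),  # no shell, so globs and pipes are not expanded
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return completed.stdout + completed.stderr


# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY = {"run_shell": run_shell}


def tool_result_message(tool_call_id: str, name: str, arguments_json: str) -> dict:
    """Execute one requested tool call and build the matching "tool" message."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        content = f"Unknown tool: {name}"
    else:
        try:
            content = handler(**json.loads(arguments_json))
        except Exception as exc:  # report the failure back to the model
            content = f"Tool error: {exc}"
    return {"role": "tool", "tool_call_id": tool_call_id, "content": content}


def answer_tool_calls(messages: list, assistant_message) -> list:
    """Append the assistant turn plus one tool message per requested call."""
    calls = assistant_message.tool_calls or []
    if calls:
        messages.append(assistant_message)
        for call in calls:
            messages.append(
                tool_result_message(call.id, call.function.name, call.function.arguments)
            )
    return messages
```

With the variables from the example above, `answer_tool_calls(messages, response.choices[0].message)` extends the conversation, and a second `client.chat.completions.create(...)` call with the same `model` and `tools` lets the model turn the tool output into a final answer. Executing model-chosen commands is risky, so in practice the handler would likely be restricted to an allowlist or a sandbox.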

**Examples of using Ornith with agent harnesses:**

#### Hermes Agent
```bash
# Hermes talks to any OpenAI-compatible endpoint; point it at your Ornith server.
export OPENAI_BASE_URL="http://localhost:8000/v1"
export OPENAI_API_KEY="EMPTY"
export MODEL="deepreinforce-ai/Ornith-1.0-35B"
```

#### Atomic Chat / Ollama / llama.cpp
```bash
# These runtimes load a GGUF build of Ornith (deepreinforce-ai/Ornith-1.0-35B-GGUF).

# llama.cpp: serve an OpenAI-compatible API on port 8000.
llama-server -hf deepreinforce-ai/Ornith-1.0-35B-GGUF --port 8000 -c 262144

# Ollama: pull and chat with the same GGUF straight from Hugging Face.
ollama run hf.co/deepreinforce-ai/Ornith-1.0-35B-GGUF
```
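
Once `llama-server` is listening on port 8000 as in the command above, any HTTP client can talk to it. The sketch below uses only the Python standard library and assumes the server exposes the usual OpenAI-style `/v1/chat/completions` route; the helper names `build_chat_request` and `chat`, the default model string, and the 120-second timeout are illustrative choices, not values taken from the README.

```python
import json
import urllib.request


def build_chat_request(
    prompt: str, base_url: str, model: str, api_key: str = "EMPTY"
) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible chat completions route."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def chat(
    prompt: str,
    base_url: str = "http://localhost:8000/v1",  # port from the llama-server command
    model: str = "deepreinforce-ai/Ornith-1.0-35B-GGUF",  # assumed model label
) -> str:
    """Send one user message and return the assistant's reply text."""
    request = build_chat_request(prompt, base_url, model)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `chat("What is the capital of France?")` while the server is running should return the model's reply as a string. Keeping request construction separate from the network call makes the payload easy to inspect without a live server.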

#### OpenClaw

```bash
# OpenClaw talks to any OpenAI-compatible endpoint; point it at your Ornith server.
export OPENAI_BASE_URL="http://localhost:8000/v1"
export OPENAI_API_KEY="EMPTY"
export OPENAI_MODEL="deepreinforce-ai/Ornith-1.0-35B"
```

#### Unsloth Studio

```bash
pip install unsloth

# Load Ornith for fast local inference or fine-tuning (Python):
# from unsloth import FastLanguageModel
# model, tokenizer = FastLanguageModel.from_pretrained(
#     "deepreinforce-ai/Ornith-1.0-35B",
#     max_seq_length=262144,
#     load_in_4bit=True,
# )
```

#### OpenHands
```bash
pip install openhands-ai

# OpenHands routes through LiteLLM; the "openai/" prefix selects the OpenAI-compatible path.
export LLM_MODEL="openai/deepreinforce-ai/Ornith-1.0-35B"
export LLM_BASE_URL="http://localhost:8000/v1"
export LLM_API_KEY="EMPTY"

# Launch the CLI (or run the official OpenHands Docker image with the same env vars).
openhands
```

### Coding CLIs

Ornith-1.0-35B is optimized for terminal-based coding agents. Point any OpenAI-compatible coding CLI at your Ornith-1.0-35B endpoint (set `OPENAI_BASE_URL` and `OPENAI_API_KEY`) to understand large codebases, automate tedious work, and ship faster.

| 413 |
#### OpenCode
|
| 414 |
```bash
|
| 415 |
+
# Register your local Ornith endpoint as a provider in ~/.config/opencode/opencode.json:
|
| 416 |
+
#
|
| 417 |
+
# {
|
| 418 |
+
# "$schema": "https://opencode.ai/config.json",
|
| 419 |
+
# "provider": {
|
| 420 |
+
# "ornith": {
|
| 421 |
+
# "npm": "@ai-sdk/openai-compatible",
|
| 422 |
+
# "name": "Ornith (local)",
|
| 423 |
+
# "options": { "baseURL": "http://localhost:8000/v1", "apiKey": "EMPTY" },
|
| 424 |
+
# "models": { "deepreinforce-ai/Ornith-1.0-35B": { "name": "Ornith-1.0-35B" } }
|
| 425 |
+
# }
|
| 426 |
+
# }
|
| 427 |
+
# }
|
| 428 |
+
|
| 429 |
+
opencode
|
| 430 |
```
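
Editing the JSON by hand risks overwriting providers that are already configured. As one possible alternative, the sketch below merges the `ornith` provider entry shown in the comments above into an existing config file. It assumes the file is plain JSON (no comments); the helper names `ornith_provider`, `merge_provider`, and `write_config` are hypothetical and not part of OpenCode.

```python
import json
from pathlib import Path


def ornith_provider(base_url: str, model_id: str) -> dict:
    """Build the provider entry shown in the README comments."""
    return {
        "npm": "@ai-sdk/openai-compatible",
        "name": "Ornith (local)",
        "options": {"baseURL": base_url, "apiKey": "EMPTY"},
        "models": {model_id: {"name": model_id.split("/")[-1]}},
    }


def merge_provider(config: dict, provider: dict, key: str = "ornith") -> dict:
    """Return a copy of config with the provider added under the given key."""
    merged = dict(config)
    merged.setdefault("$schema", "https://opencode.ai/config.json")
    merged["provider"] = {**config.get("provider", {}), key: provider}
    return merged


def write_config(
    path: Path,
    base_url: str = "http://localhost:8000/v1",
    model_id: str = "deepreinforce-ai/Ornith-1.0-35B",
) -> None:
    """Create or update the config file at path, keeping unrelated settings."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    updated = merge_provider(existing, ornith_provider(base_url, model_id))
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

For the location named above, this would be called as `write_config(Path.home() / ".config" / "opencode" / "opencode.json")` before launching `opencode`.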