Image-Text-to-Text
Safetensors
English
Chinese
Korean
unsloth
qwen
qwen3.5
reasoning
chain-of-thought
lora
competitive-programming
Instructions to use Jackrong/Qwopus3.5-27B-v3 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Local Apps Settings
- Unsloth Studio
How to use Jackrong/Qwopus3.5-27B-v3 with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Jackrong/Qwopus3.5-27B-v3 to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Jackrong/Qwopus3.5-27B-v3 to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for Jackrong/Qwopus3.5-27B-v3 to start chatting
Load model with FastModel
pip install unsloth from unsloth import FastModel model, tokenizer = FastModel.from_pretrained( model_name="Jackrong/Qwopus3.5-27B-v3", max_seq_length=2048, )
Benchmarked on Apple Silicon (M3 Ultra) β 39 tok/s, 100% tool calling via Rapid-MLX
#19
by Raullen - opened
Benchmark Results
Ran Qwopus 3.5-27B-v3 (4bit MLX) on Mac Studio M3 Ultra (256GB) using Rapid-MLX, an OpenAI-compatible inference server optimized for Apple Silicon.
Speed
| Metric | Result |
|---|---|
| Decode speed | 39 tok/s |
| TTFT (cold) | 0.44s |
| TTFT (cached) | 0.28s |
| Peak RAM | 14.8 GB |
Capabilities
| Test | Result |
|---|---|
| Tool calling (10 scenarios) | 10/10 (100%) |
| Tool call recovery (multi-round) | 2/2 (100%) |
| Think-tag leak rate | 0/5 (0%) |
| Vision | Supported |
Agentic Framework Integration
Tested with real client libraries β all connecting to Rapid-MLX serving Qwopus:
| Framework | Score |
|---|---|
| PydanticAI | 6/6 (plain, stream, structured, tool, multi-turn, multi-tool) |
| LangChain | 6/6 (plain, system, stream, tool, multi-tool, structured) |
| Anthropic SDK | 4/5 |
| smolagents | 3/4 |
| Claw Code | 3/3 (prompt, code gen, tool calling) |
How to run
pip install rapid-mlx
rapid-mlx serve Jackrong/MLX-Qwopus3.5-27B-v3-4bit
Then point any OpenAI-compatible app at http://localhost:8000/v1.
Rapid-MLX auto-detects Qwopus and configures the correct parsers (hermes tool parser + qwen3 reasoning parser) β no manual flags needed.
Great model! The Claude Opus reasoning distillation really shows in structured output and tool calling quality.