Instructions to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT", filename="mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00001-of-00018.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- llama.cpp
How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R # Run inference directly in the terminal: llama-cli -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R # Run inference directly in the terminal: llama-cli -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R # Run inference directly in the terminal: ./llama-cli -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R # Run inference directly in the terminal: ./build/bin/llama-cli -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
Use Docker
docker model run hf.co/Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
- LM Studio
- Jan
- Ollama
How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Ollama:
ollama run hf.co/Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
- Unsloth Studio
How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT to start chatting
- Pi
How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Pi:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "llama-cpp": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
Run Hermes
hermes
- Atomic Chat new
- Docker Model Runner
How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Docker Model Runner:
docker model run hf.co/Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
- Lemonade
How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
Run and chat with the model
lemonade run user.mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT-Q6_K_R
List all available models
lemonade list
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00002-of-00018.gguf:b346f7f3781c70b8ecc826e03a938d80d9b2e49e509ff2c1efcd9cb2a2ab3d73:token_embd.weight:shape=(2048, 248320):dtype=q6_k:elements=508559360:bytes=417177600 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00003-of-00018.gguf:06549440da9fc9254f38548d9869fcbe964b9053dacc6224cb29c66905205da5:output_norm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00004-of-00018.gguf:c7707c6ec7b9cfdfb9d6b387967d3397b76a7e58570b1f020327344c36ad5c88:blk.24.nextn.eh_proj.weight:shape=(4096, 2048):dtype=q6_k_r4:elements=8388608:bytes=6881280 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00005-of-00018.gguf:f3fea2c96dda0d5cbf1b849591cd23022dc716d5a5783c30c2d587f62df37c54:blk.24.attn_norm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00006-of-00018.gguf:09228a093fa899ba27efe423fc1bf1e2e167212a5e55f6135d9ef5fadc891ce5:blk.24.ffn_down.weight:shape=(6144, 2048):dtype=q6_k_r4:elements=12582912:bytes=10321920 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00007-of-00018.gguf:fa63f72c56f6fc898fa6fdc943489e463460d2d56b9c9ac3341efc27490af674:blk.24.ffn_gate.weight:shape=(2048, 6144):dtype=q6_k_r4:elements=12582912:bytes=10321920 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00008-of-00018.gguf:d7d5bbda90e334de0336b6aed891ad16f9527523cc3b0df9c462ede543fd7123:blk.24.ffn_up.weight:shape=(2048, 6144):dtype=q6_k_r4:elements=12582912:bytes=10321920 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00009-of-00018.gguf:b54e6a9050294dcef4e9a8c0cc0d7b158c0506ef935729880138eeb9ab20f639:blk.24.post_attention_norm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00010-of-00018.gguf:ca0e000cdd9fbbc9b4d2ba5e1a3dafc0cdc3578f98c70e248e13ecc08e45f5ea:blk.24.attn_k_norm.weight:shape=(256,):dtype=f32:elements=256:bytes=1024 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00011-of-00018.gguf:6c81999d635df08e7b7c09e40a88a86553313f87a38f9bab4cfa863b02e33313:blk.24.attn_k.weight:shape=(2048, 512):dtype=q6_k_r4:elements=1048576:bytes=860160 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00012-of-00018.gguf:9cd5a41a7756ae4113c7367f575c83bfacf564da52a30c0e8d48e0a486187eb7:blk.24.attn_output.weight:shape=(2048, 2048):dtype=q6_k_r4:elements=4194304:bytes=3440640 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00013-of-00018.gguf:0aa49a1498dd5ea2f90b19dc98d967c5a678b8bd7d1b4b3002eb49b0c2fd930e:blk.24.attn_q_norm.weight:shape=(256,):dtype=f32:elements=256:bytes=1024 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00014-of-00018.gguf:f97669edac03a47c7c7c3fa0a08179bb2829c7dd86b4733e57baed65390b7273:blk.24.attn_q.weight:shape=(2048, 4096):dtype=q6_k_r4:elements=8388608:bytes=6881280 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00015-of-00018.gguf:e928e3d94196bd76472bbc00e96cae9afeba4defcfe95625ea1cd66d9aad9809:blk.24.attn_v.weight:shape=(2048, 512):dtype=q6_k_r4:elements=1048576:bytes=860160 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00016-of-00018.gguf:db9403bcfd37df35f3845bd5fbc7f15b259506768b2ec3a69ed0e12fa1ed4ac0:blk.24.nextn.shared_head_norm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00017-of-00018.gguf:1a97a5511789b080a55ada74138de8152103d76c5d0d85229852c276e1d83bd5:blk.24.nextn.enorm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192 | |
| mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00018-of-00018.gguf:f5b039629679fb9d7b9a4f018c48a3040a3293833da470a8baa0c3ffd33a5b3f:blk.24.nextn.hnorm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192 | |