Instructions to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT",
	filename="mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00001-of-00018.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
# Run inference directly in the terminal:
llama-cli -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
# Run inference directly in the terminal:
llama-cli -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
# Run inference directly in the terminal:
./llama-cli -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R

Use Docker

docker model run hf.co/Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R

LM Studio
Jan
Ollama
How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Ollama:
```
ollama run hf.co/Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
```

Unsloth Studio

How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT to start chatting

How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R

Run Hermes

hermes

Atomic Chat new
Docker Model Runner
How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Docker Model Runner:
```
docker model run hf.co/Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R
```

Lemonade

How to use Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Thireus/mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT:Q6_K_R

Run and chat with the model

lemonade run user.mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT-Q6_K_R

List all available models

lemonade list

mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT / tensors.map

Thireus

mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_SPLIT

48a9967 28 days ago

Raw

History Blame Contribute Delete

3.62 kB

	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00002-of-00018.gguf:b346f7f3781c70b8ecc826e03a938d80d9b2e49e509ff2c1efcd9cb2a2ab3d73:token_embd.weight:shape=(2048, 248320):dtype=q6_k:elements=508559360:bytes=417177600
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00003-of-00018.gguf:06549440da9fc9254f38548d9869fcbe964b9053dacc6224cb29c66905205da5:output_norm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00004-of-00018.gguf:c7707c6ec7b9cfdfb9d6b387967d3397b76a7e58570b1f020327344c36ad5c88:blk.24.nextn.eh_proj.weight:shape=(4096, 2048):dtype=q6_k_r4:elements=8388608:bytes=6881280
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00005-of-00018.gguf:f3fea2c96dda0d5cbf1b849591cd23022dc716d5a5783c30c2d587f62df37c54:blk.24.attn_norm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00006-of-00018.gguf:09228a093fa899ba27efe423fc1bf1e2e167212a5e55f6135d9ef5fadc891ce5:blk.24.ffn_down.weight:shape=(6144, 2048):dtype=q6_k_r4:elements=12582912:bytes=10321920
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00007-of-00018.gguf:fa63f72c56f6fc898fa6fdc943489e463460d2d56b9c9ac3341efc27490af674:blk.24.ffn_gate.weight:shape=(2048, 6144):dtype=q6_k_r4:elements=12582912:bytes=10321920
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00008-of-00018.gguf:d7d5bbda90e334de0336b6aed891ad16f9527523cc3b0df9c462ede543fd7123:blk.24.ffn_up.weight:shape=(2048, 6144):dtype=q6_k_r4:elements=12582912:bytes=10321920
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00009-of-00018.gguf:b54e6a9050294dcef4e9a8c0cc0d7b158c0506ef935729880138eeb9ab20f639:blk.24.post_attention_norm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00010-of-00018.gguf:ca0e000cdd9fbbc9b4d2ba5e1a3dafc0cdc3578f98c70e248e13ecc08e45f5ea:blk.24.attn_k_norm.weight:shape=(256,):dtype=f32:elements=256:bytes=1024
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00011-of-00018.gguf:6c81999d635df08e7b7c09e40a88a86553313f87a38f9bab4cfa863b02e33313:blk.24.attn_k.weight:shape=(2048, 512):dtype=q6_k_r4:elements=1048576:bytes=860160
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00012-of-00018.gguf:9cd5a41a7756ae4113c7367f575c83bfacf564da52a30c0e8d48e0a486187eb7:blk.24.attn_output.weight:shape=(2048, 2048):dtype=q6_k_r4:elements=4194304:bytes=3440640
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00013-of-00018.gguf:0aa49a1498dd5ea2f90b19dc98d967c5a678b8bd7d1b4b3002eb49b0c2fd930e:blk.24.attn_q_norm.weight:shape=(256,):dtype=f32:elements=256:bytes=1024
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00014-of-00018.gguf:f97669edac03a47c7c7c3fa0a08179bb2829c7dd86b4733e57baed65390b7273:blk.24.attn_q.weight:shape=(2048, 4096):dtype=q6_k_r4:elements=8388608:bytes=6881280
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00015-of-00018.gguf:e928e3d94196bd76472bbc00e96cae9afeba4defcfe95625ea1cd66d9aad9809:blk.24.attn_v.weight:shape=(2048, 512):dtype=q6_k_r4:elements=1048576:bytes=860160
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00016-of-00018.gguf:db9403bcfd37df35f3845bd5fbc7f15b259506768b2ec3a69ed0e12fa1ed4ac0:blk.24.nextn.shared_head_norm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00017-of-00018.gguf:1a97a5511789b080a55ada74138de8152103d76c5d0d85229852c276e1d83bd5:blk.24.nextn.enorm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192
	mtp-Qwen3.5-2B-THIREUS-Q6_K_R4-SPECIAL_TENSOR-00018-of-00018.gguf:f5b039629679fb9d7b9a4f018c48a3040a3293833da470a8baa0c3ffd33a5b3f:blk.24.nextn.hnorm.weight:shape=(2048,):dtype=f32:elements=2048:bytes=8192