VibeThinker-3B GGUF

GGUF quantizations of WeiboAI/VibeThinker-3B, a Qwen2-based 3B parameter thinking model with 131K context.

Converted with llama.cpp convert_hf_to_gguf.py.

Available Quantizations

| File | Size | BPW | Description |
|------|------|-----|-------------|
| VibeThinker-3B-F16.gguf | 5.8 GB | 16.00 | Full FP16 (reference) |
| VibeThinker-3B-Q8_0.gguf | 3.1 GB | 8.50 | Near-lossless 8-bit |
| VibeThinker-3B-Q5_K_M.gguf | 2.1 GB | 5.75 | High quality 5-bit |
| VibeThinker-3B-Q4_K_M.gguf | 1.8 GB | 4.99 | Great size/quality tradeoff |
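
The sizes listed above track the BPW (bits per weight) column closely: each quantized file is roughly the F16 size scaled by BPW/16. The following is a minimal sketch of that arithmetic, assuming file metadata overhead is negligible. `estimate_size_gb` is a hypothetical helper written for illustration, not a llama.cpp tool.

```shell
# Hypothetical helper: estimate a quantized file size from the F16 size.
# $1 = F16 file size in GB, $2 = bits per weight of the target quantization
estimate_size_gb() {
  awk -v f16="$1" -v bpw="$2" 'BEGIN { printf "%.1f\n", f16 * bpw / 16 }'
}

estimate_size_gb 5.8 8.50   # Q8_0
estimate_size_gb 5.8 5.75   # Q5_K_M
estimate_size_gb 5.8 4.99   # Q4_K_M
```

The three estimates can be compared against the Size column to sanity-check a download.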

Usage

llama.cpp

./llama-cli -m VibeThinker-3B-Q4_K_M.gguf -p "Hello!" -n 128

Chat Format

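This model uses the Qwen2 (ChatML) chat format. The assistant turn opens with a <think>...</think> block containing the reasoning, followed by the final response: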

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
<think>...reasoning...</think>
...response...
<|im_end|>

Model Details

  • Architecture: Qwen2ForCausalLM
  • Parameters: ~3B
  • Layers: 36
  • Hidden size: 2048
  • Attention heads: 16 (2 KV heads, grouped-query attention)
  • Context: 131,072 tokens
  • Vocab: 151,936
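
The figures above are enough for a rough estimate of KV cache memory at full context. The sketch below assumes the head dimension equals hidden size divided by attention heads (2048 / 16 = 128) and that the cache is stored in FP16 (2 bytes per element). Neither is stated in this card, and actual usage depends on runtime settings. `kv_cache_bytes` is a hypothetical helper.

```shell
# Hypothetical helper: estimate KV cache size in bytes.
# $1 = layers, $2 = hidden size, $3 = attention heads, $4 = KV heads,
# $5 = context length in tokens, $6 = bytes per cached element (2 for FP16)
kv_cache_bytes() {
  head_dim=$(( $2 / $3 ))
  # Factor of 2 covers the separate key and value tensors in each layer.
  echo $(( 2 * $1 * $4 * head_dim * $5 * $6 ))
}

kv_cache_bytes 36 2048 16 2 131072 2
```

Under these assumptions the full 131,072-token context needs 4,831,838,208 bytes (4.5 GiB) of cache on top of the model weights, so a smaller context size is worth considering on memory-constrained machines.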