Qwen3 4B Thinking 2507 x Gemini 2.5 Flash
This model was trained on a large Gemini 2.5 Flash dataset.
The goal was to distill the behavior, reasoning traces, output style, and (most importantly) knowledge of Gemini 2.5 Flash.
🤖 Related Models:
| Model | Effective parameters | Active parameters |
|---|---|---|
| TeichAI/Qwen3-30B-A3B-Thinking-2507-Gemini-2.5-Flash-Distill-GGUF | 30 B | 3 B |
| TeichAI/Qwen3-8B-Gemini-2.5-Flash-Distill-GGUF | 8 B | 8 B |
🧬 Datasets:
TeichAI/gemini-2.5-flash-11000x
🏗 Base Model:
unsloth/Qwen3-4B-Thinking-2507
⚡ Use cases:
- Coding
- Science
- Legal
- History
- Marketing
- General Purpose
∑ Stats (Dataset)
- Cost: $134 (USD)
- Total tokens (input + output): 54.4 M
Benchmark Results
Model Comparison vs Base
- Base model: unsloth/Qwen3-4B-Thinking-2507
| Compare Model | Benchmark | Base Score | Model Score | Delta | Relative Delta (Delta / Base Score) |
|---|---|---|---|---|---|
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | arc_challenge | 0.486348 | 0.511945 | 0.0255973 | 0.0526316 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | gpqa_diamond_zeroshot | 0.30303 | 0.353535 | 0.0505051 | 0.166667 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | hellaswag | 0.479785 | 0.504382 | 0.0245967 | 0.0512661 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | mmlu | 0.65532 | 0.661587 | 0.00626691 | 0.00956314 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | truthfulqa_mc2 | 0.555747 | 0.552899 | -0.00284708 | -0.00512299 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | winogrande | 0.64562 | 0.65588 | 0.0102605 | 0.0158924 |
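The last two columns of this table can be recomputed from the Base Score and Model Score columns. The following is a minimal sketch, assuming Delta is the model score minus the base score and the final column is that delta divided by the base score (a fraction, not a percentage). The helper name `compare` is illustrative and not part of any evaluation tool.

```python
# Scores copied from the comparison table: benchmark -> (base, model).
SCORES = {
    "arc_challenge": (0.486348, 0.511945),
    "gpqa_diamond_zeroshot": (0.30303, 0.353535),
    "hellaswag": (0.479785, 0.504382),
    "mmlu": (0.65532, 0.661587),
    "truthfulqa_mc2": (0.555747, 0.552899),
    "winogrande": (0.64562, 0.65588),
}


def compare(base: float, model: float) -> tuple[float, float]:
    """Return (absolute delta, delta as a fraction of the base score)."""
    delta = model - base
    return delta, delta / base


for name, (base, model) in SCORES.items():
    delta, relative = compare(base, model)
    # The :+.2% format renders the fraction as a signed percentage.
    print(f"{name:24s} delta={delta:+.6f} relative={relative:+.2%}")
```

Read this way, the 0.166667 in the gpqa_diamond_zeroshot row corresponds to a relative gain of about 16.7% over the base model.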
Aggregate Comparison
| Compare Model | Benchmarks Compared | Wins vs Base | Ties vs Base | Losses vs Base | Avg Delta |
|---|---|---|---|---|---|
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | 6 | 5 | 0 | 1 | 0.0190632 |
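The aggregate row appears to be a simple tally over the six per-benchmark deltas. A minimal sketch under that assumption, where a win is any positive delta, a loss is any negative delta, and Avg Delta is the unweighted mean across benchmarks (the function name `aggregate` is hypothetical):

```python
# Per-benchmark deltas (model score minus base score), copied from the comparison table.
DELTAS = {
    "arc_challenge": 0.0255973,
    "gpqa_diamond_zeroshot": 0.0505051,
    "hellaswag": 0.0245967,
    "mmlu": 0.00626691,
    "truthfulqa_mc2": -0.00284708,
    "winogrande": 0.0102605,
}


def aggregate(deltas: dict[str, float]) -> dict[str, float]:
    """Count wins, ties, and losses, and average the deltas without weighting."""
    values = list(deltas.values())
    wins = sum(1 for d in values if d > 0)
    losses = sum(1 for d in values if d < 0)
    return {
        "benchmarks": len(values),
        "wins": wins,
        "ties": len(values) - wins - losses,
        "losses": losses,
        "avg_delta": sum(values) / len(values),
    }


print(aggregate(DELTAS))
```

Because the mean is unweighted, a small benchmark such as gpqa_diamond_zeroshot (198 questions) counts as much as mmlu (14042 questions).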
Detailed Results
| Model | Benchmark | Score | Total Questions | Total Correct |
|---|---|---|---|---|
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | arc_challenge | 0.511945 | 1172 | 600 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | gpqa_diamond_zeroshot | 0.353535 | 198 | 70 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | hellaswag | 0.504382 | 10042 | 5065 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | mmlu | 0.661587 | 14042 | 9290 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | truthfulqa_mc2 | 0.552899 | 817 | 451 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | winogrande | 0.65588 | 1267 | 831 |
| unsloth/Qwen3-4B-Thinking-2507 | arc_challenge | 0.486348 | 1172 | 570 |
| unsloth/Qwen3-4B-Thinking-2507 | gpqa_diamond_zeroshot | 0.30303 | 198 | 60 |
| unsloth/Qwen3-4B-Thinking-2507 | hellaswag | 0.479785 | 10042 | 4818 |
| unsloth/Qwen3-4B-Thinking-2507 | mmlu | 0.65532 | 14042 | 9202 |
| unsloth/Qwen3-4B-Thinking-2507 | truthfulqa_mc2 | 0.555747 | 817 | 454 |
| unsloth/Qwen3-4B-Thinking-2507 | winogrande | 0.64562 | 1267 | 818 |
MMLU Subject Breakdown
| Model | Subject | Benchmark | Score | Total Questions | Total Correct |
|---|---|---|---|---|---|
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | formal_logic | mmlu_formal_logic | 0.603175 | 126 | 76 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_european_history | mmlu_high_school_european_history | 0.727273 | 165 | 120 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_us_history | mmlu_high_school_us_history | 0.833333 | 204 | 170 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_world_history | mmlu_high_school_world_history | 0.801688 | 237 | 190 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | international_law | mmlu_international_law | 0.752066 | 121 | 91 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | jurisprudence | mmlu_jurisprudence | 0.777778 | 108 | 84 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | logical_fallacies | mmlu_logical_fallacies | 0.797546 | 163 | 130 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | moral_disputes | mmlu_moral_disputes | 0.67341 | 346 | 233 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | moral_scenarios | mmlu_moral_scenarios | 0.340782 | 895 | 305 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | philosophy | mmlu_philosophy | 0.694534 | 311 | 216 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | prehistory | mmlu_prehistory | 0.719136 | 324 | 233 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | professional_law | mmlu_professional_law | 0.45176 | 1534 | 693 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | world_religions | mmlu_world_religions | 0.754386 | 171 | 129 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | business_ethics | mmlu_business_ethics | 0.71 | 100 | 71 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | clinical_knowledge | mmlu_clinical_knowledge | 0.70566 | 265 | 187 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | college_medicine | mmlu_college_medicine | 0.693642 | 173 | 120 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | global_facts | mmlu_global_facts | 0.44 | 100 | 44 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | human_aging | mmlu_human_aging | 0.695067 | 223 | 155 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | management | mmlu_management | 0.854369 | 103 | 88 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | marketing | mmlu_marketing | 0.846154 | 234 | 198 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | medical_genetics | mmlu_medical_genetics | 0.77 | 100 | 77 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | miscellaneous | mmlu_miscellaneous | 0.779055 | 783 | 610 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | nutrition | mmlu_nutrition | 0.718954 | 306 | 220 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | professional_accounting | mmlu_professional_accounting | 0.528369 | 282 | 149 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | professional_medicine | mmlu_professional_medicine | 0.672794 | 272 | 183 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | virology | mmlu_virology | 0.524096 | 166 | 87 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | econometrics | mmlu_econometrics | 0.605263 | 114 | 69 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_geography | mmlu_high_school_geography | 0.823232 | 198 | 163 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_government_and_politics | mmlu_high_school_government_and_politics | 0.865285 | 193 | 167 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_macroeconomics | mmlu_high_school_macroeconomics | 0.720513 | 390 | 281 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_microeconomics | mmlu_high_school_microeconomics | 0.802521 | 238 | 191 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_psychology | mmlu_high_school_psychology | 0.86789 | 545 | 473 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | human_sexuality | mmlu_human_sexuality | 0.717557 | 131 | 94 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | professional_psychology | mmlu_professional_psychology | 0.658497 | 612 | 403 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | public_relations | mmlu_public_relations | 0.690909 | 110 | 76 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | security_studies | mmlu_security_studies | 0.726531 | 245 | 178 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | sociology | mmlu_sociology | 0.80597 | 201 | 162 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | us_foreign_policy | mmlu_us_foreign_policy | 0.82 | 100 | 82 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | abstract_algebra | mmlu_abstract_algebra | 0.57 | 100 | 57 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | anatomy | mmlu_anatomy | 0.6 | 135 | 81 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | astronomy | mmlu_astronomy | 0.809211 | 152 | 123 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | college_biology | mmlu_college_biology | 0.8125 | 144 | 117 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | college_chemistry | mmlu_college_chemistry | 0.51 | 100 | 51 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | college_computer_science | mmlu_college_computer_science | 0.57 | 100 | 57 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | college_mathematics | mmlu_college_mathematics | 0.53 | 100 | 53 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | college_physics | mmlu_college_physics | 0.509804 | 102 | 52 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | computer_security | mmlu_computer_security | 0.78 | 100 | 78 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | conceptual_physics | mmlu_conceptual_physics | 0.753191 | 235 | 177 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | electrical_engineering | mmlu_electrical_engineering | 0.724138 | 145 | 105 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | elementary_mathematics | mmlu_elementary_mathematics | 0.642857 | 378 | 243 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_biology | mmlu_high_school_biology | 0.835484 | 310 | 259 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_chemistry | mmlu_high_school_chemistry | 0.669951 | 203 | 136 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_computer_science | mmlu_high_school_computer_science | 0.82 | 100 | 82 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_mathematics | mmlu_high_school_mathematics | 0.477778 | 270 | 129 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_physics | mmlu_high_school_physics | 0.589404 | 151 | 89 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | high_school_statistics | mmlu_high_school_statistics | 0.703704 | 216 | 152 |
| TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill | machine_learning | mmlu_machine_learning | 0.455357 | 112 | 51 |
| unsloth/Qwen3-4B-Thinking-2507 | formal_logic | mmlu_formal_logic | 0.595238 | 126 | 75 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_european_history | mmlu_high_school_european_history | 0.727273 | 165 | 120 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_us_history | mmlu_high_school_us_history | 0.818627 | 204 | 167 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_world_history | mmlu_high_school_world_history | 0.797468 | 237 | 189 |
| unsloth/Qwen3-4B-Thinking-2507 | international_law | mmlu_international_law | 0.727273 | 121 | 88 |
| unsloth/Qwen3-4B-Thinking-2507 | jurisprudence | mmlu_jurisprudence | 0.777778 | 108 | 84 |
| unsloth/Qwen3-4B-Thinking-2507 | logical_fallacies | mmlu_logical_fallacies | 0.754601 | 163 | 123 |
| unsloth/Qwen3-4B-Thinking-2507 | moral_disputes | mmlu_moral_disputes | 0.67341 | 346 | 233 |
| unsloth/Qwen3-4B-Thinking-2507 | moral_scenarios | mmlu_moral_scenarios | 0.372067 | 895 | 333 |
| unsloth/Qwen3-4B-Thinking-2507 | philosophy | mmlu_philosophy | 0.672026 | 311 | 208 |
| unsloth/Qwen3-4B-Thinking-2507 | prehistory | mmlu_prehistory | 0.70679 | 324 | 229 |
| unsloth/Qwen3-4B-Thinking-2507 | professional_law | mmlu_professional_law | 0.441982 | 1534 | 678 |
| unsloth/Qwen3-4B-Thinking-2507 | world_religions | mmlu_world_religions | 0.777778 | 171 | 133 |
| unsloth/Qwen3-4B-Thinking-2507 | business_ethics | mmlu_business_ethics | 0.63 | 100 | 63 |
| unsloth/Qwen3-4B-Thinking-2507 | clinical_knowledge | mmlu_clinical_knowledge | 0.716981 | 265 | 190 |
| unsloth/Qwen3-4B-Thinking-2507 | college_medicine | mmlu_college_medicine | 0.693642 | 173 | 120 |
| unsloth/Qwen3-4B-Thinking-2507 | global_facts | mmlu_global_facts | 0.36 | 100 | 36 |
| unsloth/Qwen3-4B-Thinking-2507 | human_aging | mmlu_human_aging | 0.713004 | 223 | 159 |
| unsloth/Qwen3-4B-Thinking-2507 | management | mmlu_management | 0.864078 | 103 | 89 |
| unsloth/Qwen3-4B-Thinking-2507 | marketing | mmlu_marketing | 0.854701 | 234 | 200 |
| unsloth/Qwen3-4B-Thinking-2507 | medical_genetics | mmlu_medical_genetics | 0.8 | 100 | 80 |
| unsloth/Qwen3-4B-Thinking-2507 | miscellaneous | mmlu_miscellaneous | 0.776501 | 783 | 608 |
| unsloth/Qwen3-4B-Thinking-2507 | nutrition | mmlu_nutrition | 0.712418 | 306 | 218 |
| unsloth/Qwen3-4B-Thinking-2507 | professional_accounting | mmlu_professional_accounting | 0.549645 | 282 | 155 |
| unsloth/Qwen3-4B-Thinking-2507 | professional_medicine | mmlu_professional_medicine | 0.669118 | 272 | 182 |
| unsloth/Qwen3-4B-Thinking-2507 | virology | mmlu_virology | 0.518072 | 166 | 86 |
| unsloth/Qwen3-4B-Thinking-2507 | econometrics | mmlu_econometrics | 0.614035 | 114 | 70 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_geography | mmlu_high_school_geography | 0.777778 | 198 | 154 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_government_and_politics | mmlu_high_school_government_and_politics | 0.829016 | 193 | 160 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_macroeconomics | mmlu_high_school_macroeconomics | 0.720513 | 390 | 281 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_microeconomics | mmlu_high_school_microeconomics | 0.802521 | 238 | 191 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_psychology | mmlu_high_school_psychology | 0.851376 | 545 | 464 |
| unsloth/Qwen3-4B-Thinking-2507 | human_sexuality | mmlu_human_sexuality | 0.740458 | 131 | 97 |
| unsloth/Qwen3-4B-Thinking-2507 | professional_psychology | mmlu_professional_psychology | 0.669935 | 612 | 409 |
| unsloth/Qwen3-4B-Thinking-2507 | public_relations | mmlu_public_relations | 0.672727 | 110 | 74 |
| unsloth/Qwen3-4B-Thinking-2507 | security_studies | mmlu_security_studies | 0.706122 | 245 | 173 |
| unsloth/Qwen3-4B-Thinking-2507 | sociology | mmlu_sociology | 0.800995 | 201 | 161 |
| unsloth/Qwen3-4B-Thinking-2507 | us_foreign_policy | mmlu_us_foreign_policy | 0.82 | 100 | 82 |
| unsloth/Qwen3-4B-Thinking-2507 | abstract_algebra | mmlu_abstract_algebra | 0.55 | 100 | 55 |
| unsloth/Qwen3-4B-Thinking-2507 | anatomy | mmlu_anatomy | 0.644444 | 135 | 87 |
| unsloth/Qwen3-4B-Thinking-2507 | astronomy | mmlu_astronomy | 0.782895 | 152 | 119 |
| unsloth/Qwen3-4B-Thinking-2507 | college_biology | mmlu_college_biology | 0.763889 | 144 | 110 |
| unsloth/Qwen3-4B-Thinking-2507 | college_chemistry | mmlu_college_chemistry | 0.55 | 100 | 55 |
| unsloth/Qwen3-4B-Thinking-2507 | college_computer_science | mmlu_college_computer_science | 0.62 | 100 | 62 |
| unsloth/Qwen3-4B-Thinking-2507 | college_mathematics | mmlu_college_mathematics | 0.44 | 100 | 44 |
| unsloth/Qwen3-4B-Thinking-2507 | college_physics | mmlu_college_physics | 0.480392 | 102 | 49 |
| unsloth/Qwen3-4B-Thinking-2507 | computer_security | mmlu_computer_security | 0.73 | 100 | 73 |
| unsloth/Qwen3-4B-Thinking-2507 | conceptual_physics | mmlu_conceptual_physics | 0.723404 | 235 | 170 |
| unsloth/Qwen3-4B-Thinking-2507 | electrical_engineering | mmlu_electrical_engineering | 0.751724 | 145 | 108 |
| unsloth/Qwen3-4B-Thinking-2507 | elementary_mathematics | mmlu_elementary_mathematics | 0.645503 | 378 | 244 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_biology | mmlu_high_school_biology | 0.812903 | 310 | 252 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_chemistry | mmlu_high_school_chemistry | 0.660099 | 203 | 134 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_computer_science | mmlu_high_school_computer_science | 0.81 | 100 | 81 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_mathematics | mmlu_high_school_mathematics | 0.440741 | 270 | 119 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_physics | mmlu_high_school_physics | 0.569536 | 151 | 86 |
| unsloth/Qwen3-4B-Thinking-2507 | high_school_statistics | mmlu_high_school_statistics | 0.652778 | 216 | 141 |
| unsloth/Qwen3-4B-Thinking-2507 | machine_learning | mmlu_machine_learning | 0.428571 | 112 | 48 |
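Because the breakdown lists the two models in separate blocks, per-subject differences are easier to see once rows are paired by subject. A minimal sketch using a handful of rows copied from the table; the choice of subjects and the helper name `rank_by_delta` are illustrative, and the ranking applies only to this subset.

```python
# subject -> (distilled model score, base model score), copied from the breakdown table.
ROWS = {
    "college_mathematics": (0.53, 0.44),
    "business_ethics": (0.71, 0.63),
    "global_facts": (0.44, 0.36),
    "anatomy": (0.6, 0.644444),
    "college_computer_science": (0.57, 0.62),
    "moral_scenarios": (0.340782, 0.372067),
}


def rank_by_delta(rows: dict[str, tuple[float, float]]) -> list[tuple[str, float]]:
    """Return (subject, delta) pairs sorted from the largest gain to the largest drop."""
    deltas = [(subject, distilled - base) for subject, (distilled, base) in rows.items()]
    return sorted(deltas, key=lambda pair: pair[1], reverse=True)


for subject, delta in rank_by_delta(ROWS):
    print(f"{subject:28s} {delta:+.4f}")
```

Extending `ROWS` to all 57 subjects would give the full picture; several subjects have only 100 questions, so a shift of a few points there reflects a handful of answers.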
Configuration
- Quantization: 4bit
- Temperature: 0.6
- Top P: 0.95
- Top K: 20
- Repetition Penalty: 1.1
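To reuse these sampling settings outside the benchmark run, they can be mapped onto llama.cpp's command-line sampling flags. A minimal sketch, assuming a recent llama.cpp build in which `llama-cli` accepts `-hf`, `--temp`, `--top-p`, `--top-k`, and `--repeat-penalty`. The `Q4_K_M` tag and the helper name `build_llama_cli_args` are assumptions for illustration; the 4-bit quantization is selected by the tag, not by a flag.

```python
import shlex

# Sampling settings from the Configuration list, keyed by llama-cli flag name.
SAMPLING = {"temp": 0.6, "top-p": 0.95, "top-k": 20, "repeat-penalty": 1.1}

# Assumed repo:quant tag; adjust to whichever quantization you actually use.
MODEL = "TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill-GGUF:Q4_K_M"


def build_llama_cli_args(prompt: str) -> list[str]:
    """Build an argument list for llama-cli using the settings above."""
    args = ["llama-cli", "-hf", MODEL]
    for flag, value in SAMPLING.items():
        args += [f"--{flag}", str(value)]
    args += ["-p", prompt]
    return args


# Print a shell-quoted command line; the list can also be passed to subprocess.run.
print(shlex.join(build_llama_cli_args("why is the sky blue?")))
```

Building the argument list in one place keeps the sampling values together, which makes it easier to compare local outputs against the benchmark configuration.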
This model was finetuned and converted to GGUF format using Unsloth.
Example usage:
- For text-only LLMs: llama-cli -hf repo_id/model_name -p "why is the sky blue?"
- For multimodal models: llama-mtmd-cli -m model_name.gguf --mmproj mmproj_file.gguf
Ollama
An Ollama Modelfile is included for easy deployment.
Model tree for TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill-GGUF
Base model
Qwen/Qwen3-4B-Thinking-2507
