OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-GGUF
I do this work independently and release it for free. Donations are welcome and go toward compute for more and larger abliterations.
Bitcoin: bc1qsvfduzj9fjs9fugpc52yver3f2g8fp7xjxecdv
Community discussion: https://discord.gg/rhUZY5GEZr
Overview
This repository contains APEX GGUF quantizations of OpenYourMind/OpenYourMind-Qwen3.6-35B-A3B-kuato-DPO-abliterated-uncensored.
The source model is an abliterated and DPO-retrained version of Qwen/Qwen3.6-35B-A3B. After ablation and DPO, the original Qwen3.6 vision layers were re-added to retain multimodal functionality. These GGUF files keep that vision support through the included multimodal projector.
Five APEX tiers are included:
- APEX Quality: non-imatrix APEX Quality quantization.
- APEX I-Quality: same tensor policy as Quality, quantized with a diverse imatrix calibration set.
- APEX I-Balanced: two-tier Q6_K/Q5_K expert gradient with imatrix.
- APEX I-Compact: Q4_K/Q3_K compact expert layout with imatrix.
- APEX Mini: smallest included APEX tier, using IQ2_S middle experts with imatrix.
The filenames follow the upstream APEX naming style used by mudler/apex-quant.
Model Details
| Attribute | Value |
|---|---|
| Base model | OpenYourMind/OpenYourMind-Qwen3.6-35B-A3B-kuato-DPO-abliterated-uncensored |
| Original base | Qwen/Qwen3.6-35B-A3B |
| Method | Refusal ablation plus DPO retraining, then GGUF quantization |
| Quantization | APEX Quality, APEX I-Quality, APEX I-Balanced, APEX I-Compact, APEX Mini |
| Format | GGUF |
| Runtime | llama.cpp |
| Architecture | Qwen3.6 MoE vision-language model |
| Vision support | Yes, through the included BF16 multimodal projector |
| Projector | mmproj-OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-BF16.gguf |
Files
| File | Description |
|---|---|
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.gguf | APEX I-Quality GGUF quant with imatrix metadata |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-Quality.gguf | APEX Quality GGUF quant without imatrix |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Balanced.gguf | APEX I-Balanced GGUF quant with imatrix metadata |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Compact.gguf | APEX I-Compact GGUF quant with imatrix metadata |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-Mini.gguf | APEX Mini GGUF quant with imatrix metadata |
| mmproj-OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-BF16.gguf | BF16 vision projector required for multimodal/image inputs |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.tensor-types.txt | Tensor-type policy used for the I-Quality quant |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-Quality.tensor-types.txt | Tensor-type policy used for the Quality quant |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Balanced.tensor-types.txt | Tensor-type policy used for the I-Balanced quant |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Compact.tensor-types.txt | Tensor-type policy used for the I-Compact quant |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-Mini.tensor-types.txt | Tensor-type policy used for the Mini quant |
The FP16 safetensors source model is published separately:
https://huggingface.co/OpenYourMind/OpenYourMind-Qwen3.6-35B-A3B-kuato-DPO-abliterated-uncensored
The earlier non-APEX GGUF release is published separately:
llama.cpp
Download the I-Quality quant:
huggingface-cli download OpenYourMind/OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-GGUF \
OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.gguf \
mmproj-OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-BF16.gguf \
--local-dir ./model
Swap the GGUF filename for APEX-I-Balanced, APEX-I-Compact, APEX-Mini, or APEX-Quality if you want a different size/quality tradeoff. The same mmproj file is used for all tiers.
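Since the tier filenames all follow one pattern, the swap can be sketched in Python. This is an illustrative helper and not part of the release: `gguf_filename` and `download_command` are hypothetical names, and the filename pattern is inferred from the Files table above.

```python
import shlex

REPO_ID = "OpenYourMind/OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-GGUF"
PREFIX = "OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored"
# The same projector file is shared by every tier.
MMPROJ = f"mmproj-{PREFIX}-BF16.gguf"
TIERS = ("Quality", "I-Quality", "I-Balanced", "I-Compact", "Mini")


def gguf_filename(tier: str) -> str:
    """Return the GGUF filename for one of the five APEX tiers (hypothetical helper)."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier {tier!r}; expected one of {TIERS}")
    return f"{PREFIX}-APEX-{tier}.gguf"


def download_command(tier: str, local_dir: str = "./model") -> str:
    """Build the huggingface-cli command that fetches a tier plus the projector."""
    args = [
        "huggingface-cli", "download", REPO_ID,
        gguf_filename(tier), MMPROJ,
        "--local-dir", local_dir,
    ]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    print(download_command("I-Balanced"))
```

Printing the command rather than running it keeps the sketch free of side effects; paste the output into a shell to start the download.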
Run server mode:
llama-server \
-m ./model/OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.gguf \
--mmproj ./model/mmproj-OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-BF16.gguf \
--host 0.0.0.0 \
--port 8080 \
-ngl all \
--jinja
Run interactive text chat:
llama-cli \
-m ./model/OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.gguf \
--conversation \
-ngl all
For image input, load the mmproj file together with the GGUF model by passing it with --mmproj, as the server command above does.
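Once the server is running with the projector loaded, an image can be sent through its OpenAI-compatible chat endpoint. The following is a minimal sketch using only the Python standard library. It assumes the server listens on localhost port 8080 as in the command above and that the image is embedded as a base64 data URL; `build_image_request` and `send` are hypothetical helper names.

```python
import base64
import json
import urllib.request


def build_image_request(image_bytes: bytes, prompt: str, mime: str = "image/jpeg") -> dict:
    """Build a chat-completions payload with one text part and one image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    # The image travels inline as a data URL, so no external hosting is needed.
                    {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
                ],
            }
        ]
    }


def send(payload: dict, base_url: str = "http://localhost:8080") -> str:
    """POST the payload to the local server (assumed address) and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

To use it, read an image file in binary mode, pass its bytes and a prompt such as "Describe this image in one sentence." to `build_image_request`, and hand the result to `send`.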
Quantization Notes
All GGUFs use APEX tensor layouts adapted conservatively for this MoE vision architecture:
- APEX Quality and APEX I-Quality: Q6_K edge routed experts, Q5_K near-edge routed experts, IQ4_XS middle routed experts, Q8_0 shared experts, Q6_K attention/SSM projections.
- APEX I-Balanced: Q6_K edge routed experts, Q5_K middle routed experts, Q8_0 shared experts, Q6_K attention/SSM projections.
- APEX I-Compact: Q4_K edge routed experts, Q3_K middle routed experts, Q6_K shared experts, Q4_K attention/SSM projections.
- APEX Mini: Q3_K edge routed experts, IQ2_S middle routed experts, Q5_K/Q4_K shared experts, Q4_K/Q3_K attention/SSM projections.
- Router, norms, embeddings, output, state tensors, conv tensors, and bias tensors: preserved as BF16/F32 where appropriate.
- Vision projector: kept separate as BF16.
The I-Quality, I-Balanced, I-Compact, and Mini files were quantized with an imatrix generated from a Hugging Face calibration corpus with chat, code, multilingual, terminal, and agentic examples.
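The edge/middle split used by the I-Balanced layout can be pictured with a small sketch. It assumes that "edge" means the first and last few transformer blocks, which this card does not define. The function name, `n_layers`, and `edge` are hypothetical illustration parameters, and the published tensor-types.txt files remain the authoritative record of what each tensor actually uses.

```python
def balanced_expert_type(layer: int, n_layers: int, edge: int) -> str:
    """Pick a routed-expert quant type for one block under an assumed edge/middle split.

    Blocks within `edge` of either end get Q6_K; everything in between gets Q5_K.
    The boundary width is an assumption for illustration, not the released policy.
    """
    if not 0 <= layer < n_layers:
        raise ValueError(f"layer {layer} is outside 0..{n_layers - 1}")
    is_edge = layer < edge or layer >= n_layers - edge
    return "Q6_K" if is_edge else "Q5_K"


if __name__ == "__main__":
    # Toy model with 8 blocks and an edge width of 2, chosen only for readability.
    for layer in range(8):
        print(layer, balanced_expert_type(layer, n_layers=8, edge=2))
```

The other tiers follow the same idea with different type pairs, and Quality and I-Quality add a third near-edge band between the two.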
Notes
Use is the responsibility of the user. Make sure your usage complies with applicable laws, platform rules, and deployment requirements.