🔓 Qwen2.5-0.5B-Unfettered
High-Precision Unalignment, Built Explicitly for Low-End Hardware
⚠️ Disclaimer: This model is designed for research, red teaming, and educational purposes. It has no safety filters. Use responsibly.
🚀 Overview
Qwen2.5-0.5B-Unfettered is a surgical unalignment of the Qwen2.5-0.5B-Instruct model, optimized for low-end hardware, mobile devices, and CPU-only systems.
This model is intended for users who want an unrestricted model but lack the high-end GPUs that larger unfettered models normally require. It runs comfortably on devices with as little as 1 GB of RAM.
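As a rough sanity check on the 1 GB figure, the memory taken by the weights alone can be estimated from the parameter count and the number of bits stored per weight. The sketch below assumes a nominal 0.5 billion parameters (taken from the model name) and ignores the KV cache and runtime overhead. The bit widths are illustrative, since this card does not state which quantization the published GGUF uses.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Nominal parameter count taken from the model name (an assumption).
N_PARAMS = 0.5e9

# Illustrative precisions; the card does not say which one the GGUF uses.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gib(N_PARAMS, bits):.2f} GiB")

# 16-bit: 0.93 GiB
# 8-bit: 0.47 GiB
# 4-bit: 0.23 GiB
```

Under these assumptions, a 16-bit copy of the weights already sits close to 1 GiB, so a quantized file leaves more headroom on a 1 GB device.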
💻 Why This Model?
- Low-End Optimized: Runs quickly on standard laptops (even without a GPU) and on mobile devices.
- Zero Refusal: Mathematically stripped of censorship via Phase 7 Aggressive Repulsion Orthogonalization.
- Small but Capable: 0.5B parameters allow high-speed inference while retaining instruction-following capability.
🔧 Usage
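Ollama (Recommended)
ollama run josephmayo/Qwen2.5-0.5B-Unfettered
To pull the GGUF directly from Hugging Face instead, prefix the repository with hf.co/:
ollama run hf.co/josephmayo/Qwen2.5-0.5B-Unfettered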
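Once Ollama has pulled the model, it can also be queried from a script through Ollama's local HTTP API. The following is a minimal sketch, not an official client. It assumes Ollama is listening on its default address (http://localhost:11434) and that `MODEL_NAME` matches the name passed to `ollama run`. The helper names `build_chat_request` and `chat` are hypothetical.

```python
import json
import urllib.request

# Assumptions: Ollama's default local endpoint, and a model name that matches
# whatever was passed to `ollama run`.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL_NAME = "josephmayo/Qwen2.5-0.5B-Unfettered"


def build_chat_request(prompt: str, model: str = MODEL_NAME) -> dict:
    """Build the JSON body for a single-turn, non-streaming chat request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one complete JSON reply instead of chunks
    }


def chat(prompt: str, url: str = OLLAMA_CHAT_URL) -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` prints the model's answer. If the model was pulled with the `hf.co/` prefix, set `MODEL_NAME` to that full name.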
LM Studio / GGUF
Download the .gguf file (unfettered-qwen2.5-0.5b.gguf) from the Files tab and load it into LM Studio.
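The same GGUF file can also be loaded from Python with llama-cpp-python. This is a minimal sketch, assuming `llama-cpp-python` and `huggingface_hub` are installed and that the file in the Files tab is named `unfettered-qwen2.5-0.5b.gguf`. Adjust `GGUF_FILENAME` if the tab shows a different name. The helper names `build_messages` and `ask` are hypothetical.

```python
REPO_ID = "josephmayo/Qwen2.5-0.5B-Unfettered"
GGUF_FILENAME = "unfettered-qwen2.5-0.5b.gguf"  # assumed file name, see above


def build_messages(prompt: str) -> list:
    """Wrap a single user prompt in the chat message format."""
    return [{"role": "user", "content": prompt}]


def ask(prompt: str) -> str:
    """Download the GGUF if needed, run one chat turn, and return the reply."""
    # Imported here so this module loads even without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama.from_pretrained(repo_id=REPO_ID, filename=GGUF_FILENAME)
    result = llm.create_chat_completion(messages=build_messages(prompt))
    return result["choices"][0]["message"]["content"]
```

Calling `ask("What is the capital of France?")` downloads the file on first use and runs entirely on the CPU by default, which matches the low-end hardware this model targets.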
🧠 Model Details
- Base Model: Qwen2.5-0.5B-Instruct
- Ablation Method: Step-wise Orthogonalization (Phase 7 - 1.5x Repulsion)
- Primary Goal: Remove all "I cannot assist" and "As an AI language model" refusal patterns.
Note: this is a very small model, chosen because of my limited compute. The method still worked, but a 0.5B model is of course not as capable as larger models. Treat it as an experiment.