OMEGA V21 - Full Merged GGUF
Autonomous AI agent fine-tuned on OBLITERATED Gemma 4 E4B for Android/Termux deployment.
Files (5 quantization levels)
| File | Size | Use Case | Quality |
|---|---|---|---|
| omega-v21-F16.gguf | ~15 GB | Maximum quality, reference | Reference |
| omega-v21-Q8_0.gguf | ~8.0 GB | Highest practical quality | Excellent |
| omega-v21-Q6_K.gguf | ~6.2 GB | Balanced (recommended) | Very Good |
| omega-v21-Q5_K_M.gguf | ~5.7 GB | Lighter, good quality | Good |
| omega-v21-Q4_K_M.gguf | ~5.3 GB | Smallest, mobile-friendly | Acceptable |
Quick Start on Termux (Android)
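```bash
# Install llama.cpp from the Termux package repository
# (if wget is missing, install it the same way: pkg install wget)
pkg install llama-cpp

# Download the recommended Q6_K quantization
wget https://huggingface.co/Abdllahd/OMEGA-V21-Full-Merged-GGUF/resolve/main/omega-v21-Q6_K.gguf

# Start an interactive chat using the Gemma chat template
llama-cli \
  -m omega-v21-Q6_K.gguf \
  --ctx-size 4096 \
  --threads "$(nproc)" \
  -cnv \
  --chat-template gemma \
  --temp 0.3
```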
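To run the same quick start with a different quantization, the download URL and the `llama-cli` arguments can be generated rather than typed by hand. The Python sketch below mirrors the commands above. The helper names (`gguf_name`, `download_url`, `llama_cli_args`) are illustrative and not part of this repository, and the URL pattern is assumed from the single `wget` line shown.

```python
import os
import shlex

# URL prefix copied from the wget line in the quick start (assumed to hold for every quant).
BASE_URL = "https://huggingface.co/Abdllahd/OMEGA-V21-Full-Merged-GGUF/resolve/main"


def gguf_name(quant: str) -> str:
    """File name for a quantization level, following the Files table."""
    return f"omega-v21-{quant}.gguf"


def download_url(quant: str) -> str:
    """Direct download URL for the given quantization."""
    return f"{BASE_URL}/{gguf_name(quant)}"


def llama_cli_args(quant: str, ctx_size: int = 4096, temp: float = 0.3, threads=None) -> list:
    """Argument list equivalent to the llama-cli command in the quick start."""
    if threads is None:
        # Same intent as $(nproc): use every CPU the process can see.
        threads = os.cpu_count() or 1
    return [
        "llama-cli",
        "-m", gguf_name(quant),
        "--ctx-size", str(ctx_size),
        "--threads", str(threads),
        "-cnv",
        "--chat-template", "gemma",
        "--temp", str(temp),
    ]


if __name__ == "__main__":
    print(download_url("Q4_K_M"))
    print(shlex.join(llama_cli_args("Q4_K_M")))
```

The printed command line can be pasted into Termux as-is, or the argument list can be passed to `subprocess.run`.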
Features
- 0% refusal rate (OBLITERATED base)
- Bilingual: Arabic (70%) + English (30%)
- ReAct reasoning with 6-point think structure
- Real bash code (no placeholders)
- Termux/PRoot-aware
- Runs on Android devices with 7.6 GB RAM (Q4_K_M/Q5_K_M)
Hardware Requirements
| Quant | Min RAM | Speed on Android |
|---|---|---|
| F16 | 18 GB | Reference only |
| Q8_0 | 10 GB | ~2 t/s |
| Q6_K | 8 GB | ~3 t/s |
| Q5_K_M | 7 GB | ~3-4 t/s |
| Q4_K_M | 6 GB | ~4 t/s |
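The table above can be turned into a simple lookup that picks the largest quantization fitting a device. This is a minimal sketch: the thresholds are copied from the table, while the function name `pick_quant` and the "largest that fits" policy are assumptions for illustration.

```python
from typing import Optional

# Minimum RAM in GB per quantization, copied from the Hardware Requirements table.
MIN_RAM_GB = {
    "F16": 18,
    "Q8_0": 10,
    "Q6_K": 8,
    "Q5_K_M": 7,
    "Q4_K_M": 6,
}


def pick_quant(available_ram_gb: float) -> Optional[str]:
    """Return the quant with the highest RAM requirement that still fits, or None."""
    fitting = [(ram, quant) for quant, ram in MIN_RAM_GB.items() if ram <= available_ram_gb]
    if not fitting:
        return None
    # Tuples sort by RAM first, so max() yields the most demanding quant that fits.
    return max(fitting)[1]


if __name__ == "__main__":
    for ram in (5, 7.6, 8, 32):
        print(ram, "GB ->", pick_quant(ram))
```

For a 7.6 GB device this selects `Q5_K_M`, matching the Q4/Q5 guidance in the Features list. On Android the F16 row is marked reference only, so a caller may prefer to drop it from the dictionary.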
Part of OMEGA v22 Architecture
Four-layer autonomous agent:
- Spiders (data collectors)
- Tools (action executors)
- Orchestrator (this model)
- SpatialCache (memory system)
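How the four layers might fit together in a single agent step can be sketched as follows. Everything here is hypothetical: the model card names the layers but does not describe their interfaces, so the class, the type aliases, and the `run_step` control flow are assumptions made for illustration, and the orchestrator is a stub rather than a call to the model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class SpatialCache:
    """Memory layer, reduced here to an append-only list of (kind, payload) entries."""
    entries: List[Tuple[str, str]] = field(default_factory=list)

    def remember(self, kind: str, payload: str) -> None:
        self.entries.append((kind, payload))

    def recall(self, kind: str) -> List[str]:
        return [payload for k, payload in self.entries if k == kind]


# Hypothetical interfaces for the other three layers.
Spider = Callable[[], str]                              # collects one observation
Tool = Callable[[str], str]                             # executes one action
Orchestrator = Callable[[List[str]], Tuple[str, str]]   # returns (tool name, argument)


def run_step(spiders: List[Spider], tools: Dict[str, Tool],
             orchestrator: Orchestrator, cache: SpatialCache) -> str:
    """One cycle: collect observations, let the orchestrator decide, run the tool, store the result."""
    for spider in spiders:
        cache.remember("observation", spider())
    tool_name, argument = orchestrator(cache.recall("observation"))
    result = tools[tool_name](argument)
    cache.remember("result", result)
    return result


if __name__ == "__main__":
    cache = SpatialCache()
    result = run_step(
        spiders=[lambda: "disk usage is high"],
        tools={"echo": lambda text: f"echo: {text}"},
        # Stub standing in for the model: always echoes the latest observation.
        orchestrator=lambda observations: ("echo", observations[-1]),
        cache=cache,
    )
    print(result)
```

In a real deployment the stub orchestrator would be replaced by a call to the model, for example through a local llama.cpp server.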
Training Details
- Base: OBLITERATUS/gemma-4-E4B-it-OBLITERATED (0% refusal)
- Method: LoRA fine-tuning (rank 64)
- Dataset: 10,000 bilingual examples (Arabic/English)
- Format: Strict JSON with think reasoning + bash code blocks
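Because the training format combines JSON with embedded bash code blocks, a consumer of the model's output would typically parse the JSON first and then pull out the commands. The sketch below shows one way to do that. The field names `think` and `answer` are assumptions, since the card does not publish the exact schema, and `parse_reply` is an illustrative helper, not part of this repository.

```python
import json
import re
from typing import List, Tuple

# Built programmatically so the literal fence does not collide with markdown rendering.
FENCE = "`" * 3
BASH_BLOCK = re.compile(FENCE + r"(?:bash|sh)\n(.*?)" + FENCE, re.DOTALL)


def parse_reply(raw: str) -> Tuple[str, List[str]]:
    """Split a JSON reply into its reasoning text and a list of bash snippets.

    Raises json.JSONDecodeError if the reply is not valid JSON.
    """
    data = json.loads(raw)
    think = data.get("think", "")     # assumed field name for the reasoning
    answer = data.get("answer", "")   # assumed field name for the final response
    commands = [block.strip() for block in BASH_BLOCK.findall(answer)]
    return think, commands


if __name__ == "__main__":
    sample = json.dumps({
        "think": "List the files before changing anything.",
        "answer": f"Run this:\n{FENCE}bash\nls -la\n{FENCE}",
    })
    think, commands = parse_reply(sample)
    print(think)
    print(commands)
```

Extracted commands should be reviewed before execution, particularly since the base model is tuned not to refuse requests.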
Related Repositories
- LoRA only: https://huggingface.co/Abdllahd/OMEGA-V21-Gemma4-LoRA-GGUF
- Base backup: https://huggingface.co/Abdllahd/OMEGA-Gemma4-E4B-OBLITERATED-Base-Backup
- Full merged (this): https://huggingface.co/Abdllahd/OMEGA-V21-Full-Merged-GGUF
License
Apache 2.0 (inherited from OBLITERATED base)