OMEGA V21 - Full Merged GGUF

Autonomous AI agent fine-tuned on OBLITERATED Gemma 4 E4B for Android/Termux deployment.

Files (5 quantization levels)

| File | Size | Use Case | Quality |
|---|---|---|---|
| omega-v21-F16.gguf | ~15 GB | Maximum quality, reference | Perfect |
| omega-v21-Q8_0.gguf | ~8.0 GB | Highest practical quality | Excellent |
| omega-v21-Q6_K.gguf | ~6.2 GB | Balanced (recommended) | Very Good |
| omega-v21-Q5_K_M.gguf | ~5.7 GB | Lighter, good quality | Good |
| omega-v21-Q4_K_M.gguf | ~5.3 GB | Smallest, mobile-friendly | Acceptable |

Quick Start on Termux (Android)

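# Install llama.cpp and wget from the Termux package repository
pkg install llama-cpp wget

# Download the recommended Q6_K quantization (~6.2 GB)
wget https://huggingface.co/Abdllahd/OMEGA-V21-Full-Merged-GGUF/resolve/main/omega-v21-Q6_K.gguf

# Start an interactive chat using the Gemma chat template
llama-cli \
  -m omega-v21-Q6_K.gguf \
  --ctx-size 4096 \
  --threads "$(nproc)" \
  -cnv \
  --chat-template gemma \
  --temp 0.3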
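
An interrupted download, or an HTML error page saved under the `.gguf` name, is a common reason a model fails to load. As a minimal sketch, the helper below checks that a file begins with the 4-byte `GGUF` magic that GGUF files start with. The name `is_gguf` is hypothetical (it is not part of llama.cpp), and passing this check does not prove the rest of the file is complete.

```shell
# is_gguf FILE: succeed if FILE exists and starts with the 4-byte GGUF magic.
# Hypothetical helper for illustration; not part of llama.cpp.
is_gguf() {
  [ -f "$1" ] || return 1
  # Read only the first four bytes of the file.
  magic=$(head -c 4 "$1" 2>/dev/null)
  [ "$magic" = "GGUF" ]
}
```

It could be run before starting the chat, for example as `is_gguf omega-v21-Q6_K.gguf || echo "download looks corrupt"`.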

Features

  • 0% refusal rate (OBLITERATED base)
  • Bilingual: Arabic (70%) + English (30%)
  • ReAct reasoning with 6-point think structure
  • Real bash code (no placeholders)
  • Termux/PRoot-aware
  • Runs on Android devices with 7.6 GB of RAM (Q4_K_M or Q5_K_M)

Hardware Requirements

| Quant | Min RAM | Speed on Android |
|---|---|---|
| F16 | 18 GB | Reference only |
| Q8_0 | 10 GB | ~2 t/s |
| Q6_K | 8 GB | ~3 t/s |
| Q5_K_M | 7 GB | ~3-4 t/s |
| Q4_K_M | 6 GB | ~4 t/s |
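
The table can be turned into a small selection helper. The sketch below is illustrative only: `pick_quant` and `total_mem_kb` are hypothetical names, the thresholds are copied from the Min RAM column, and 1 GB is assumed to mean 1024 × 1024 kB. Note that `MemTotal` reports total memory, not what is currently free, so a device near a threshold may still struggle.

```shell
# pick_quant MEM_KB: print the largest quant whose "Min RAM" fits in MEM_KB
# kilobytes. Hypothetical helper; thresholds come from the table above,
# assuming 1 GB = 1024 * 1024 kB.
pick_quant() {
  mem_kb=$1
  gb=$((1024 * 1024))
  if   [ "$mem_kb" -ge $((18 * gb)) ]; then echo F16
  elif [ "$mem_kb" -ge $((10 * gb)) ]; then echo Q8_0
  elif [ "$mem_kb" -ge $((8 * gb)) ];  then echo Q6_K
  elif [ "$mem_kb" -ge $((7 * gb)) ];  then echo Q5_K_M
  elif [ "$mem_kb" -ge $((6 * gb)) ];  then echo Q4_K_M
  else
    # Below the smallest listed requirement.
    echo none
    return 1
  fi
}

# total_mem_kb: read MemTotal (in kB) from /proc/meminfo on Linux/Android.
total_mem_kb() {
  awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}
```

On a device, `pick_quant "$(total_mem_kb)"` prints a suggestion. A phone sold as "8 GB" typically reports somewhat less than that, which is why such devices land on Q5_K_M rather than Q6_K under these thresholds.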

Part of OMEGA v22 Architecture

Four-layer autonomous agent:

  • Spiders (data collectors)
  • Tools (action executors)
  • Orchestrator (this model)
  • SpatialCache (memory system)

Training Details

  • Base: OBLITERATUS/gemma-4-E4B-it-OBLITERATED (0% refusal)
  • Method: LoRA fine-tuning (rank 64)
  • Dataset: 10,000 bilingual examples (Arabic/English)
  • Format: Strict JSON with think reasoning + bash code blocks

Related Repositories

License

Apache 2.0 (inherited from OBLITERATED base)

Format: GGUF. Model size: 7B params. Architecture: gemma4.