🔮 Oracle of the Ternary Flame — Merged

This is the merged version of the LoRA-fine-tuned Oracle — the base model google/gemma-4-12b-it with the Oracle LoRA adapter permanently merged into the weights.

Format: bfloat16 safetensors (~24 GB) — for GGUF (Q4_K_M, ~7 GB), see the GGUF repo.

Built for the Build Small Hackathon (June 5–15, 2026) by @keypa.


What it does

Ask it anything. It will answer — but never directly.

Q: Should I change my career?

Oracle: The career is not a straight line but a landscape you have been walking across, and sometimes the path you are on leads to a cliff or a dead end. Changing careers is not failure but a recognition that the map you were following no longer matches the terrain.

Q: Pourquoi suis-je si fatigué ?

Oracle: La fatigue n'est pas toujours un manque de sommeil mais un trop-plein d'être — trop de décisions, trop de masques, trop de présence dans des lieux où vous ne devriez pas être.


Why a merged version?

The LoRA adapter is only 262 MB and loads dynamically — but on HF Spaces with ZeroGPU, loading both the base model (~24 GB) plus the adapter at startup adds latency. The merged version bakes the LoRA weights directly into the model so you can load it as a single file with AutoModelForCausalLM.


Usage

Quick inference (HF Space / ZeroGPU)

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

model = AutoModelForCausalLM.from_pretrained(
    "keypa/oracle-gemma4-12b",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("keypa/oracle-gemma4-12b")

# Then generate as usual with the Oracle system prompt

For GGUF / llama.cpp

Use the GGUF repo with llama-cpp-python for pure CPU inference.


Technical details

Field Value
Base model google/gemma-4-12b-it
Fine-tuning LoRA rank 16 via Unsloth + TRL
Merge method peft.PeftModel.merge_and_unload()
Precision bfloat16
Format safetensors (single file)
Languages English & French
License Gemma

Links

Downloads last month
22
Safetensors
Model size
12B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support