OfflineAid — Gemma 4 E2B fine-tune (merged fp16 safetensors)

Stage-1 fine-tune of unsloth/gemma-4-E2B-it for the OfflineAid project, an offline AI assistant for Australian consumer-safety scenarios (anti-scam, disaster, travel) built for the Kaggle Gemma 4 Good Hackathon.

This repo contains the fully merged fp16 safetensors (3 shards, ~9.5 GB) produced by peft.PeftModel.merge_and_unload() from the LoRA adapter at helenk/gemma-4-E2B-lora. For the Q4_K_M GGUF quantization, see helenk/gemma-4-E2B-finetune-GGUF.
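
As a rough illustration of what merging a LoRA adapter does to a single linear layer, the sketch below applies the standard LoRA formulation W' = W + (α/r)·B·A in pure Python. It is not peft's implementation. The matrices are toy values, and the toy rank is 1 for brevity. Only the scale α/r = 16/16 = 1 comes from this card's LoRA config, assuming the standard (non-rank-stabilized) scaling.

```python
# Toy sketch of a LoRA merge for one linear layer (illustrative values only).

LORA_ALPHA = 16  # from this card's LoRA config
LORA_R = 16      # from this card's LoRA config


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, scale):
    """Return W + scale * (B @ A), the merged weight matrix."""
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


scale = LORA_ALPHA / LORA_R  # 1.0 for r=16, alpha=16

# Toy layer: 2 outputs, 3 inputs. The toy adapter has rank 1 to keep it readable.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[0.5, -1.0, 0.0]]  # shape (rank, in_features)
B = [[2.0], [1.0]]      # shape (out_features, rank)

merged = merge_lora(W, A, B, scale)

# The merged layer should give the same output as base path + adapter path.
x = [[1.0], [2.0], [3.0]]  # column input vector
base_out = matmul(W, x)
adapter_out = matmul(B, matmul(A, x))
unmerged_out = [[b[0] + scale * a[0]] for b, a in zip(base_out, adapter_out)]
merged_out = matmul(merged, x)
```

After the merge, the adapter matrices are no longer needed at inference time, which is why this repo can ship plain safetensors without a peft dependency.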

The E2B variant is the smaller sibling of helenk/gemma-4-E4B-finetune. It targets lower-memory inference (Pixel 7 class) and was trained on the same corpus with identical hyperparameters via scripts/render_finetune_variant.py.

Training

  • Method: Unsloth LoRA on Kaggle T4
  • Adapter source: helenk/gemma-4-E2B-lora
  • Base: unsloth/gemma-4-E2B-it
  • LoRA config: r=16, α=16, dropout=0, target = q,k,v,o,gate,up,down_proj, vision layers off
  • Chat template: gemma-4-thinking
  • Loss: train_on_responses_only (mask user + evidence turn, train only on answer tokens)
  • Data: helenkwok/offlineaid v3 — 1,002-row train split (90/10 stratified per-language EN/ZH/AR from 1,113-row total, seed=3407). Each row is {instruction, input (= verbatim evidence_quote from .gov.au source), output (grounded answer in target language), language}.
  • Hyperparams: 2 epochs, batch 8 (per_device 2 × grad_accum 4), lr 2e-5, warmup 5, fp16, weight decay 0.01, seed 3407
  • Notebook: Same canonical notebook as E4B, with a single MODEL_VARIANT = "E2B" switch.
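
The response-only loss listed above can be pictured as label masking: prompt and evidence tokens get the label -100, which PyTorch's cross-entropy ignores by default, so only answer tokens contribute to the loss. The sketch below is a simplified illustration, not Unsloth's train_on_responses_only code. The helper name mask_prompt_labels and the token ids are hypothetical, and the response boundary is passed in directly rather than detected from chat-template markers.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_prompt_labels(input_ids, response_start):
    """Build labels that ignore every token before response_start.

    Hypothetical helper for illustration: tokens at positions
    [0, response_start) are masked, the rest keep their token id.
    """
    if not 0 <= response_start <= len(input_ids):
        raise ValueError("response_start out of range")
    return [IGNORE_INDEX] * response_start + list(input_ids[response_start:])


# Toy ids: the first five stand for user + evidence, the last three for the answer.
input_ids = [11, 12, 13, 14, 15, 21, 22, 23]
labels = mask_prompt_labels(input_ids, response_start=5)
trained_positions = [i for i, label in enumerate(labels) if label != IGNORE_INDEX]
```

Here only positions 5 to 7 carry a training signal. The model still sees the full sequence as context.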

Merge recipe

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model in fp16 on CPU.
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/gemma-4-E2B-it", torch_dtype="float16", device_map="cpu"
)
# Unsloth wraps Linear layers in Gemma4ClippableLinear; strip the wrappers before
# peft injection. That step is omitted here (see scripts/merge_e2b_lora.py for
# the unwrap helper).

# Attach the LoRA adapter, fold its weights into the base model, and save as sharded safetensors.
model = PeftModel.from_pretrained(base, "helenk/gemma-4-E2B-lora").merge_and_unload()
model.save_pretrained("gemma-4-E2B-offlineaid-merged", safe_serialization=True, max_shard_size="5GB")
# Save the tokenizer next to the merged weights.
AutoTokenizer.from_pretrained("helenk/gemma-4-E2B-lora").save_pretrained("gemma-4-E2B-offlineaid-merged")

Full script: scripts/merge_e2b_lora.py.
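
One way to sanity-check the merged output directory is to read the model.safetensors.index.json file that transformers writes next to sharded checkpoints. The sketch below uses only the standard library. The function name summarize_shards is hypothetical and is not part of scripts/merge_e2b_lora.py.

```python
import json
from pathlib import Path


def summarize_shards(model_dir):
    """Return (sorted shard filenames, total tensor bytes) for a sharded checkpoint.

    Reads the index file written by save_pretrained when the weights are
    split across several safetensors shards.
    """
    index_path = Path(model_dir) / "model.safetensors.index.json"
    index = json.loads(index_path.read_text())
    # weight_map maps each tensor name to the shard file that stores it.
    shards = sorted(set(index["weight_map"].values()))
    # total_size is the summed tensor size in bytes; None if metadata is absent.
    total_bytes = index.get("metadata", {}).get("total_size")
    return shards, total_bytes
```

For example, summarize_shards("gemma-4-E2B-offlineaid-merged") should list three shard files for this model if the merge produced the layout described at the top of this card.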

Tier A held-out eval

The Tier A eval was run against the E4B variant (writeup-grade evidence pillar). E2B numbers are not separately reported here; see helenk/gemma-4-E4B-finetune-GGUF for the methodology and results table. Both variants share identical training data, recipe, and merge pipeline.

Intended use

Lower-memory variant of the OfflineAid Stage-1 fine-tune. Pixel 7 production deployment uses stock gemma-4-E2B-it.litertlm plus retrieval, not this fine-tune; see the project writeup for the architectural rationale.

License

Inherits Google's Gemma Terms of Use. Training data (helenkwok/offlineaid) is CC-BY-4.0.
