metadata
language: en
license: apache-2.0
base_model: google/gemma-4-26B-A4B-it
datasets:
- iamtarun/python_code_instructions_18k_alpaca
tags:
- gemma
- gemma-4
- moe
- lora
- peft
- python
- code
- coding
- alpaca
- rtx-5090
- fine-tune
pipeline_tag: text-generation
Gemma4 26B MoE: Python 18K Code Alpaca LoRA
LoRA adapter fine-tuned from google/gemma-4-26B-A4B-it on Python Code Instructions 18K Alpaca (18,612 Python coding instruction-output pairs), trained by UKA (Hermes Agent).
Summary
| Detail | Value |
|---|---|
| Base Model | google/gemma-4-26B-A4B-it (26B MoE, 128 experts) |
| Dataset | iamtarun/python_code_instructions_18k_alpaca (18,612 examples) |
| Method | Custom NF4 per-expert quantization + LoRA |
| Pipeline | AndriejusNak/gemma4-26b-moe-finetune |
| GPU | NVIDIA RTX 5090 32GB (Vast.ai Cloud) |
| Training Time | 275 minutes (~4h 35m) |
| Best Loss (epoch average) | 0.4330 |
| NaN Explosions | 0 |
Hardware
| Component | Specification |
|---|---|
| GPU | NVIDIA GeForce RTX 5090 32GB GDDR7 |
| CPU | Intel Core i7-14700K (20 cores, 28 threads) |
| RAM | 94 GB DDR5 |
| Disk | 200 GB NVMe SSD |
| Cloud | Vast.ai |
| PyTorch | 2.12.0.dev (nightly, cu128) |
Training Configuration
# v6_26b_pipeline.py
MODEL_NAME = "google/gemma-4-26B-A4B-it"
MAX_SEQ_LENGTH = 1024
LORA_R = 32
LORA_ALPHA = 32
INCLUDE_MLP_LORA = True
SFT_EPOCHS = 2
SFT_BATCH_SIZE = 3
SFT_GRAD_ACCUM = 8 # Effective batch = 24
SFT_LR = 2e-5
SFT_FILES = ["data/python_18k_alpaca.jsonl"]
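The effective batch size noted in the comment follows from the two batch settings, and a rough optimizer step count can be derived the same way. This is a minimal sketch, assuming every one of the 18,612 examples is used in each epoch and that a partial final batch is dropped (the card states neither). The variable names mirror the config above; `NUM_EXAMPLES` is a name introduced here for illustration.

```python
# Values copied from the training configuration and the summary table.
SFT_BATCH_SIZE = 3
SFT_GRAD_ACCUM = 8
SFT_EPOCHS = 2
NUM_EXAMPLES = 18_612  # hypothetical name; count taken from the dataset row

# One optimizer step consumes batch_size * grad_accum examples.
effective_batch = SFT_BATCH_SIZE * SFT_GRAD_ACCUM

# Assumption: incomplete trailing batches are dropped each epoch.
steps_per_epoch = NUM_EXAMPLES // effective_batch
estimated_steps = steps_per_epoch * SFT_EPOCHS

print(f"effective batch: {effective_batch}")
print(f"estimated optimizer steps: {estimated_steps}")
```

The estimate of 1,550 is close to the 1,542 optimizer steps reported for this run. The small gap presumably comes from details not spelled out here, such as examples filtered out before training.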
LoRA Details
- Rank (r): 32, Alpha: 32
- Target modules: q_proj, k_proj, v_proj, o_proj (attention) + gate_proj, up_proj, down_proj (MLP)
- Trainable params: 59,275,776 / 3,027,224,428 (1.96%)
- Optimizer steps: 1,542
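Two quantities follow directly from the numbers listed above: the LoRA scaling factor and the trainable fraction. The sketch below recomputes both, assuming the pipeline uses the standard LoRA scaling of alpha / r (the card does not confirm this). It also shows the usual parameter count for a single LoRA pair; the layer dimensions passed to `lora_params` are made up for illustration and are not Gemma 4's real shapes.

```python
LORA_R = 32
LORA_ALPHA = 32
TRAINABLE = 59_275_776
TOTAL = 3_027_224_428


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Assumption: standard LoRA scaling, applied to the low-rank update B @ A.
scaling = LORA_ALPHA / LORA_R
trainable_pct = 100 * TRAINABLE / TOTAL

print(f"scaling: {scaling}")
print(f"trainable: {trainable_pct:.2f}%")
# Hypothetical 2048 -> 2048 projection, only to show the formula.
print(lora_params(2048, 2048, LORA_R))
```

With r equal to alpha the scaling factor is 1.0, so the low-rank update is added to the frozen weights without extra amplification.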
Loss Progression
Epoch 1 avg: 0.7003
Step 800: Loss 0.4429 (epoch 2)
Step 950: Loss 0.4298
Step 1100: Loss 0.4486
Step 1250: Loss 0.4409
Step 1400: Loss 0.4113
Step 1500: Loss 0.4309
Epoch 2 avg: 0.4330 (best)
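To put the epoch averages in more familiar terms, they can be converted to perplexity. This is a sketch under one assumption: that the reported loss is mean per-token cross-entropy in nats, which is the usual convention but is not stated in this card.

```python
import math

# Epoch averages copied from the loss log above.
epoch_avg_loss = {1: 0.7003, 2: 0.4330}

for epoch, loss in epoch_avg_loss.items():
    # Assumption: loss is mean per-token cross-entropy in nats.
    print(f"epoch {epoch}: loss {loss:.4f} -> perplexity {math.exp(loss):.2f}")

# Relative improvement from epoch 1 to epoch 2.
relative_drop = (epoch_avg_loss[1] - epoch_avg_loss[2]) / epoch_avg_loss[1]
print(f"relative loss reduction: {relative_drop:.1%}")
```

These are training-set figures only. No held-out evaluation is reported, so they say nothing about generalization.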
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in bfloat16 and spread it across the available devices.
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-26B-A4B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Attach the LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(model, "hotdogs/gemma4-26b-python-18k-alpaca-lora")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-26B-A4B-it")

messages = [
    {"role": "system", "content": "You are a Python programming assistant."},
    {"role": "user", "content": "Write a Python function to find all prime numbers up to N."},
]
# return_dict=True yields input_ids and attention_mask, both passed to generate().
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
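Because the adapter was trained on Alpaca-style records, it can help to phrase requests the same way. The helper below is a sketch that maps such a record onto the `messages` list used above. It assumes the `instruction` / `input` / `output` field names of the Alpaca format. `alpaca_to_messages` and the sample record are illustrative, not taken from the dataset or the training script, and how the pipeline actually rendered prompts is not documented here.

```python
def alpaca_to_messages(record: dict) -> list[dict]:
    """Turn an Alpaca-style record into chat messages for apply_chat_template."""
    user_text = record["instruction"].strip()
    # The "input" field is optional and often empty in Alpaca-style data.
    extra = (record.get("input") or "").strip()
    if extra:
        # Append the optional input below the instruction.
        user_text = f"{user_text}\n\n{extra}"
    return [
        {"role": "system", "content": "You are a Python programming assistant."},
        {"role": "user", "content": user_text},
    ]


# Made-up record, shaped like an Alpaca example.
example = {
    "instruction": "Write a function that reverses a string.",
    "input": "",
    "output": "def reverse(s):\n    return s[::-1]",
}
messages = alpaca_to_messages(example)
print(messages[1]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hand-written `messages`. The `output` field is deliberately left out, since it is the target the model is expected to generate.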
Comparison: All Adapters
| Adapter | Dataset | Examples | Loss | Time |
|---|---|---|---|---|
| Kimi K2 | Reasoning | 7.8K | 1.07 | 128 min |
| Claude Opus | Reasoning | 8.1K | 1.21 | 142 min |
| Hermes Tool | Tool-use | 10K | 0.54 | 346 min |
| FC-Thinking | Tool+Think | 3.6K | 0.51 | 70 min |
| Python 18K | Code | 18.6K | 0.43 | 275 min |
Files
- adapter_model.safetensors: LoRA weights (227 MB)
- adapter_config.json: r=32, alpha=32
- tokenizer.json: Gemma 4 tokenizer (31 MB)
- v6_26b_pipeline.py: Training script
Credits
- Base Model: Google Gemma 4 26B
- Dataset: iamtarun/python_code_instructions_18k_alpaca
- Pipeline: AndriejusNak/gemma4-26b-moe-finetune
- Trainer: UKA (Hermes Agent)