---
language: en
license: apache-2.0
base_model: google/gemma-4-26B-A4B-it
datasets:
- iamtarun/python_code_instructions_18k_alpaca
tags:
- gemma
- gemma-4
- moe
- lora
- peft
- python
- code
- coding
- alpaca
- rtx-5090
- fine-tune
pipeline_tag: text-generation
---

# Gemma4 26B MoE: Python 18K Code Alpaca LoRA 🐍

LoRA adapter fine-tuned from `google/gemma-4-26B-A4B-it` on [Python Code Instructions 18K Alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca), a set of 18,612 Python coding instruction-output pairs. Trained by [UKA](https://huggingface.co/hotdogs) (Hermes Agent) 🤖

## 📋 Summary

| Detail | Value |
|--------|-------|
| **Base Model** | `google/gemma-4-26B-A4B-it` (26B MoE, 128 experts) |
| **Dataset** | `iamtarun/python_code_instructions_18k_alpaca` (18,612 examples) |
| **Method** | Custom NF4 per-expert quantization + LoRA |
| **Pipeline** | [AndriejusNak/gemma4-26b-moe-finetune](https://github.com/AndriejusNak/gemma4-26b-moe-finetune) |
| **GPU** | NVIDIA RTX 5090 32GB (Vast.ai Cloud) |
| **Training Time** | 275 minutes (~4h 35m) |
| **Best Loss (epoch avg)** | 0.4330 |
| **NaN Explosions** | 0 |

## 🖥️ Hardware

| Component | Specification |
|-----------|--------------|
| **GPU** | NVIDIA GeForce RTX 5090 32GB GDDR7 |
| **CPU** | Intel Core i7-14700K (20 cores, 28 threads) |
| **RAM** | 94 GB DDR5 |
| **Disk** | 200 GB NVMe SSD |
| **Cloud** | Vast.ai |
| **PyTorch** | 2.12.0.dev (nightly, cu128) |

## 🔧 Training Configuration

```python
# v6_26b_pipeline.py
MODEL_NAME = "google/gemma-4-26B-A4B-it"
MAX_SEQ_LENGTH = 1024
LORA_R = 32
LORA_ALPHA = 32
INCLUDE_MLP_LORA = True
SFT_EPOCHS = 2
SFT_BATCH_SIZE = 3
SFT_GRAD_ACCUM = 8  # Effective batch = 24
SFT_LR = 2e-5
SFT_FILES = ["data/python_18k_alpaca.jsonl"]
```

### LoRA Details

- **Rank (r):** 32, **Alpha:** 32
- **Target modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj` + `gate_proj`, `up_proj`, `down_proj`
- **Trainable params:** 59,275,776 / 3,027,224,428 (1.96%)
- **Optimizer steps:** 1,542

### Loss Progression

```
→ Epoch 1 avg: 0.7003
Step 800: Loss 0.4429 (epoch 2)
Step 950: Loss 0.4298
Step 1100: Loss 0.4486
Step 1250: Loss 0.4409
Step 1400: Loss 0.4113
Step 1500: Loss 0.4309
→ Epoch 2 avg: 0.4330 🎯 Best!
```

## 🚀 Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in bf16, then attach the LoRA adapter on top of it
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-26B-A4B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(model, "hotdogs/gemma4-26b-python-18k-alpaca-lora")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-26B-A4B-it")

messages = [
    {"role": "system", "content": "You are a Python programming assistant."},
    {"role": "user", "content": "Write a Python function to find all prime numbers up to N."}
]

# return_dict=True returns input_ids together with the attention mask
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 📊 Comparison: All Adapters

| Adapter | Dataset | Examples | Loss | Time |
|---------|---------|----------|------|------|
| Kimi K2 | Reasoning | 7.8K | 1.07 | 128 min |
| Claude Opus | Reasoning | 8.1K | 1.21 | 142 min |
| Hermes Tool | Tool-use | 10K | 0.54 | 346 min |
| FC-Thinking | Tool+Think | 3.6K | 0.51 | 70 min |
| **Python 18K** | **Code** | **18.6K** | **0.43** | **275 min** |

## 📦 Files

```
adapter_model.safetensors - LoRA weights (227 MB)
adapter_config.json       - r=32, alpha=32
tokenizer.json            - Gemma 4 tokenizer (31 MB)
v6_26b_pipeline.py        - Training script
```

## 🙏 Credits

- **Base Model:** Google Gemma 4 26B
- **Dataset:** iamtarun/python_code_instructions_18k_alpaca
- **Pipeline:** AndriejusNak/gemma4-26b-moe-finetune
- **Trainer:** UKA (Hermes Agent)
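
## 🧮 Sanity Check: Batch and Step Arithmetic

The figures reported in this card (effective batch of 24, 1,542 optimizer steps, 1.96% trainable parameters) can be cross-checked with a few lines of arithmetic. The sketch below is not part of `v6_26b_pipeline.py`; the variable names are illustrative, and it assumes one optimizer step per full gradient-accumulation window. The step count it derives is only an upper bound: the reported count is slightly lower, which would be consistent with a small number of examples being dropped before training, though the card does not say so.

```python
import math

# Values copied from the Summary and Training Configuration sections
num_examples = 18_612      # dataset size
batch_size = 3             # SFT_BATCH_SIZE
grad_accum = 8             # SFT_GRAD_ACCUM
epochs = 2                 # SFT_EPOCHS
reported_steps = 1_542     # "Optimizer steps" under LoRA Details

# Examples consumed per optimizer step
effective_batch = batch_size * grad_accum

# Upper bound on optimizer steps if every example is used each epoch
steps_per_epoch_upper = math.ceil(num_examples / effective_batch)
total_steps_upper = steps_per_epoch_upper * epochs

# Examples per epoch implied by the reported step count
# (assumption: every step saw a full effective batch)
implied_examples_per_epoch = (reported_steps // epochs) * effective_batch

# Trainable parameter share, as listed under LoRA Details
trainable_pct = 100 * 59_275_776 / 3_027_224_428

print(f"effective batch:            {effective_batch}")
print(f"upper bound on steps:       {total_steps_upper}")
print(f"implied examples per epoch: {implied_examples_per_epoch}")
print(f"trainable share:            {trainable_pct:.2f}%")
```

The same calculation is a quick way to estimate run length before launching a job with a different batch size or dataset.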