---
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/structured_data_with_cot_dataset_v5
- daichira/structured-hard-sft-4k
language:
- en
license: apache-2.0
library_name: peft
pipeline_tag: text-generation
tags:
- qlora
- lora
- structured-output
---

# qwen3-4b-h100-v5-hard-ep2

This LoRA adapter is highly optimized for complex structured outputs (JSON/XML/YAML/CSV/TOML).
Trained on H100 with a blend of two datasets (10k rows total) and heavily preprocessed to remove CoT traces and markdown artifacts.

## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Max sequence length: 2560
- Epochs: 2
- Learning rate: 2e-5
- Effective Batch size: 32 (BS=16, GradAccum=2)
- Strategy: Phase 1 & Phase 2 mixed, custom output stripping applied.