---
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/structured_data_with_cot_dataset_v5
- daichira/structured-hard-sft-4k
language:
- en
license: apache-2.0
library_name: peft
pipeline_tag: text-generation
tags:
- qlora
- lora
- structured-output
---

# qwen3-4b-h100-v5-hard-ep2

This LoRA adapter is fine-tuned for complex structured outputs (JSON, XML, YAML, CSV, TOML). It was trained on an H100 with a blend of two datasets (10k rows total), preprocessed to remove CoT traces and Markdown artifacts.

## Training Configuration

- Base model: Qwen/Qwen3-4B-Instruct-2507
- Max sequence length: 2560
- Epochs: 2
- Learning rate: 2e-5
- Effective batch size: 32 (batch size 16 × gradient accumulation 2)
- Strategy: Phase 1 and Phase 2 mixed; custom output stripping applied.
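
The card does not document how CoT traces and Markdown artifacts were removed. As a rough illustration only, the sketch below shows what such a stripping step could look like. It assumes that CoT traces are wrapped in `<think>...</think>` tags and that the structured payload may be wrapped in a single Markdown code fence. The function name `strip_output` and both patterns are hypothetical and are not taken from the actual training pipeline.

```python
import re

# Assumption: CoT traces appear as <think>...</think> blocks. The real
# preprocessing used for this adapter is not published in this card.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)

# Matches one code fence that wraps the whole text, with an optional
# language tag such as ```json or ```yaml on the opening line.
_FENCE_RE = re.compile(r"^```[\w+-]*[ \t]*\n(.*?)\n?```\s*$", re.DOTALL)


def strip_output(text: str) -> str:
    """Hypothetical cleanup: drop CoT blocks, then unwrap one outer fence."""
    text = _THINK_RE.sub("", text).strip()
    match = _FENCE_RE.match(text)
    if match:
        text = match.group(1)
    return text.strip()


raw = (
    "<think>The user wants JSON.</think>\n"
    "```json\n"
    '{"id": 1, "tags": ["a", "b"]}\n'
    "```"
)
cleaned = strip_output(raw)
print(cleaned)
```

Text that contains neither a `<think>` block nor an outer fence passes through unchanged apart from whitespace trimming, so the same function could be applied uniformly to targets in any of the listed formats.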