---
base_model: Yano/exp-0212-001-alfworld-qwen2.5-7b
datasets:
- u-10bei/dbbench_sft_dataset_react
- u-10bei/dbbench_sft_dataset_react_v2
- u-10bei/dbbench_sft_dataset_react_v3
- u-10bei/dbbench_sft_dataset_react_v4
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- qlora
- lora
- merged
- dbbench
- alfworld
- agent
---

# exp-0216-005-db-balanced-qwen2.5-7b

Fine-tuned from **Yano/exp-0212-001-alfworld-qwen2.5-7b** (001 ALFWorld SFT model) using **QLoRA (4-bit, Unsloth)**.

## Purpose

DB Bench training with balanced data (v1-v4 mixed, INSERT/UPDATE downsampled).
Addresses 004's SELECT degradation (76.5% -> 41.0%) caused by INSERT/UPDATE data imbalance.

## Data Strategy

- Combined all DB Bench v1-v4 (3,060 total)
- Downsampled INSERT to 200, UPDATE to 150
- Kept all SELECT/query-type samples
- Final dataset: ~1,470 samples (INSERT+UPDATE ~24.5%)

## Training Configuration

- Base model: Yano/exp-0212-001-alfworld-qwen2.5-7b
- Method: QLoRA (4-bit), merged to 16-bit
- Max sequence length: 2048
- Epochs: 3
- Learning rate: 2e-05
- LoRA: r=64, alpha=128
- Batch size: 2 (grad accum 16)
- Warmup ratio: 0.1
- Collator: AllAssistantTurnsCollator (all turns supervised)
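
The balancing step listed under Data Strategy could be sketched roughly as follows. This is a minimal illustration, not the script used for this experiment: the `balance_by_type` helper, the `type` field on each sample, and the fixed seed are all assumptions, and the real datasets may label query types differently. Only the caps (INSERT 200, UPDATE 150) come from the list above.

```python
import random
from collections import Counter

# Caps taken from the Data Strategy list; every other type is kept in full.
CAPS = {"INSERT": 200, "UPDATE": 150}


def balance_by_type(samples, caps=CAPS, seed=42, type_key="type"):
    """Downsample the capped query types and keep all other samples.

    `samples` is a list of dicts; `type_key` names a hypothetical field
    holding the query type (e.g. "INSERT", "UPDATE", "SELECT").
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible

    # Group samples by query type.
    by_type = {}
    for sample in samples:
        by_type.setdefault(sample[type_key], []).append(sample)

    balanced = []
    for query_type, group in by_type.items():
        cap = caps.get(query_type)
        if cap is not None and len(group) > cap:
            # Random subset without replacement for over-represented types.
            group = rng.sample(group, cap)
        balanced.extend(group)

    rng.shuffle(balanced)  # avoid long runs of a single type
    return balanced


def type_share(samples, types, type_key="type"):
    """Fraction of samples whose type is in `types`."""
    counts = Counter(s[type_key] for s in samples)
    total = sum(counts.values())
    return sum(counts[t] for t in types) / total if total else 0.0
```

On toy data, `balance_by_type(samples)` followed by `type_share(balanced, {"INSERT", "UPDATE"})` gives the combined INSERT+UPDATE fraction, which is the figure to compare against the share reported for the final dataset.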