Yano committed · Commit f684a16 (verified) · 1 parent: 30c435c

Update README.md

  1. README.md +39 -12
README.md CHANGED
@@ -1,21 +1,48 @@
  ---
  base_model: Yano/exp-0212-001-alfworld-qwen2.5-7b
- tags:
- - text-generation-inference
- - transformers
- - unsloth
- - qwen2
- license: apache-2.0
  language:
  - en
  ---

- # Uploaded finetuned model

- - **Developed by:** Yano
- - **License:** apache-2.0
- - **Finetuned from model :** Yano/exp-0212-001-alfworld-qwen2.5-7b

- This qwen2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
  ---
  base_model: Yano/exp-0212-001-alfworld-qwen2.5-7b
+ datasets:
+ - u-10bei/dbbench_sft_dataset_react
+ - u-10bei/dbbench_sft_dataset_react_v2
+ - u-10bei/dbbench_sft_dataset_react_v3
+ - u-10bei/dbbench_sft_dataset_react_v4
  language:
  - en
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: text-generation
+ tags:
+ - qlora
+ - lora
+ - merged
+ - dbbench
+ - alfworld
+ - agent
  ---

+ # exp-0216-005-db-balanced-qwen2.5-7b
+
+ Fine-tuned from **Yano/exp-0212-001-alfworld-qwen2.5-7b** (001 ALFWorld SFT model) using **QLoRA (4-bit, Unsloth)**.
+
+ ## Purpose
+
+ DB Bench training with balanced data (v1-v4 mixed, INSERT/UPDATE downsampled).
+ Addresses 004's SELECT degradation (76.5% -> 41.0%) caused by INSERT/UPDATE data imbalance.
+
+ ## Data Strategy

+ - Combined all DB Bench v1-v4 (3,060 total)
+ - Downsampled INSERT to 200, UPDATE to 150
+ - Kept all SELECT/query-type samples
+ - Final dataset: ~1470 samples (INSERT+UPDATE ~24.5%)
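The Data Strategy bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the author's preprocessing script: the `type` field, the `CAPS` mapping, and the `balance` and `capped_share` helpers are hypothetical names, and the real dataset schema may label query types differently.

```python
import random
from collections import Counter

# Hypothetical per-type caps mirroring the numbers quoted in the README.
CAPS = {"INSERT": 200, "UPDATE": 150}


def balance(samples, caps=CAPS, seed=0, type_key="type"):
    """Downsample capped query types; keep every sample of all other types."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible

    # Group samples by their (assumed) query-type label.
    by_type = {}
    for sample in samples:
        by_type.setdefault(sample[type_key], []).append(sample)

    balanced = []
    for qtype, group in by_type.items():
        cap = caps.get(qtype)
        if cap is not None and len(group) > cap:
            group = rng.sample(group, cap)  # sample without replacement
        balanced.extend(group)

    rng.shuffle(balanced)  # avoid long runs of a single query type
    return balanced


def capped_share(samples, caps=CAPS, type_key="type"):
    """Fraction of the dataset made up of the capped (downsampled) types."""
    counts = Counter(sample[type_key] for sample in samples)
    return sum(counts[qtype] for qtype in caps) / len(samples)
```

Calling `capped_share` on the balanced output is a quick way to check the INSERT+UPDATE proportion against the figure reported in the README.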

+ ## Training Configuration

+ - Base model: Yano/exp-0212-001-alfworld-qwen2.5-7b
+ - Method: QLoRA (4-bit), merged to 16-bit
+ - Max sequence length: 2048
+ - Epochs: 3
+ - Learning rate: 2e-05
+ - LoRA: r=64, alpha=128
+ - Batch size: 2 (grad accum 16)
+ - Warmup ratio: 0.1
+ - Collator: AllAssistantTurnsCollator (all turns supervised)
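To make the Training Configuration numbers easier to read together, the sketch below derives a few quantities from them. It rests on assumptions the README does not state: a single device (`num_devices=1`), exactly 1,470 training samples (the README says "~1470"), and the conventional LoRA scaling factor `alpha / r`. The `TrainConfig` class is a hypothetical container, not part of Unsloth, TRL, or the author's code, and a real trainer may round step counts differently.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from the README's Training Configuration list.
    max_seq_length: int = 2048
    epochs: int = 3
    learning_rate: float = 2e-5
    lora_r: int = 64
    lora_alpha: int = 128
    per_device_batch_size: int = 2
    grad_accum_steps: int = 16
    warmup_ratio: float = 0.1
    # Assumptions, not stated in the README.
    num_devices: int = 1
    num_samples: int = 1470

    def effective_batch_size(self) -> int:
        """Samples consumed per optimizer step."""
        return self.per_device_batch_size * self.grad_accum_steps * self.num_devices

    def total_steps(self) -> int:
        """Rough optimizer-step count over all epochs."""
        steps_per_epoch = math.ceil(self.num_samples / self.effective_batch_size())
        return steps_per_epoch * self.epochs

    def warmup_steps(self) -> int:
        """Approximate number of learning-rate warmup steps."""
        return math.ceil(self.total_steps() * self.warmup_ratio)

    def lora_scale(self) -> float:
        """Conventional LoRA scaling factor, alpha / r."""
        return self.lora_alpha / self.lora_r


cfg = TrainConfig()
print(cfg.effective_batch_size(), cfg.total_steps(), cfg.warmup_steps(), cfg.lora_scale())
```

Under these assumptions the effective batch size is 32, which gives about 46 optimizer steps per epoch and 138 in total, of which roughly 14 are warmup. The LoRA scale works out to 2.0.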