Text Generation
PEFT
Safetensors
English
qlora
lora
structured-output
Commit 9d1888a (verified), committed by HamadaMayu
1 Parent(s): cb5d95e

Update README.md

Files changed (1)
  1. README.md +16 -11
README.md CHANGED
@@ -15,10 +15,10 @@ tags:

  qwen3-4b-structured-output-lora-v4-2epoch

- This repository provides a **LoRA adapter** fine-tuned from
- **Qwen/Qwen3-4B-Instruct-2507** using **QLoRA (4-bit, Unsloth)**.

- This repository contains **LoRA adapter weights only**.
  The base model must be loaded separately.

  ## Training Objective
@@ -26,17 +26,18 @@ The base model must be loaded separately.
  This adapter is trained to improve **structured output accuracy**
  (JSON / YAML / XML / TOML / CSV).

- Loss is applied only to the final assistant output,
- while intermediate reasoning (Chain-of-Thought) is masked.

  ## Training Configuration

  - Base model: Qwen/Qwen3-4B-Instruct-2507
- - Method: QLoRA (4-bit)
  - Max sequence length: 2048
  - Epochs: 2
- - Learning rate: 2e-06
- - LoRA: r=64, alpha=128

  ## Usage

@@ -57,9 +58,13 @@ model = AutoModelForCausalLM.from_pretrained(
  model = PeftModel.from_pretrained(model, adapter)
  ```

- ## Sources & Terms (IMPORTANT)

  Training data: u-10bei/structured_data_with_cot_dataset_512_v4

- Dataset License: MIT License. This dataset is used and distributed under the terms of the MIT License.
- Compliance: Users must comply with the MIT license (including copyright notice) and the base model's original terms of use.

@@ -15,10 +15,10 @@ tags:

  qwen3-4b-structured-output-lora-v4-2epoch

+ This repository provides a **LoRA adapter** fine-tuned from
+ **Qwen/Qwen3-4B-Instruct-2507** using **QLoRA (4-bit quantization) with Hugging Face Transformers and PEFT**.

+ This repository contains **LoRA adapter weights only**.
  The base model must be loaded separately.

  ## Training Objective
@@ -26,17 +26,18 @@ The base model must be loaded separately.
  This adapter is trained to improve **structured output accuracy**
  (JSON / YAML / XML / TOML / CSV).

+ Loss is applied **only to the final assistant output**,
+ while intermediate reasoning (Chain-of-Thought) is **masked during training**.

  ## Training Configuration

  - Base model: Qwen/Qwen3-4B-Instruct-2507
+ - Training framework: Hugging Face Transformers + PEFT
+ - Method: QLoRA (4-bit quantization, NF4)
  - Max sequence length: 2048
  - Epochs: 2
+ - Learning rate: 2e-6
+ - LoRA configuration: r=64, alpha=128

  ## Usage

@@ -57,9 +58,13 @@ model = AutoModelForCausalLM.from_pretrained(
  model = PeftModel.from_pretrained(model, adapter)
  ```

+ Sources & Terms (IMPORTANT)

  Training data: u-10bei/structured_data_with_cot_dataset_512_v4

+ Dataset License: MIT License
+ This dataset is used and distributed under the terms of the MIT License.
+
+ Compliance:
+ Users must comply with the MIT license (including copyright notice)
+ and the base model's original terms of use.
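
The README says loss is applied only to the final assistant output while the Chain-of-Thought is masked, but the diff does not show how. A common way to get that effect is to set the labels of masked tokens to `-100`, the value PyTorch's `CrossEntropyLoss` ignores by default. The sketch below illustrates that idea with plain lists. The helper names, the toy token ids, and the assumption that a marker token sequence separates reasoning from the final output are all hypothetical and are not taken from the author's training script.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default


def find_subsequence(sequence, pattern):
    """Return the start index of the last occurrence of pattern, or -1 if absent."""
    n, m = len(sequence), len(pattern)
    if m == 0:
        return -1
    for start in range(n - m, -1, -1):
        if sequence[start:start + m] == pattern:
            return start
    return -1


def build_labels(input_ids, marker_ids):
    """Copy input_ids into labels, masking everything up to and including the marker.

    If the marker is absent, the whole sequence is masked so the example
    contributes no loss (an assumption; the author's fallback is not documented).
    """
    start = find_subsequence(input_ids, marker_ids)
    if start < 0:
        return [IGNORE_INDEX] * len(input_ids)
    output_start = start + len(marker_ids)
    return [IGNORE_INDEX] * output_start + list(input_ids[output_start:])


# Toy token ids (hypothetical): prompt (1, 2), reasoning (3, 4),
# marker (90, 91), final structured output (5, 6, 7).
example_ids = [1, 2, 3, 4, 90, 91, 5, 6, 7]
example_labels = build_labels(example_ids, marker_ids=[90, 91])
print(example_labels)
```

With labels built this way, only the tokens after the marker contribute to the loss, while the prompt and reasoning tokens still serve as context in the forward pass.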