HamadaMayu/qwen3-4b-structured-output-lora-v4-2epoch

Text Generation

structured-output


HamadaMayu committed on Feb 8

Commit 9d1888a · verified · 1 Parent(s): cb5d95e

Update README.md

Files changed (1)

README.md +16 -11


@@ -15,10 +15,10 @@ tags:
 qwen3-4b-structured-output-lora-v4-2epoch
-This repository provides a **LoRA adapter** fine-tuned from
-**Qwen/Qwen3-4B-Instruct-2507** using **QLoRA (4-bit, Unsloth)**.
-This repository contains **LoRA adapter weights only**.
 The base model must be loaded separately.
 ## Training Objective
@@ -26,17 +26,18 @@ The base model must be loaded separately.
 This adapter is trained to improve **structured output accuracy**
 (JSON / YAML / XML / TOML / CSV).
-Loss is applied only to the final assistant output,
-while intermediate reasoning (Chain-of-Thought) is masked.
 ## Training Configuration
 - Base model: Qwen/Qwen3-4B-Instruct-2507
-- Method: QLoRA (4-bit)
 - Max sequence length: 2048
 - Epochs: 2
-- Learning rate: 2e-06
-- LoRA: r=64, alpha=128
 ## Usage
@@ -57,9 +58,13 @@ model = AutoModelForCausalLM.from_pretrained(
 model = PeftModel.from_pretrained(model, adapter)
 ```
-## Sources & Terms (IMPORTANT)
 Training data: u-10bei/structured_data_with_cot_dataset_512_v4
-Dataset License: MIT License. This dataset is used and distributed under the terms of the MIT License.
-Compliance: Users must comply with the MIT license (including copyright notice) and the base model's original terms of use.

 qwen3-4b-structured-output-lora-v4-2epoch
+This repository provides a **LoRA adapter** fine-tuned from
+**Qwen/Qwen3-4B-Instruct-2507** using **QLoRA (4-bit quantization) with Hugging Face Transformers and PEFT**.
+This repository contains **LoRA adapter weights only**.
 The base model must be loaded separately.
 ## Training Objective
 This adapter is trained to improve **structured output accuracy**
 (JSON / YAML / XML / TOML / CSV).
+Loss is applied **only to the final assistant output**,
+while intermediate reasoning (Chain-of-Thought) is **masked during training**.
 ## Training Configuration
 - Base model: Qwen/Qwen3-4B-Instruct-2507
+- Training framework: Hugging Face Transformers + PEFT
+- Method: QLoRA (4-bit quantization, NF4)
 - Max sequence length: 2048
 - Epochs: 2
+- Learning rate: 2e-6
+- LoRA configuration: r=64, alpha=128
 ## Usage
 model = PeftModel.from_pretrained(model, adapter)
 ```
+Sources & Terms (IMPORTANT)
 Training data: u-10bei/structured_data_with_cot_dataset_512_v4
+Dataset License: MIT License
+This dataset is used and distributed under the terms of the MIT License.
+Compliance:
+Users must comply with the MIT license (including copyright notice)
+and the base model's original terms of use.
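
The Training Objective in this README says loss is applied only to the final assistant output, with the Chain-of-Thought masked during training. The README does not show the training code, so the following is only a minimal sketch of one common way to do this: set the label of every token before the final output to -100, the value PyTorch's cross-entropy loss ignores by default. The helper `mask_labels`, the token ids, and the way the output boundary is located are all hypothetical and are not taken from the author's pipeline.

```python
# Sketch only: illustrates label masking, not the author's actual training code.
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def mask_labels(input_ids, output_start):
    """Return labels that keep loss only on tokens at or after output_start.

    `output_start` is a hypothetical index marking where the final
    assistant output begins; everything before it (prompt and
    intermediate reasoning) is excluded from the loss.
    """
    if not 0 <= output_start <= len(input_ids):
        raise ValueError("output_start must lie within the sequence")
    return [IGNORE_INDEX] * output_start + list(input_ids[output_start:])


# Made-up token ids: 3 prompt tokens, 2 reasoning tokens, 3 final-output tokens.
prompt = [101, 102, 103]
reasoning = [201, 202]
final_output = [301, 302, 303]

input_ids = prompt + reasoning + final_output
labels = mask_labels(input_ids, output_start=len(prompt) + len(reasoning))
print(labels)  # [-100, -100, -100, -100, -100, 301, 302, 303]
```

In a real run the boundary would have to be found in the tokenized chat template (for example, by searching for a marker that separates reasoning from the final answer); how the author did this is not stated in the commit.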