fieldvalley-llm2025 committed on
Commit 42a72bb · verified · 1 Parent(s): b1d006c

Upload README.md with huggingface_hub

  1. README.md +31 -14
README.md CHANGED
@@ -1,22 +1,39 @@
  ---
- base_model: unsloth/qwen3-4b-instruct-2507-unsloth-bnb-4bit
  tags:
- - text-generation-inference
- - transformers
  - unsloth
- - qwen3
- - trl
- license: apache-2.0
- language:
- - en
  ---

- # Uploaded model

- - **Developed by:** fieldvalley-llm2025
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/qwen3-4b-instruct-2507-unsloth-bnb-4bit

- This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

  ---
+ base_model: Qwen/Qwen3-4B-Instruct-2507
+ library_name: peft
+ license: other
  tags:
  - unsloth
+ - lora
+ - sft
+ - completion-only
+ - struct-eval
+ - hard-filter
+ model-index:
+ - name: main_rev2_sft04
+   results: []
  ---

+ # main_rev2_sft04

+ This is a **Safe SFT** LoRA adapter (REV2 SFT04).
+ It uses **Completion-only Training** and **Hard SFT Filtering**.

+ ## Base Model
+ Qwen/Qwen3-4B-Instruct-2507

+ ## Training Data (Mixed 65:35)
+ - 65%: daichira/structured-hard-sft-4k (Filtered High Quality)
+ - 35%: u-10bei/structured_data_with_cot_dataset_512_v4 (Filtered Output-only)
+
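
The 65:35 split in the Training Data section can be illustrated with a small sampling routine. This is a minimal sketch and not the author's pipeline: `mix_by_ratio` is a hypothetical helper, the two inputs are plain Python lists standing in for the two filtered datasets, and the choice to subsample rather than oversample is an assumption.

```python
import random


def mix_by_ratio(primary, secondary, primary_share=0.65, seed=0):
    """Return a shuffled mix in which `primary` rows make up `primary_share`.

    Hypothetical helper: it subsamples whichever pool is too large,
    so no row is ever duplicated to reach the target ratio.
    """
    if not 0.0 < primary_share < 1.0:
        raise ValueError("primary_share must be strictly between 0 and 1")
    rng = random.Random(seed)
    # Largest total size that both pools can support at the requested share.
    total = min(len(primary) / primary_share,
                len(secondary) / (1.0 - primary_share))
    n_primary = min(len(primary), int(round(total * primary_share)))
    n_secondary = min(len(secondary), int(round(total * (1.0 - primary_share))))
    mixed = rng.sample(primary, n_primary) + rng.sample(secondary, n_secondary)
    rng.shuffle(mixed)
    return mixed
```

With two pools of equal size, the first pool is used in full and the second is cut down until it contributes roughly 35% of the rows.
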
+ ## Hard Filter Applied
+ - Length Limit (Format-wise)
+ - Anti-Log/Audit Keywords
+ - Repetition Check (Tokens, Lines, Gases)
+
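
The README does not say how the repetition check is implemented. The sketch below shows one plausible form of the token and line checks only; the third listed category is left out because its meaning is not clear from the source. All function names and both thresholds (`max_run`, `max_dup_ratio`) are illustrative assumptions.

```python
from collections import Counter


def max_token_run(text):
    """Length of the longest run of identical consecutive whitespace tokens."""
    longest, current, previous = 0, 0, None
    for token in text.split():
        current = current + 1 if token == previous else 1
        longest = max(longest, current)
        previous = token
    return longest


def duplicate_line_ratio(text):
    """Fraction of non-empty lines that repeat an earlier line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return 0.0
    counts = Counter(lines)
    duplicates = sum(count - 1 for count in counts.values())
    return duplicates / len(lines)


def passes_repetition_check(text, max_run=4, max_dup_ratio=0.3):
    """Keep a sample only if both repetition measures stay under the limits.

    The default thresholds are assumed values, not taken from the README.
    """
    return (max_token_run(text) <= max_run
            and duplicate_line_ratio(text) <= max_dup_ratio)
```

A sample that fails either measure would be dropped before training. In practice the limits would need tuning per output format, since structured formats such as CSV repeat lines legitimately.
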
+ ## Method
+ - **Completion-only**: User prompts are masked (-100 output label).
+ - **Marker**: `
+ ### OUTPUT
+ ` inserted before assistant output.
+ - **Config**: 1 Epoch, Max Seq Length 4096.
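
Completion-only masking, as described in the Method section, can be sketched on plain token-id lists. The value -100 is the default `ignore_index` of PyTorch's `CrossEntropyLoss`, so positions carrying that label contribute nothing to the loss. The helper names, the example token ids, and the decision to mask the marker tokens themselves are assumptions for illustration; the README does not describe the actual implementation.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def find_subsequence(sequence, pattern):
    """Index of the first occurrence of `pattern` in `sequence`, or -1."""
    for start in range(len(sequence) - len(pattern) + 1):
        if sequence[start:start + len(pattern)] == pattern:
            return start
    return -1


def mask_prompt_labels(input_ids, marker_ids):
    """Copy `input_ids` to labels, masking everything up to the marker's end.

    Assumption: the marker tokens are masked along with the prompt. If the
    marker is missing, the whole sequence is masked so that the sample adds
    no loss.
    """
    if not marker_ids:
        raise ValueError("marker_ids must not be empty")
    labels = list(input_ids)
    start = find_subsequence(list(input_ids), list(marker_ids))
    completion_start = len(labels) if start < 0 else start + len(marker_ids)
    for i in range(completion_start):
        labels[i] = IGNORE_INDEX
    return labels
```

Here `marker_ids` stands for the tokenized form of the `### OUTPUT` marker. Token ids for a marker can differ depending on the surrounding text, so a real pipeline would need to tokenize the marker in context.
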