---
base_model: Qwen/Qwen3-4B-Instruct-2507
library_name: peft
license: other
tags:
- unsloth
- lora
- sft
- completion-only
- struct-eval
- hard-filter
model-index:
- name: main_rev2_sft04
  results: []
---

# main_rev2_sft04

This is a **Safe SFT** LoRA adapter (REV2 SFT04).
It uses **Completion-only Training** and **Hard SFT Filtering**.

## Base Model

Qwen/Qwen3-4B-Instruct-2507

## Training Data (Mixed 65:35)

- 65%: daichira/structured-hard-sft-4k (Filtered High Quality)
- 35%: u-10bei/structured_data_with_cot_dataset_512_v4 (Filtered Output-only)

## Hard Filter Applied

- Length Limit (Format-wise)
- Anti-Log/Audit Keywords
- Repetition Check (Tokens, Lines, Gases)

## Method

- **Completion-only**: User prompts are masked (-100 output label).
- **Marker**: ` ### OUTPUT ` inserted before assistant output.
- **Config**: 1 Epoch, Max Seq Length 4096.
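
The completion-only masking described under Method can be illustrated with a minimal sketch. It is not the training code used for this adapter. The helper names (`find_subsequence`, `mask_prompt_labels`) and the toy token ids are hypothetical, and masking the marker tokens together with the prompt is an assumption. The only fixed point is the label value -100, which PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def find_subsequence(seq, sub):
    """Return the start index of the first occurrence of sub in seq, or -1."""
    n, m = len(seq), len(sub)
    if m == 0:
        return -1
    for i in range(n - m + 1):
        if seq[i:i + m] == sub:
            return i
    return -1


def mask_prompt_labels(input_ids, marker_ids):
    """Build labels from input_ids, masking everything up to and including the marker.

    Assumption: the marker tokens are masked along with the user prompt, so only
    the assistant output after the marker contributes to the loss.
    """
    start = find_subsequence(input_ids, marker_ids)
    if start == -1:
        # No marker found: mask the whole example so it adds nothing to the loss.
        return [IGNORE_INDEX] * len(input_ids)
    boundary = start + len(marker_ids)
    return [IGNORE_INDEX] * boundary + list(input_ids[boundary:])


# Toy token ids (hypothetical, not real tokenizer output).
prompt_ids = [11, 12, 13]      # user prompt
marker_ids = [90, 91]          # stands in for the tokenized "### OUTPUT" marker
answer_ids = [21, 22, 23, 2]   # assistant output plus an end-of-sequence id

labels = mask_prompt_labels(prompt_ids + marker_ids + answer_ids, marker_ids)
print(labels)  # [-100, -100, -100, -100, -100, 21, 22, 23, 2]
```

In a real pipeline the marker ids would come from the tokenizer. Because tokenization of the marker can depend on the surrounding whitespace, it is worth checking that the marker is actually found in each tokenized example.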