--- base_model: nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 library_name: transformers tags: - llama-factory - lora-merged - bf16 - text-generation --- # Nemotron Super 49B AIE v11 5500 Merged BF16 This is a merged bf16 export of `nvidia/Llama-3_3-Nemotron-Super-49B-v1_5` with the local AIE v11 LoRA adapter merged into the base weights. Training summary: - Dataset: `aie_v11`, 5,227 examples. - Stage 1: 4-bit LoRA SFT at `cutoff_len=4096`, 2 epochs. - Stage 2: continued from the stage 1 adapter at `cutoff_len=5500`, 2 epochs. - Stage 2 learning rate: `3e-5`. - Stage 2 final train loss: `0.1407`. - LoRA target modules: `q_proj,v_proj`. - Export dtype: `bfloat16`. The merged model was exported with LLaMA-Factory and split into 21 safetensors shards.