Nemotron Super 49B AIE v11 5500 Merged BF16

This is a bf16 export of nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 with the local AIE v11 LoRA adapter merged into the base weights.

Training summary:

  • Dataset: aie_v11, 5,227 examples.
  • Stage 1: 4-bit LoRA SFT at cutoff_len=4096, 2 epochs.
  • Stage 2: continued from the stage 1 adapter at cutoff_len=5500, 2 epochs.
  • Stage 2 learning rate: 3e-5.
  • Stage 2 final train loss: 0.1407.
  • LoRA target modules: q_proj, v_proj.
  • Export dtype: bfloat16.
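
For readers who want to reproduce a similar run, the two stages above can be written down as argument dictionaries. This is only a sketch: it assumes the training was driven by LLaMA-Factory-style arguments (the card confirms LLaMA-Factory for the export only), that stage 2 kept the 4-bit setting from stage 1, and the output directories are hypothetical placeholders. The stage 1 learning rate is not stated in the summary, so it is left unset.

```python
# Sketch of the two-stage schedule as LLaMA-Factory-style argument dicts.
# Assumption: key names follow LLaMA-Factory's training arguments; the card
# itself does not publish the exact config files that were used.
BASE_MODEL = "nvidia/Llama-3_3-Nemotron-Super-49B-v1_5"

# Settings shared by both stages, taken from the training summary.
common = {
    "model_name_or_path": BASE_MODEL,
    "stage": "sft",
    "finetuning_type": "lora",
    "lora_target": "q_proj,v_proj",
    "quantization_bit": 4,        # assumed to apply to stage 2 as well
    "dataset": "aie_v11",
    "num_train_epochs": 2.0,
}

# Stage 1: shorter context window. Learning rate is not given in the card.
stage1 = {
    **common,
    "cutoff_len": 4096,
    "output_dir": "saves/aie_v11_stage1",   # hypothetical path
}

# Stage 2: resume from the stage 1 adapter with a longer cutoff.
stage2 = {
    **common,
    "cutoff_len": 5500,
    "learning_rate": 3e-5,
    "adapter_name_or_path": stage1["output_dir"],
    "output_dir": "saves/aie_v11_stage2",   # hypothetical path
}
```

Keeping the shared settings in one dictionary makes the only differences between stages explicit: the cutoff length, the stage 2 learning rate, and the adapter that stage 2 resumes from.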

The merged model was exported with LLaMA-Factory and split into 21 safetensors shards.
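
Because the adapter is already merged, the checkpoint should load like any other sharded safetensors model. The following is a minimal sketch, not a tested recipe: it assumes the transformers, torch, and accelerate packages are installed, and that the custom architecture of the base model requires trust_remote_code=True. The helper names are illustrative.

```python
def estimate_bf16_weight_gib(num_params: int) -> float:
    """Rough size of the weights alone in GiB (bf16 uses 2 bytes per parameter)."""
    return num_params * 2 / 2**30


def load_merged_model(repo_id: str):
    """Load tokenizer and merged model in bf16, spread over available devices."""
    # Imported lazily so the size helper works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,   # matches the export dtype
        device_map="auto",            # requires the accelerate package
        trust_remote_code=True,       # assumption: needed for this architecture
    )
    return tokenizer, model


# Example call (downloads all 21 shards, so it is left commented out):
# tokenizer, model = load_merged_model(
#     "SiddharthaChekuri/Nemotron-Super-49B-AIE-v11-5500-merged-bf16"
# )
```

As a rough guide, estimate_bf16_weight_gib(49_000_000_000) comes to about 91 GiB for the weights alone, or a little over 4 GiB per shard across 21 shards, before any activation or KV-cache memory is counted.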

Model repository: SiddharthaChekuri/Nemotron-Super-49B-AIE-v11-5500-merged-bf16 (fine-tuned from nvidia/Llama-3_3-Nemotron-Super-49B-v1_5).