noir-verdict-nemotron-4b-merged

How to use

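import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "sankalphs/noir-verdict-nemotron-4b-merged"

# trust_remote_code=True lets transformers run the custom code shipped in the repo.
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)

# Load the merged weights in bf16, move them to the GPU, and switch to inference mode.
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, trust_remote_code=True,
).cuda().eval()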

Chat template

The chat template is the Nemotron 3 chat template, with enable_thinking=False baked in. The system prompt for an active interrogation is built by engine/prompts.py:build_system_prompt(...).

messages = [
    {"role": "system", "content": "You are Greta Lindholm, junior continuity writer at WJBK. ..."},
    {"role": "user",   "content": "Where were you at the time of the theft?"},
]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)
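An interrogation usually runs over several turns, so the messages list grows as the conversation proceeds. A minimal sketch of that bookkeeping follows. The helper name extend_history is hypothetical and not part of this repo, and it assumes the chat template accepts the conventional "assistant" role for the suspect's replies.

```python
from typing import Dict, List

Message = Dict[str, str]


def extend_history(messages: List[Message], assistant_reply: str,
                   next_question: str) -> List[Message]:
    """Return a new history with the suspect's reply and the next question appended.

    Hypothetical helper: the "assistant" role name is the usual transformers
    convention and is assumed to be accepted by this model's chat template.
    """
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_question},
    ]


history = [
    {"role": "system", "content": "You are Greta Lindholm, junior continuity writer at WJBK. ..."},
    {"role": "user", "content": "Where were you at the time of the theft?"},
]
history = extend_history(history, "I was in the script room.", "Who saw you there?")
```

The extended list can be passed to tok.apply_chat_template(...) exactly as in the single-turn example above.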

Inference tips

  • n_ctx ≥ 4096
  • temperature 0.6–0.7, top_p 0.9–0.95
  • max_new_tokens 180–280 per turn
  • Stop on <|im_end|>
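The sampling settings above can be collected into keyword arguments for model.generate, with a small helper that cuts the decoded text at the stop marker. This is a sketch under stated assumptions: the helper names generation_kwargs and trim_at_stop are hypothetical, and the defaults are simply values picked from the ranges listed above.

```python
STOP_TOKEN = "<|im_end|>"  # stop marker from the tips above


def generation_kwargs(temperature: float = 0.6, top_p: float = 0.9,
                      max_new_tokens: int = 256) -> dict:
    """Build keyword arguments for transformers' model.generate (hypothetical helper).

    do_sample=True is required; without it transformers decodes greedily and
    ignores temperature and top_p.
    """
    return {
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
        "max_new_tokens": max_new_tokens,
    }


def trim_at_stop(text: str, stop: str = STOP_TOKEN) -> str:
    """Keep only the text before the first stop marker and strip whitespace."""
    return text.split(stop, 1)[0].strip()
```

With the tokenizer and model from "How to use", one plausible call is inputs = tok(text, return_tensors="pt").to(model.device), followed by out = model.generate(**inputs, **generation_kwargs()), then decoding only the newly generated tokens and passing the result through trim_at_stop.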

How it was trained

  • Image: nvidia/cuda:12.8.1-devel-ubuntu22.04 + Python 3.13
  • Pip deps: torch>=2.8.0, triton>=3.4.0, unsloth_zoo[base] @ git+https://github.com/unslothai/unsloth-zoo, unsloth[base] @ git+https://github.com/unslothai/unsloth, --torch-backend=cu128
  • Native: causal-conv1d==1.6.2.post1 and mamba-ssm==2.3.2.post1 compiled from source with --no-build-isolation, CC=gcc, CXX=g++ (no prebuilt cu128 + Py3.13 wheel exists)
  • Trainer: TRL SFTTrainer, packing, bf16, Unsloth LoRA (r=16, alpha=32, lr=2e-4 cosine, bs=2 grad_accum=8, 240 steps)
  • Orchestrator: train/modal_finetune.py
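For a sense of scale, the trainer settings above imply the following arithmetic. This is a back-of-the-envelope sketch, assuming bs=2 is the per-device batch size, training ran on a single GPU, and LoRA uses the standard alpha/r scaling; none of these is stated explicitly above.

```python
# Values taken from the trainer line above.
per_device_batch = 2
grad_accum = 8
steps = 240
lora_r = 16
lora_alpha = 32

n_gpus = 1  # assumption: single-GPU run

# Packed sequences contributing to each optimizer step.
effective_batch = per_device_batch * grad_accum * n_gpus

# Total packed sequences processed over the whole run.
sequences_seen = effective_batch * steps

# Standard (non-rsLoRA) scaling applied to the low-rank update.
lora_scale = lora_alpha / lora_r

print(effective_batch, sequences_seen, lora_scale)  # 16 3840 2.0
```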

5-case smoke results (A10G, --n-gpu-layers 99)

case  suspect         personality  truth_mode     failure_flags
0     Greta Lindholm  nervous      lie            none
37    (37, 1)         helpful      partial_truth  none
113   (113, 2)        arrogant     truth          none
241   (241, 3)        evasive      deflect        none
497   Greta Lindholm  nervous      lie            none

Pace: ~125 tokens/sec on A10G. No role-token leaks, no leaked <think> blocks, no overlong generations.
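The three failure modes named above could be checked with something like the following. This is an illustrative sketch, not the smoke harness used for the table: the function name, the flag names, the 280-token limit (taken from the inference tips), and the inclusion of <|im_start|> as a role token are all assumptions.

```python
import re
from typing import List

# Assumption: ChatML-style role markers; only <|im_end|> is confirmed by this card.
ROLE_TOKENS = ("<|im_start|>", "<|im_end|>")
THINK_RE = re.compile(r"</?think>")


def failure_flags(reply: str, n_tokens: int, max_tokens: int = 280) -> List[str]:
    """Return the failure flags raised by one generated reply (hypothetical check).

    reply is the decoded text after stop-token trimming; n_tokens is the number
    of tokens the model generated for it.
    """
    flags = []
    if any(tok in reply for tok in ROLE_TOKENS):
        flags.append("role_token_leak")
    if THINK_RE.search(reply):
        flags.append("think_leak")
    if n_tokens > max_tokens:
        flags.append("overlong")
    return flags
```

An empty list corresponds to "none" in the failure_flags column.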

Companion artifacts

License

Apache-2.0. The base Nemotron 3 Nano weights are governed by NVIDIA's model license; the adapter and training code in this repo are Apache-2.0.
