robgreenberg3 committed · Commit b4f6195 · verified · 1 Parent(s): 9a0b57e

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -281,7 +281,7 @@ export MODEL_CKPT=PATH/TO/MODEL/CHECKPOINT
  ```bash
  # Optional: --enable-expert-parallel
  vllm serve $MODEL_CKPT \
- --served-model-name nvidia/nemotron-3-super \
+ --served-model-name RedHatAI/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 \
  --async-scheduling \
  --dtype auto \
  --kv-cache-dtype fp8 \
@@ -316,7 +316,7 @@ For more detailed information, please see [this cookbook](https://github.com/NVI
  ```bash
  sglang serve \
  --model-path PATH/TO/CHECKPOINT \
- --served-model-name nvidia/nemotron-3-super \
+ --served-model-name RedHatAI/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 \
  --trust-remote-code \
  --tp 8 \
  --ep 8 \
@@ -379,7 +379,7 @@ The examples below use the OpenAI-compatible client and work with any of the ser
  ```python
  from openai import OpenAI
  client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
- MODEL = "nvidia/nemotron-3-super"
+ MODEL = "RedHatAI/NVIDIA-Nemotron-3-Super-120B-A12B-BF16"
  ```

  **Reasoning ON (default)**
@@ -446,7 +446,7 @@ Create or update your `~/.config/opencode/opencode.json`:
  },
  "models": {
  "nvidia-nemotron-3-super": {
- "name": "nvidia/nemotron-3-super",
+ "name": "RedHatAI/NVIDIA-Nemotron-3-Super-120B-A12B-BF16",
  "limit": {
  "context": 1000000,
  "output": 32768
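
Because this commit renames the served model in four places, a client still using the old `nvidia/nemotron-3-super` string would likely be rejected by a server started with the new `--served-model-name`. The sketch below is one way to check that the two sides agree. `pick_served_model` and `check_live_server` are hypothetical helpers, not part of the README; the base URL and API key are the ones the README's client example uses, and the check assumes the server exposes the OpenAI-compatible model listing endpoint.

```python
from typing import Iterable

# New served model name introduced by this commit.
MODEL = "RedHatAI/NVIDIA-Nemotron-3-Super-120B-A12B-BF16"


def pick_served_model(served_ids: Iterable[str], wanted: str = MODEL) -> str:
    """Return `wanted` if the server advertises it; otherwise raise and list what it serves.

    Hypothetical helper for illustration only.
    """
    ids = list(served_ids)
    if wanted in ids:
        return wanted
    raise ValueError(f"{wanted!r} is not served; available: {ids}")


def check_live_server(base_url: str = "http://localhost:8000/v1") -> str:
    """Query a running server for its model IDs and confirm MODEL is among them.

    Assumes the `openai` package is installed and a server is already listening
    at `base_url`; the import is local so the helper above works without it.
    """
    from openai import OpenAI

    client = OpenAI(base_url=base_url, api_key="EMPTY")
    served = [m.id for m in client.models.list()]
    return pick_served_model(served)
```

Calling `check_live_server()` after starting either server is a quick way to catch a stale name before sending chat requests.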