Beambutbetter/Deepseek-V2-Lite-16B-NVFP4

Text Generation

8-bit precision

Beambutbetter committed on May 19

Commit 6e19278 · verified · 1 Parent(s): ce1d15a

Update README.md

Files changed (1)

README.md +29 -17

@@ -1,40 +1,52 @@
 ---
 license: apache-2.0
 language:
-- th
 - en
 - zh
 base_model:
 - deepseek-ai/DeepSeek-V2-Lite
 pipeline_tag: text-generation
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).
-## Model Details
-### Model Description
-DeepSeek-V2-Lite Quantized to NVFP4 Using TensorRT-Model-Optimizer. The model has 15.6B parameters in total, in Nvidia float precision 4 bit.
-Require Blackwell GPU, NVFP4 supported inference engine(if use with inference engine)
 - **Developed by:** Krisakorn Chanthasang
-- **Model type:** LLM TextGeneration
-- **Language(s) (NLP):** Chinese English Thai
 - **License:** Apache 2.0
-- **Quantized from model:** DeepSeek-V2-Lite
-### Model Information
-Read https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite for Basic model information
-## How to Get Started with the Model
-GPU requirement: Blackwell Series
-## Hardware
-Quantization Hardware: Nvidia RTX PRO 6000 Blackwell Workstation

 ---
 license: apache-2.0
 language:
 - en
 - zh
 base_model:
 - deepseek-ai/DeepSeek-V2-Lite
 pipeline_tag: text-generation
 ---
+# DeepSeek-V2-Lite NVFP4 Quantized
+This repository contains an NVFP4-quantized version of **DeepSeek-V2-Lite**, prepared using **TensorRT Model Optimizer**.
+The model has approximately **15.6B total parameters** and is quantized to **NVIDIA 4-bit floating-point precision (NVFP4)**.
+This is my first quantized model.
+## Model Details
 - **Developed by:** Krisakorn Chanthasang
+- **Model type:** Large Language Model for text generation
+- **Languages:** English, Chinese
 - **License:** Apache 2.0
+- **Base model:** deepseek-ai/DeepSeek-V2-Lite
+- **Quantization format:** NVFP4
+- **Quantization tool:** TensorRT Model Optimizer
+## Base Model Information
+For details about the original model architecture, training data, and intended usage, see the official base model page:
+https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite
+## Requirements
+This model requires hardware and inference software with **NVFP4 support**.
+### Hardware Requirement
+- NVIDIA Blackwell GPU or newer
+- Tested/quantized using: **NVIDIA RTX PRO 6000 Blackwell Workstation**
+## How to Use
+Use this model with an inference engine that supports **NVFP4** quantized models.
+Compatibility depends on your runtime, GPU architecture, and TensorRT/NVIDIA software stack.
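Before loading the model, it can help to confirm that the local GPU is Blackwell-class. The snippet below is a minimal sketch, not part of this repository: the helper names `supports_nvfp4` and `check_local_gpu` are hypothetical, and the threshold assumes that Blackwell GPUs report CUDA compute capability 10.x or 12.x while earlier generations (Hopper 9.0, Ada 8.9) report lower values. A passing check only covers the hardware side; the inference engine must still support NVFP4.

```python
def supports_nvfp4(major: int, minor: int = 0) -> bool:
    """Return True if a compute capability looks Blackwell-class or newer.

    Assumption: Blackwell parts report major version 10 or 12, and older
    architectures report 9 or lower. `minor` is accepted for completeness.
    """
    return major >= 10


def check_local_gpu() -> str:
    """Describe whether the first visible CUDA device passes the check."""
    try:
        import torch  # optional dependency, only needed for the live query
    except ImportError:
        return "PyTorch is not installed; cannot query the GPU."
    if not torch.cuda.is_available():
        return "No CUDA device is visible."
    major, minor = torch.cuda.get_device_capability(0)
    name = torch.cuda.get_device_name(0)
    verdict = "looks NVFP4-capable" if supports_nvfp4(major, minor) else "is older than Blackwell"
    return f"{name} (compute capability {major}.{minor}) {verdict}."


if __name__ == "__main__":
    print(check_local_gpu())
```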
+## Notes
+This is a quantized derivative of DeepSeek-V2-Lite. Accuracy, throughput, and memory usage may differ from the original model depending on the inference engine and hardware used.
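As a rough illustration of the memory difference, the sketch below estimates weight storage for the 15.6B parameters stated above. It assumes the usual NVFP4 layout of 4-bit elements with one 8-bit scale shared by each block of 16 values, and it pretends that every parameter is quantized. In practice some layers typically stay in higher precision, and the KV cache and activations are not counted, so treat the result as an optimistic estimate rather than a measurement of this checkpoint.

```python
TOTAL_PARAMS = 15.6e9  # total parameter count stated in this model card


def nvfp4_bits_per_weight(block_size: int = 16,
                          element_bits: int = 4,
                          scale_bits: int = 8) -> float:
    """Effective bits per weight: element bits plus the amortized block scale."""
    return element_bits + scale_bits / block_size


def weight_gigabytes(num_params: float, bits_per_weight: float) -> float:
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    nvfp4_gb = weight_gigabytes(TOTAL_PARAMS, nvfp4_bits_per_weight())
    bf16_gb = weight_gigabytes(TOTAL_PARAMS, 16)
    print(f"NVFP4 weights (all layers quantized, assumed): {nvfp4_gb:.1f} GB")
    print(f"BF16 weights for comparison: {bf16_gb:.1f} GB")
```

Under these assumptions the weights come to roughly 8.8 GB in NVFP4 versus about 31.2 GB in BF16.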