Beambutbetter committed on
Commit 6e19278 · verified · 1 Parent(s): ce1d15a

Update README.md

Files changed (1)
  1. README.md +29 -17
README.md CHANGED
@@ -1,40 +1,52 @@
  ---
  license: apache-2.0
  language:
- - th
  - en
  - zh
  base_model:
  - deepseek-ai/DeepSeek-V2-Lite
  pipeline_tag: text-generation
  ---
- # Model Card for Model ID
 
- <!-- Provide a quick summary of what the model is/does. -->
+ # DeepSeek-V2-Lite NVFP4 Quantized
 
- This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).
+ This repository contains an NVFP4-quantized version of **DeepSeek-V2-Lite**, prepared using **TensorRT Model Optimizer**.
 
- ## Model Details
+ The model has approximately **15.6B total parameters** and is quantized to **NVIDIA 4-bit floating-point precision (NVFP4)**.
 
- ### Model Description
+ This is my first quantized model.
 
- DeepSeek-V2-Lite Quantized to NVFP4 Using TensorRT-Model-Optimizer. The model has 15.6B parameters in total, in Nvidia float precision 4 bit.
- Require Blackwell GPU, NVFP4 supported inference engine(if use with inference engine)
+ ## Model Details
 
  - **Developed by:** Krisakorn Chanthasang
- - **Model type:** LLM TextGeneration
- - **Language(s) (NLP):** Chinese English Thai
+ - **Model type:** Large Language Model for text generation
+ - **Languages:** English, Chinese
  - **License:** Apache 2.0
- - **Quantized from model:** DeepSeek-V2-Lite
- ### Model Information
+ - **Base model:** deepseek-ai/DeepSeek-V2-Lite
+ - **Quantization format:** NVFP4
+ - **Quantization tool:** TensorRT Model Optimizer
+
+ ## Base Model Information
+
+ For details about the original model architecture, training data, and intended usage, see the official base model page:
+
+ https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite
+
+ ## Requirements
+
+ This model requires hardware and inference software with **NVFP4 support**.
+
+ ### Hardware Requirement
 
- Read https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite for Basic model information
+ - NVIDIA Blackwell GPU or newer
+ - Tested/quantized using: **NVIDIA RTX PRO 6000 Blackwell Workstation**
 
- ## How to Get Started with the Model
+ ## How to Use
 
- GPU requirement: Blackwell Series
+ Use this model with an inference engine that supports **NVFP4** quantized models.
 
- ## Hardware
+ Compatibility depends on your runtime, GPU architecture, and TensorRT/NVIDIA software stack.
 
- Quantization Hardware: Nvidia RTX PRO 6000 Blackwell Workstation
+ ## Notes
 
+ This is a quantized derivative of DeepSeek-V2-Lite. Accuracy, throughput, and memory usage may differ from the original model depending on the inference engine and hardware used.
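
The updated README notes that memory usage may differ from the original model. As a rough illustration only, the sketch below estimates weight storage for the 15.6B parameters quoted in the README. It rests on assumptions that the README does not state: that every parameter is quantized, that NVFP4 stores one 8-bit scale per block of 16 values (about 4.5 bits per weight), and that the comparison baseline is BF16. The function names are hypothetical and do not come from TensorRT Model Optimizer.

```python
# Back-of-the-envelope weight-memory estimate for an NVFP4 checkpoint.
# Assumptions (not stated in the README): every parameter is quantized,
# each block of 16 FP4 values shares one 8-bit scale, and the small
# per-tensor scales are ignored.

FP4_BITS = 4
BLOCK_SIZE = 16   # assumed NVFP4 block size
SCALE_BITS = 8    # assumed 8-bit scale per block


def nvfp4_weight_bytes(n_params: float) -> float:
    """Approximate bytes needed to store n_params weights in NVFP4."""
    bits_per_weight = FP4_BITS + SCALE_BITS / BLOCK_SIZE  # 4.5 bits
    return n_params * bits_per_weight / 8


def bf16_weight_bytes(n_params: float) -> float:
    """Bytes needed to store n_params weights in BF16 (2 bytes each)."""
    return n_params * 2


if __name__ == "__main__":
    total_params = 15.6e9  # figure quoted in the README
    gib = 2 ** 30
    print(f"NVFP4: {nvfp4_weight_bytes(total_params) / gib:.1f} GiB")
    print(f"BF16:  {bf16_weight_bytes(total_params) / gib:.1f} GiB")
```

Real usage is likely to be higher than the NVFP4 figure, since quantization recipes often keep some layers in higher precision and the runtime also needs memory for the KV cache and activations.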