---
license: apache-2.0
language:
- en
- zh
base_model:
- deepseek-ai/DeepSeek-V2-Lite
pipeline_tag: text-generation
---

# DeepSeek-V2-Lite NVFP4 Quantized

This repository contains an NVFP4-quantized version of **DeepSeek-V2-Lite**, prepared using **TensorRT Model Optimizer**.

The model has approximately **15.6B total parameters** and is quantized to **NVIDIA 4-bit floating-point precision (NVFP4)**.

This is my first quantized model.

## Model Details

- **Developed by:** Krisakorn Chanthasang
- **Model type:** Large Language Model for text generation
- **Languages:** English, Chinese
- **License:** Apache 2.0
- **Base model:** deepseek-ai/DeepSeek-V2-Lite
- **Quantization format:** NVFP4
- **Quantization tool:** TensorRT Model Optimizer

## Base Model Information

For details about the original model architecture, training data, and intended usage, see the official base model page:

https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite

## Requirements

This model requires hardware and inference software with **NVFP4 support**.

### Hardware Requirement

- NVIDIA Blackwell GPU or newer
- Tested/quantized using: **NVIDIA RTX PRO 6000 Blackwell Workstation**

## How to Use

Use this model with an inference engine that supports **NVFP4** quantized models.

Compatibility depends on your runtime, GPU architecture, and TensorRT/NVIDIA software stack.

## Notes

This is a quantized derivative of DeepSeek-V2-Lite. Accuracy, throughput, and memory usage may differ from the original model depending on the inference engine and hardware used.
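
To give a rough sense of the memory difference mentioned above, the sketch below estimates weight storage for about 15.6B parameters in BF16 versus NVFP4. It is a back-of-the-envelope calculation, not a measurement of this checkpoint. It assumes every parameter is quantized, which is usually not the case in practice since some layers are commonly kept in higher precision, and it ignores the KV cache, activations, and runtime overhead. The function name `estimate_weight_gib` is hypothetical. The NVFP4 figures assume 4-bit values with one 8-bit scale per block of 16 values.

```python
def estimate_weight_gib(num_params, bits_per_weight, block_size=None, scale_bits=0):
    """Estimate weight storage in GiB (hypothetical helper, illustration only).

    If block_size is given, one scale of scale_bits is added per block of weights.
    """
    total_bits = num_params * bits_per_weight
    if block_size:
        total_bits += (num_params / block_size) * scale_bits
    return total_bits / 8 / 2**30


NUM_PARAMS = 15.6e9  # approximate total parameter count stated in this card

# BF16 baseline: 16 bits per weight, no block scales.
bf16_gib = estimate_weight_gib(NUM_PARAMS, bits_per_weight=16)

# NVFP4 (assumed layout): 4-bit values plus one 8-bit scale per 16 values,
# which works out to 4.5 bits per weight on average.
nvfp4_gib = estimate_weight_gib(NUM_PARAMS, bits_per_weight=4, block_size=16, scale_bits=8)

print(f"BF16 weights (estimate):  {bf16_gib:.1f} GiB")
print(f"NVFP4 weights (estimate): {nvfp4_gib:.1f} GiB")
print(f"Ratio NVFP4 / BF16:       {nvfp4_gib / bf16_gib:.3f}")
```

The actual on-disk and in-memory size of this repository's files will differ from the estimate, depending on which layers were left unquantized and on the inference engine used.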