Update README.md
README.md
CHANGED
@@ -10,7 +10,10 @@ tags:
 
 ## Simple Llama 3.1 8B-Instruct model quantized using GPTQ v1 with C2/en 256 rows of calibration data
 
-
+
+This is not a production ready quant model but one used to evaluate GPTQ v1 vs GPTQ v2 for post-quant comparison.
+
+GPTQ v2 is hosted at: https://huggingface.co/ModelCloud/GPTQ-v2-Llama-3.1-8B-Instruct
 
 ## Eval Script using GPTQModel (main branch) and Marlin kernel + lm-eval (main branch)
 