---
license: mit
base_model:
  - grimjim/gemma-3-12b-it-norm-preserved-biprojected-abliterated
---

GGUF quants of: https://huggingface.co/grimjim/gemma-3-12b-it-norm-preserved-biprojected-abliterated


Context limits per quant (RTX 3060 12GB, F16 k/v cache, no offload):

| quant | ctx |
|---|---|
| Q2_K_S | 16k |
| iQ3_S | 15k |
| Q3_K_S | 15k |
| iQ4_XS | 12k |
| iQ4_NL | 10k (16k with q8_0 k/v) |
| Q4_K_S | 10k (16k with q8_0 k/v) |
| Q5_K_S | 8k (15k with q8_0 k/v) |
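
A minimal sketch of using the figures above to shortlist quants for a target context size. The numbers are copied from the table; the names `CTX_LIMITS_K` and `quants_for_context` are hypothetical helpers, not part of any tool. It assumes the `( " )` entry for Q4_K_S repeats the 16k q8_0 figure from the row above it, and it falls back to the F16 figure for quants with no q8_0 figure listed.

```python
# Context limits in thousands of tokens, copied from the table above
# (RTX 3060 12GB, no offload).
# Each value is (limit with F16 k/v, limit with q8_0 k/v or None if not listed).
CTX_LIMITS_K = {
    "Q2_K_S": (16, None),
    "iQ3_S": (15, None),
    "Q3_K_S": (15, None),
    "iQ4_XS": (12, None),
    "iQ4_NL": (10, 16),
    "Q4_K_S": (10, 16),  # assumption: the ditto mark repeats "16k with q8_0 k/v"
    "Q5_K_S": (8, 15),
}


def quants_for_context(required_k, q8_kv=False):
    """Return the quants whose listed limit is at least required_k thousand tokens."""
    fitting = []
    for quant, (f16_limit, q8_limit) in CTX_LIMITS_K.items():
        # Assumption: with no q8_0 figure listed, reuse the F16 figure.
        limit = q8_limit if (q8_kv and q8_limit is not None) else f16_limit
        if limit >= required_k:
            fitting.append(quant)
    return fitting


if __name__ == "__main__":
    print(quants_for_context(12))
    print(quants_for_context(12, q8_kv=True))
```

These limits were measured on one 12GB card, so treat the result as a starting point on other hardware.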