RWKV7-G1G 7.2B - GGUF Quantizations

GGUF conversions of BlinkDL/rwkv7-g1g-7.2b (Apache 2.0)
for use with llama.cpp, LM Studio, and other GGUF-compatible runtimes.

All quants were converted directly from the BF16 base weights for the best achievable quality.
These are static quantizations: no importance matrix (imatrix) was used. For imatrix quantizations, see RemySkye/rwkv7-g1g-7.2b-i1-GGUF.

The S/M/L K-quant variants made no difference to file size for the RWKV architecture, so only the _M variants are included.

See UncheatableEval for benchmarks.

Note: Q4_0, Q3_K_M, and Q2_K were tested but produced unintelligible, broken output without an importance matrix. Quantizations below roughly 4.5 BPW (bits per weight) would require an imatrix, which these static quants do not use. Stick to Q4_K_M or higher for usable results.
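
The note above can be turned into a small pre-download check. The following is a minimal sketch, not part of this repository: the helper names `is_usable_static_quant` and `estimate_size_gb` are hypothetical, the parameter count of 7.2e9 is taken from the model name, and the bits-per-weight value is something you supply yourself. The size formula is a rough estimate that ignores GGUF metadata and any tensors kept at higher precision.

```python
# Quant types that the note above reports as broken without an imatrix.
# Only these three are listed; other low-bit types are simply not covered here.
BROKEN_WITHOUT_IMATRIX = {"Q4_0", "Q3_K_M", "Q2_K"}


def is_usable_static_quant(quant: str) -> bool:
    """Return False for quant types reported as unintelligible in this repo."""
    return quant.upper() not in BROKEN_WITHOUT_IMATRIX


def estimate_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough file size in GB (10^9 bytes): parameters * bits / 8.

    Assumption: every weight is stored at the same average bit width,
    which is only approximately true for mixed K-quants.
    """
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    n_params = 7.2e9  # assumed from the "7.2b" in the model name
    for quant in ("Q2_K", "Q4_0", "Q4_K_M", "Q8_0"):
        status = "ok" if is_usable_static_quant(quant) else "needs imatrix"
        print(f"{quant}: {status}")
    # Example with a user-supplied average of 8 bits per weight.
    print(f"~{estimate_size_gb(n_params, 8.0):.1f} GB at 8 BPW")
```

The check is deliberately a deny-list of the three types named in the note rather than a BPW threshold, since the actual average BPW of each file depends on how the converter mixes tensor types.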

GGUF
Model size: 7B params
Architecture: rwkv7
Available bit widths: 4-bit, 5-bit, 6-bit, 8-bit, 16-bit

Model tree for RemySkye/rwkv7-g1g-7.2b-GGUF

Base model: BlinkDL/rwkv7-g1
Quantized (16): this model