Commit 9a168a7 by ubergarm · 1 Parent(s): 910390b

uploading custom Q3_K imatrix quantized mix

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -22,8 +22,10 @@ Check the `logs/` directory for details on imatrix calculation.
 
 I'll upload more if/when ik_llama.cpp support is merged.
 
+It seems to inference very slowly on CPU-only and probably requires at least one GPU to handle attention/kv-cache/delta-net stuff as it is much faster even hybrid CPU+GPU.
+
 ## Q3_K 179.97 GiB (3.90 BPW)
-TODO
+TODO Perplexity Calculations
 
 <details>
 
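
The `Q3_K 179.97 GiB (3.90 BPW)` heading relates file size to average bits per weight. As a rough sanity check, the sketch below inverts that relation to estimate how many weights the file holds. It assumes BPW means total file bits divided by weight count; the function names are hypothetical and are not part of llama.cpp or ik_llama.cpp. Both inputs are rounded, so the result is only an estimate.

```python
GIB = 1024 ** 3  # bytes per GiB


def implied_weights(size_gib: float, bpw: float) -> float:
    """Estimate the weight count from a file size (GiB) and average bits per weight."""
    total_bits = size_gib * GIB * 8
    return total_bits / bpw


def bits_per_weight(size_gib: float, n_weights: float) -> float:
    """Inverse relation: average bits per weight for a given file size and weight count."""
    return size_gib * GIB * 8 / n_weights


if __name__ == "__main__":
    # Figures taken from the README heading in this commit.
    n = implied_weights(179.97, 3.90)
    print(f"implied weights: {n / 1e9:.1f} B")
    print(f"round-trip BPW:  {bits_per_weight(179.97, n):.2f}")
```

The same helper can be reused to compare other quant mixes by size and BPW once they are uploaded.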