IQ2_M?

#1
by elpirater312 - opened

Would be nice to have the abliterated IQ2_M quant; it's the best fit for a 256GB RAM system with a single 3090 at decent quality. Thanks.

It's not just about the model weights; you also need to factor in the context window. The KV cache grows with context length, so running a 1-million-token context takes far more GPU memory than a 256k one.
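To put rough numbers on that, here is a back-of-the-envelope sketch of KV cache size for a standard attention layout. The layer count, KV head count, and head dimension below are hypothetical placeholders, not this model's real config, and the formula assumes an unquantized FP16 cache with no sliding window or compressed attention, so treat the result as an order-of-magnitude estimate only.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Estimate KV cache size in bytes for plain (grouped-query) attention.

    The leading 2 accounts for one K and one V tensor per layer.
    bytes_per_elem=2 assumes an FP16/BF16 cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


GIB = 1024 ** 3

# Hypothetical architecture values, chosen only for illustration.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 60, 8, 128

# "256k" and "1M" are taken here as 262,144 and 1,048,576 tokens (an assumption).
for label, ctx in [("256k", 256 * 1024), ("1M", 1024 * 1024)]:
    size = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, ctx)
    print(f"{label} context: {size / GIB:.0f} GiB KV cache")
```

With these made-up numbers the cache alone comes to about 60 GiB at 256k and 240 GiB at 1M, on top of the weights. Swap in the values from the model's actual config to get a real estimate; a quantized KV cache would shrink it proportionally.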
