Methodology?

#1
by JDWarner - opened

I noticed this wasn't part of the official release package, and considered making one myself with Intel AutoRound. The README doesn't have any info about methods or reproducibility.

Would you consider adding that to the README and here? I'm specifically interested in the package and flags used, as well as the calibration data.

This was created by using the tokenizer and config from google/gemma-4-26B-A4B-it-qat-q4_0-unquantized, copying and processing the weights from google/gemma-4-26B-A4B-it-qat-q4_0-gguf, and adding some necessary padding for vLLM. Therefore, no calibration data or flags were involved.
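For readers who want a feel for what "processing the weights" from a q4_0 GGUF can involve, here is a minimal sketch and not the actual conversion script. It assumes the standard ggml Q4_0 block layout (32 weights per block: one little-endian fp16 scale followed by 16 bytes of packed 4-bit values, with low nibbles holding the first 16 weights and high nibbles the last 16). The names `dequantize_q4_0` and `pad_to_multiple` are hypothetical helpers, and the thread does not say which dimension was padded or to what multiple, so those are placeholder parameters.

```python
import numpy as np

QK4_0 = 32                     # weights per Q4_0 block
BLOCK_BYTES = 2 + QK4_0 // 2   # fp16 scale + 16 bytes of packed nibbles


def dequantize_q4_0(raw: bytes) -> np.ndarray:
    """Expand raw Q4_0 blocks into a flat float32 array (illustrative only)."""
    blocks = np.frombuffer(raw, dtype=np.uint8).reshape(-1, BLOCK_BYTES)
    # First two bytes of each block: the per-block scale as little-endian fp16.
    scales = blocks[:, :2].copy().view("<f2").astype(np.float32)  # shape (n, 1)
    packed = blocks[:, 2:]
    # Each byte stores two 4-bit values, offset by 8 to make them signed.
    low = (packed & 0x0F).astype(np.int8) - 8    # weights 0..15 of the block
    high = (packed >> 4).astype(np.int8) - 8     # weights 16..31 of the block
    q = np.concatenate([low, high], axis=1)
    return (q.astype(np.float32) * scales).reshape(-1)


def pad_to_multiple(w: np.ndarray, multiple: int, axis: int = -1) -> np.ndarray:
    """Zero-pad one axis up to the next multiple (hypothetical padding step)."""
    pad = (-w.shape[axis]) % multiple
    widths = [(0, 0)] * w.ndim
    widths[axis] = (0, pad)
    return np.pad(w, widths)
```

Because the stored 4-bit values and scales are read as-is, a conversion along these lines needs no calibration set; it only changes how the existing quantized weights are laid out.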
