Quantization

#16
by jntjdbhvebjynfbjdv - opened

Is there any way to quantize the model's weights and activations (fp32 -> int8) for GPU inference, similar to post-training static quantization?
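
For reference, the kind of scheme being asked about can be sketched in a few lines. This is only a generic, pure-Python illustration of post-training static int8 quantization. It assumes symmetric per-tensor scales and a max-abs calibration rule, and the helper names (`calibrate_scale`, `quantize`, `dequantize`) are hypothetical. It is not tied to this model or to any particular GPU runtime.

```python
def calibrate_scale(samples, num_bits=8):
    """Pick a symmetric per-tensor scale from sample data (max-abs rule, an assumption)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(x) for x in samples)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Map floats to integers in [-qmax, qmax], clipping anything out of range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map quantized integers back to approximate floats."""
    return [q * scale for q in q_values]


# Weights: the scale is computed once, offline, from the weights themselves.
weights = [0.5, -1.27, 0.02, 1.0]
w_scale = calibrate_scale(weights)
w_q = quantize(weights, w_scale)

# Activations: "static" means the scale is fixed ahead of time from a
# calibration set, then reused unchanged for inputs seen at inference time.
calibration_activations = [0.1, 2.54, -0.7, 1.3]
a_scale = calibrate_scale(calibration_activations)
new_activations = [2.0, -3.0, 0.4, 0.0]  # -3.0 falls outside the calibrated range
a_q = quantize(new_activations, a_scale)

# Integer dot product with a wide accumulator, rescaled to float at the end.
acc = sum(w * a for w, a in zip(w_q, a_q))
approx = acc * w_scale * a_scale
exact = sum(w * a for w, a in zip(weights, new_activations))
print(w_q, a_q, acc, approx, exact)
```

The out-of-range activation is clipped, so `approx` drifts away from `exact`. That is the usual trade-off of a static activation scale: accuracy depends on how well the calibration data covers the inputs seen later.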
