Alibaba-NLP/gte-base-en-v1.5


Quantization

#16

by jntjdbhvebjynfbjdv - opened Feb 12, 2025

jntjdbhvebjynfbjdv • edited Feb 12, 2025

Is there any way to quantize the model's weights and activations (fp32 -> int8) for GPU inference, something like post-training static quantization?
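To make the request concrete, here is a minimal sketch of what symmetric, per-tensor fp32 -> int8 post-training static quantization of a single linear layer might look like. It is plain Python for illustration only: it does not run on a GPU, it is not taken from the gte-base-en-v1.5 code, and every name in it (`calibrate_scale`, `quantize`, `int8_linear`, the toy weights and calibration data) is a hypothetical stand-in.

```python
# Illustrative sketch: symmetric per-tensor int8 post-training static quantization.
# All function names and numbers are hypothetical, not part of any library.

def calibrate_scale(samples, num_bits=8):
    """Choose a scale so the largest observed magnitude maps to the int8 limit."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(v) for row in samples for v in row)
    return max_abs / qmax if max_abs > 0 else 1.0

def quantize(values, scale, num_bits=8):
    """Round fp32 values to integers and clamp them to [-127, 127]."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]

def int8_linear(x_q, w_q, x_scale, w_scale):
    """Integer dot products, followed by a single float rescale per output."""
    return [sum(a * b for a, b in zip(x_q, row)) * x_scale * w_scale for row in w_q]

# Toy fp32 weights (2 outputs, 3 inputs) and a small calibration set of activations.
weights = [[0.5, -1.0, 0.25], [1.5, 0.75, -0.5]]
calibration = [[1.0, -2.0, 0.5], [0.25, 1.5, -1.0]]

# "Static" means both scales are fixed ahead of time, before inference.
w_scale = calibrate_scale(weights)
x_scale = calibrate_scale(calibration)
w_q = [quantize(row, w_scale) for row in weights]

# Inference: quantize the incoming activation with the precomputed scale.
x = [1.0, -2.0, 0.5]
x_q = quantize(x, x_scale)
y_int8 = int8_linear(x_q, w_q, x_scale, w_scale)

# fp32 reference for comparison.
y_fp32 = [sum(a * b for a, b in zip(x, row)) for row in weights]
print("int8:", y_fp32 and y_int8)
print("fp32:", y_fp32)
```

The key property is that the activation scale comes from calibration data rather than being computed per input at run time, which is what distinguishes static from dynamic quantization.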
