How to use Xenova/stablelm-2-1_6b with Transformers.js:
// npm i @huggingface/transformers
import { pipeline } from '@huggingface/transformers';

// Allocate pipeline
const pipe = await pipeline('text-generation', 'Xenova/stablelm-2-1_6b');
Quantized model too big?
#1
by D4ve-R - opened
Hi, I noticed that the quantized model is 1.4 GB and the normal model is only 140 MB. Shouldn't it be the other way around?
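The unquantized model is actually around 6.5 GB: the weights (decoder_model_merged.onnx_data) are stored in a separate file because of protobuf limitations (a single serialized protobuf message cannot exceed 2 GB). The ~140 MB .onnx file holds only what is left once the weights are moved out.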
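As a rough sanity check on those numbers, file size can be estimated as parameter count times bytes per weight. The sketch below assumes about 1.6 billion parameters (read off the model name, not an exact count) and ignores graph and metadata overhead; `estimateSizeGB` and `BYTES_PER_PARAM` are illustrative names, not part of any library.

```javascript
// Rough size estimate: parameters x bytes per parameter.
// ASSUMPTION: ~1.6e9 parameters, taken from the model name; not an exact count.
const PARAM_COUNT = 1.6e9;

// Bytes used to store one weight for common storage types.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, int8: 1 };

// Returns the approximate weight size in decimal gigabytes (1 GB = 1e9 bytes).
function estimateSizeGB(paramCount, dtype) {
  const bytes = BYTES_PER_PARAM[dtype];
  if (bytes === undefined) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  return (paramCount * bytes) / 1e9;
}

// Print one estimate per storage type.
for (const dtype of Object.keys(BYTES_PER_PARAM)) {
  console.log(`${dtype}: ~${estimateSizeGB(PARAM_COUNT, dtype).toFixed(1)} GB`);
}
```

Under these assumptions, 32-bit weights come to roughly 6.4 GB and 8-bit weights to roughly 1.6 GB, which is in the same range as the ~6.5 GB and 1.4 GB figures in this thread. A quantized export of that size can fit in a single .onnx file, while the full-precision weights have to be split out.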
OK, got you! Sorry, it seems I need to do some research on how ONNX works.
D4ve-R changed discussion status to closed