How to use failspy/InternVL-Chat-V1-5-4bit with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-feature-extraction", model="failspy/InternVL-Chat-V1-5-4bit", trust_remote_code=True)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("failspy/InternVL-Chat-V1-5-4bit", trust_remote_code=True, dtype="auto")
How should the calling code be written?
#1
by sunjunlishi - opened
I tried the code from the original repository and also a basic multimodal calling approach; neither worked.
ValueError: Calling cuda() is not supported for 4-bit or 8-bit quantized models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct dtype
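The error above suggests the loading code calls `.cuda()` on a model that bitsandbytes has already placed on a device. A minimal sketch of a guard follows. It assumes that Transformers marks quantized models with the attributes `is_loaded_in_4bit`, `is_loaded_in_8bit`, or `is_quantized` (read defensively with `getattr`), and that the `dtype="auto"` argument from the snippet on this page is accepted by the installed version. `place_on_gpu` and `load_model` are hypothetical helper names, not part of any library.

```python
def place_on_gpu(model):
    """Put the model in eval mode and move it to the GPU only if that is allowed.

    Assumption: bitsandbytes-quantized models expose one of the flags below.
    Such models are already on the right device, and calling .cuda() on them
    raises the ValueError quoted in this thread.
    """
    quantized = (
        getattr(model, "is_loaded_in_4bit", False)
        or getattr(model, "is_loaded_in_8bit", False)
        or getattr(model, "is_quantized", False)
    )
    model = model.eval()
    if quantized:
        return model  # use the model as it is
    return model.cuda()


def load_model(path="failspy/InternVL-Chat-V1-5-4bit"):
    """Hypothetical loader: same from_pretrained call as above, without .cuda()."""
    from transformers import AutoModel  # imported lazily; needs transformers installed

    model = AutoModel.from_pretrained(path, trust_remote_code=True, dtype="auto")
    return place_on_gpu(model)
```

If the original repository's example ends its `from_pretrained(...)` call with `.eval().cuda()`, dropping the `.cuda()` part should have the same effect as this guard.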
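Marking this thread. Also, how much GPU memory is needed to run this 4-bit model?
24 GB of VRAM. By following the original version's code, I can now load the model and run inference, but the inference result is empty.
The inference result is empty. Could something be wrong with the quantization?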
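As a rough check on the memory question, weight storage can be estimated from the parameter count and the bits per parameter. The sketch below assumes about 26 billion parameters for InternVL-Chat-V1-5, a figure not stated in this thread. It ignores activations, the KV cache, quantization metadata, and any layers kept in higher precision, so real usage will be higher.

```python
def estimate_weight_memory_gib(num_params, bits_per_param):
    """Return the memory needed to store the weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


# Assumed parameter count (not from this thread); adjust for the actual model.
ASSUMED_PARAMS = 26e9

if __name__ == "__main__":
    for bits in (16, 8, 4):
        gib = estimate_weight_memory_gib(ASSUMED_PARAMS, bits)
        print(f"{bits}-bit weights: about {gib:.1f} GiB")
```

Under that assumption the 4-bit weights alone come to roughly 12 GiB, which is consistent with the model loading on a 24 GB card with room left for image tiles and generation.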