How should the calling code be written?

by sunjunlishi - opened Apr 23, 2024

Apr 23, 2024

I tried the code from the original source, and I also tried the basic multimodal calling approach. Neither worked.

ValueError: Calling cuda() is not supported for 4-bit or 8-bit quantized models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct dtype
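For anyone hitting the same error: the message says the quantized model has already been placed on its devices at load time, so a later `.cuda()` call should simply be skipped. Below is a minimal sketch of a guard for that. The helper names `is_quantized` and `place_on_gpu` are hypothetical. The attribute names it checks (`is_loaded_in_4bit`, `is_loaded_in_8bit`, `is_quantized`) are an assumption about what the loaded model object exposes and may differ between library versions.

```python
def is_quantized(model) -> bool:
    """Return True if the model reports being loaded in 4-bit or 8-bit.

    Assumption: the model object exposes one of these boolean flags.
    A missing attribute is treated as False.
    """
    flags = ("is_loaded_in_4bit", "is_loaded_in_8bit", "is_quantized")
    return any(getattr(model, name, False) for name in flags)


def place_on_gpu(model):
    """Call model.cuda() only when the model is not quantized."""
    if is_quantized(model):
        # Quantized weights were already placed at load time,
        # so return the model untouched instead of calling .cuda().
        return model
    return model.cuda()
```

With a guard like this, the loading code can call `model = place_on_gpu(model)` for both quantized and full-precision checkpoints. Input tensors would still need to be moved to the same device as the model before inference.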

FLYMANGO

Apr 24, 2024

Bookmarking this. Also, how much GPU memory is needed to run this 4-bit model?

sunjunlishi

Apr 24, 2024

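24 GB of VRAM. By following the original version's code, I can now load the model and run inference, but the inference output is empty.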

Winkuis

May 17, 2024

The inference output is empty. Could there be a problem with the quantization?
