Instructions to use Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- vLLM
How to use Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8 with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8",
        "messages": [
          {
            "role": "user",
            "content": [
              { "type": "text", "text": "Describe this image in one sentence." },
              { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
            ]
          }
        ]
      }'
Use Docker
    docker model run hf.co/Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8
- SGLang
How to use Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8 with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
      --model-path "Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8" \
      --host 0.0.0.0 \
      --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8",
        "messages": [
          {
            "role": "user",
            "content": [
              { "type": "text", "text": "Describe this image in one sentence." },
              { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
            ]
          }
        ]
      }'
Use Docker images
    docker run --gpus all \
      --shm-size 32g \
      -p 30000:30000 \
      -v ~/.cache/huggingface:/root/.cache/huggingface \
      --env "HF_TOKEN=<secret>" \
      --ipc=host \
      lmsysorg/sglang:latest \
      python3 -m sglang.launch_server \
        --model-path "Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8",
        "messages": [
          {
            "role": "user",
            "content": [
              { "type": "text", "text": "Describe this image in one sentence." },
              { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
            ]
          }
        ]
      }'
- Docker Model Runner
How to use Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8 with Docker Model Runner:
docker model run hf.co/Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8
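The curl requests shown for the vLLM server (port 8000) and the SGLang server (port 30000) send the same JSON body to `/v1/chat/completions`. As a minimal sketch, the same call could be made from Python using only the standard library. The helper names `build_payload` and `describe_image` are hypothetical, introduced here for illustration, and the response parsing assumes the server returns the usual OpenAI-style `choices[0].message.content` field.

```python
import json
import urllib.request

MODEL = "Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8"


def build_payload(prompt: str, image_url: str, model: str = MODEL) -> dict:
    """Build the same chat-completions body that the curl examples send."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def describe_image(
    prompt: str,
    image_url: str,
    base_url: str = "http://localhost:8000",  # vLLM port from the example above
) -> str:
    """POST the payload to an already running server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(prompt, image_url)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: the response follows the OpenAI chat-completions shape.
    return body["choices"][0]["message"]["content"]
```

With a server already started, `describe_image("Describe this image in one sentence.", image_url)` targets the vLLM default; passing `base_url="http://localhost:30000"` targets the SGLang server from the examples above.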
shuai bai committed
Update README.md
README.md CHANGED
    @@ -503,9 +503,10 @@ These limitations serve as ongoing directions for model optimization and improve
     If you find our work helpful, feel free to give us a cite.
     
     ```
    -@article{
    -title={Qwen2-VL},
    -author={
    +@article{Qwen2VL,
    +title={Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution},
    +author={Wang, Peng and Bai, Shuai and Tan, Sinan and Wang, Shijie and Fan, Zhihao and Bai, Jinze and Chen, Keqin and Liu, Xuejing and Wang, Jialin and Ge, Wenbin and Fan, Yang and Dang, Kai and Du, Mengfei and Ren, Xuancheng and Men, Rui and Liu, Dayiheng and Zhou, Chang and Zhou, Jingren and Lin, Junyang},
    +journal={arXiv preprint arXiv:2409.12191},
     year={2024}
     }
     