How to use LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2 with libraries, inference servers, and local apps.

Libraries

How to use LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2")

# Load model directly (a text-generation model, so use the causal-LM auto class)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2")
model = AutoModelForCausalLM.from_pretrained("LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2")

Local Apps

vLLM

How to use LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
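
The curl call above can also be issued from Python. This is a minimal sketch using only the standard library. It assumes the vLLM server started above is listening on localhost:8000. The helper names `build_completion_request` and `send` are illustrative and are not part of vLLM.

```python
import json
import urllib.request

MODEL = "LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2"


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5):
    """Build the same POST request as the curl example (hypothetical helper)."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example, assuming the server from the previous step is running:
# result = send(build_completion_request("http://localhost:8000", "Once upon a time,"))
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.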

Use Docker

docker model run hf.co/LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2

SGLang

How to use LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
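
The curl calls above return a JSON body. Assuming the server follows the OpenAI completions response shape (a `choices` list whose entries carry a `text` field), the generated text can be extracted as sketched below. The function name `extract_completions` is illustrative, and the sample body is invented for the example rather than captured from a real server.

```python
import json


def extract_completions(response_body):
    """Return the generated text of every choice in a completions response.

    Assumes the OpenAI-style shape: {"choices": [{"text": ...}, ...]}.
    """
    data = json.loads(response_body)
    return [choice["text"] for choice in data.get("choices", [])]


# Invented sample body for illustration only (not real server output).
sample = '{"choices": [{"index": 0, "text": " there was a model.", "finish_reason": "stop"}]}'
print(extract_completions(sample))
```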

Docker Model Runner

How to use LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2 with Docker Model Runner:
```
docker model run hf.co/LoneStriker/FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2
```

FrankenDPO-4x7B-bf16-8.0bpw-h8-exl2 / huggingface-metadata.txt

LoneStriker

Upload folder using huggingface_hub

f395c0b verified over 2 years ago

712 Bytes

	url: https://huggingface.co/Kquant03/FrankenDPO-4x7B-bf16
	branch: main
	download date: 2024-01-16 14:23:34
	sha256sum:
	4d35281a3e88f5a2966e48d6bfc4629fb21b3d4be53d3e7c3c82fbf97ae80a15 model-00001-of-00005.safetensors
	cdd61403c1ba57b311d4907972faf6d1f4e1ea40d71b44c4b894a68c2a05d218 model-00002-of-00005.safetensors
	4c12260fa6d13943d719abfe2e41fdcdeb672c79437ae2fb0f6cbdc868128630 model-00003-of-00005.safetensors
	b2ba2235cbfd7ac96b255774116eeebf8fef028f67706c6ec05d42f41facc061 model-00004-of-00005.safetensors
	219714d908032bdde0b1218e6ac09c80dbc4a6b437fa42502f4ce8032f414dea model-00005-of-00005.safetensors
	dadfd56d766715c61d2ef780a525ab43b8e6da4de6865bda3d95fdef5e134055 tokenizer.model
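
The `sha256sum` list above records hashes for the files downloaded from the listed `url`. A minimal sketch for checking local copies against it follows. It assumes the metadata text is available as a string and that the files sit together in one directory. The function names are illustrative.

```python
import hashlib
from pathlib import Path


def parse_checksums(metadata_text):
    """Return {filename: sha256 hex digest} from the lines after 'sha256sum:'."""
    checksums = {}
    in_block = False
    for line in metadata_text.splitlines():
        stripped = line.strip()
        if stripped == "sha256sum:":
            in_block = True
            continue
        if in_block and stripped:
            parts = stripped.split(None, 1)
            # A SHA-256 digest is 64 hex characters; skip anything else.
            if len(parts) == 2 and len(parts[0]) == 64:
                checksums[parts[1]] = parts[0].lower()
    return checksums


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte shards are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(metadata_text, directory):
    """Return {filename: True/False}; a missing file counts as a failure."""
    results = {}
    for name, expected in parse_checksums(metadata_text).items():
        path = Path(directory) / name
        results[name] = path.is_file() and sha256_of(path) == expected
    return results
```

For example, `verify(Path("huggingface-metadata.txt").read_text(), ".")` would report, per file, whether the local copy matches the recorded hash.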