Instructions to use Locutusque/gpt2-large-conversational-retrain with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Locutusque/gpt2-large-conversational-retrain with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Locutusque/gpt2-large-conversational-retrain")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMultimodalLM

tokenizer = AutoTokenizer.from_pretrained("Locutusque/gpt2-large-conversational-retrain")
model = AutoModelForMultimodalLM.from_pretrained("Locutusque/gpt2-large-conversational-retrain")

Notebooks
Google Colab
Kaggle
Local Apps Settings

vLLM

How to use Locutusque/gpt2-large-conversational-retrain with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Locutusque/gpt2-large-conversational-retrain"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Locutusque/gpt2-large-conversational-retrain",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/Locutusque/gpt2-large-conversational-retrain

SGLang

How to use Locutusque/gpt2-large-conversational-retrain with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Locutusque/gpt2-large-conversational-retrain" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Locutusque/gpt2-large-conversational-retrain",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Locutusque/gpt2-large-conversational-retrain" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Locutusque/gpt2-large-conversational-retrain",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use Locutusque/gpt2-large-conversational-retrain with Docker Model Runner:
```
docker model run hf.co/Locutusque/gpt2-large-conversational-retrain
```

System messages

by msj121 - opened Nov 16, 2023

Discussion

msj121

Nov 16, 2023

Is there a way to send system messages, or just User and Assistant? Thanks.

Locutusque

Owner Nov 16, 2023

Unfortunately, there is no way to send system messages, but the model does sometimes to listen to text before the user and assistant messages. You can try putting a “system” prompt before the user and assistant messages as a loophole, but I’m not certain this will work. I will consider adding a system prompt on future versions of this model.

Locutusque changed discussion status to closed Nov 17, 2023

msj121

Nov 17, 2023

•

edited Nov 17, 2023

This has been working for me much better then other tiny models. So kudos! Gave it a like - hope you have more success.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment