Fixing the device error in Google Colab
When a device is passed as an argument (for example, to run the model on the GPU), I get this error in Google Colab:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
With .to('cpu'), it runs smoothly and gives the desired result.
Fix for this issue
Change the tokenizer call as shown here so that the input tensor is on the same device as the model when calling it for inference. Change this:
input_ids = tokenizer(
    f'paraphrase: {question}',
    return_tensors="pt", padding="longest",
    max_length=max_length,
    truncation=True,
).input_ids

to this:

input_ids = tokenizer(
    f'paraphrase: {question}',
    return_tensors="pt", padding="longest",
    max_length=max_length,
    truncation=True,
).input_ids.to(device)

NOTE: the only change is the added .to(device) at the end, where device is the same device the model is on.
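The same idea can be written so that the device is read from the model instead of being passed around separately. This is only a minimal sketch: the helper names `model_device` and `to_model_device` are hypothetical (they are not part of Transformers), and the usage shown in the comments assumes `tokenizer`, `model`, `question` and `max_length` are already defined as in the snippet above.

```python
def model_device(model):
    """Return the device that holds the model's first parameter."""
    return next(model.parameters()).device


def to_model_device(batch, model):
    """Move every tensor in a tokenizer output (dict-like) to the model's device.

    Hypothetical helper: it only relies on `.items()` on the batch and
    `.to(device)` on each value, which tokenizer outputs created with
    return_tensors="pt" provide.
    """
    device = model_device(model)
    return {name: tensor.to(device) for name, tensor in batch.items()}


# Usage (assumes tokenizer, model, question and max_length already exist):
#
# batch = tokenizer(
#     f'paraphrase: {question}',
#     return_tensors="pt", padding="longest",
#     max_length=max_length,
#     truncation=True,
# )
# batch = to_model_device(batch, model)
# outputs = model.generate(batch["input_ids"])
```

Reading the device from the model means the inputs follow the model wherever it was placed, so the same code should run whether the Colab runtime has a GPU or not.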