
This model, humarin/chatgpt_paraphraser_on_T5_base, was trained on our ChatGPT paraphrase dataset.

The dataset is built from the Quora paraphrase questions, texts from SQuAD 2.0, and the CNN news dataset.

The model is based on T5-base. We used transfer learning to get it to generate paraphrases as well as ChatGPT does. We believe it is one of the best paraphrasers available on Hugging Face.

Kaggle link

Author's 1 LinkedIn link Author's 2 LinkedIn link

Deployment example

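import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("humarin/chatgpt_paraphraser_on_T5_base")

model = AutoModelForSeq2SeqLM.from_pretrained("humarin/chatgpt_paraphraser_on_T5_base").to(device)

def paraphrase(
    question,
    num_beams=5,
    num_beam_groups=5,
    num_return_sequences=5,
    repetition_penalty=10.0,
    diversity_penalty=3.0,
    no_repeat_ngram_size=2,
    temperature=0.7,
    max_length=128
):
    """Return `num_return_sequences` paraphrases of `question` as a list of strings."""
    # Prepend the "paraphrase: " task prefix and tokenize, truncating to max_length.
    input_ids = tokenizer(
        f'paraphrase: {question}',
        return_tensors="pt", padding="longest",
        max_length=max_length,
        truncation=True,
    ).input_ids.to(device)

    # Diverse (group) beam search: each beam group is pushed away from the others
    # by diversity_penalty. temperature only takes effect in sampling modes
    # (do_sample=True), so it does not change the beams here.
    outputs = model.generate(
        input_ids, temperature=temperature, repetition_penalty=repetition_penalty,
        num_return_sequences=num_return_sequences, no_repeat_ngram_size=no_repeat_ngram_size,
        num_beams=num_beams, num_beam_groups=num_beam_groups,
        max_length=max_length, diversity_penalty=diversity_penalty
    )

    # Decode token ids back to text, dropping special tokens such as padding.
    res = tokenizer.batch_decode(outputs, skip_special_tokens=True)

    return res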
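
The `paraphrase` function above combines `num_beams`, `num_beam_groups`, `num_return_sequences`, and `diversity_penalty`, and these settings constrain one another. The sketch below is a hypothetical helper, `check_beam_settings`, that is not part of the model card or of Transformers. It encodes the constraints as we understand them from the Transformers generation documentation; the exact checks and error messages may differ between library versions.

```python
def check_beam_settings(num_beams, num_beam_groups, num_return_sequences, diversity_penalty):
    """Return a list of problems with diverse beam search settings (empty if none).

    Hypothetical helper for illustration; it only mirrors constraints that the
    generation code is understood to enforce and does not call the library.
    """
    problems = []
    if num_beams < 1 or num_beam_groups < 1:
        problems.append("num_beams and num_beam_groups must be at least 1")
        return problems
    # Beams are split evenly across groups.
    if num_beams % num_beam_groups != 0:
        problems.append("num_beams must be divisible by num_beam_groups")
    # Only as many sequences as there are beams can be returned.
    if num_return_sequences > num_beams:
        problems.append("num_return_sequences must not exceed num_beams")
    # Groups only diverge from each other when the penalty is positive.
    if num_beam_groups > 1 and diversity_penalty <= 0.0:
        problems.append("diversity_penalty should be > 0 when num_beam_groups > 1")
    return problems


# The defaults used by paraphrase() pass all three checks.
default_problems = check_beam_settings(5, 5, 5, 3.0)
# A deliberately inconsistent combination trips every check.
bad_problems = check_beam_settings(5, 2, 6, 0.0)
print(default_problems)
print(len(bad_problems))
```

Running such a check before calling `paraphrase` with custom arguments can give a clearer message than an error raised deep inside generation.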

Usage examples

Input:

text = 'What are the best places to see in New York?'
paraphrase(text)

Output:

['What are some must-see places in New York?',
 'Can you suggest some must-see spots in New York?',
 'Where should one go to experience the best NYC has to offer?',
 'Which places should I visit in New York?',
 'What are the top destinations to explore in New York?']

Input:

text = "Rammstein's album Mutter was recorded in the south of France in May and June 2000, and mixed in Stockholm in October of that year."
paraphrase(text)

Output:

['In May and June 2000, Rammstein travelled to the south of France to record his album Mutter, which was mixed in Stockholm in October of that year.',
 'The album Mutter by Rammstein was recorded in the south of France during May and June 2000, with mixing taking place in Stockholm in October of that year.',
 'The album Mutter by Rammstein was recorded in the south of France during May and June 2000, with mixing taking place in Stockholm in October of that year. It',
 'Mutter, the album released by Rammstein, was recorded in southern France during May and June 2000, with mixing taking place between October and September.',
 'In May and June 2000, Rammstein recorded his album Mutter in the south of France, with the mix being made at Stockholm during October.']
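
As the second output shows, candidates can repeat one another or end with a stray fragment such as a trailing "It". The following is a minimal post-processing sketch, assuming that a usable paraphrase ends with terminal punctuation and differs from the input. The helper name `clean_candidates` and the sample strings are illustrative and are not part of the original model card.

```python
def clean_candidates(source, candidates):
    """Drop candidates that repeat the source, repeat each other, or end mid-sentence."""

    def norm(text):
        # Compare case-insensitively and ignore differences in whitespace.
        return " ".join(text.lower().split())

    seen = {norm(source)}
    kept = []
    for cand in candidates:
        text = cand.strip()
        # Heuristic (an assumption): a finished sentence ends with ., ? or !
        if not text.endswith((".", "?", "!")):
            continue
        key = norm(text)
        if key in seen:
            continue
        seen.add(key)
        kept.append(text)
    return kept


# Illustrative inputs, not real model output.
source = "What are the best places to see in New York?"
candidates = [
    "What are some must-see places in New York?",
    "what are some  must-see places in New York?",   # duplicate up to case and spacing
    "Which places should I visit in New York? It",   # trailing fragment
    "What are the best places to see in New York?",  # identical to the input
]
cleaned = clean_candidates(source, candidates)
print(cleaned)
```

The punctuation rule is deliberately simple; it would also discard valid paraphrases of inputs that have no final punctuation, so it may need adjusting for other kinds of text.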

Training parameters

epochs = 5
batch_size = 64
max_length = 128
lr = 5e-5
batches_qty = 196465
betas = (0.9, 0.999)
eps = 1e-08
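
The values above can be grouped into a small configuration object, as sketched below. This is only an illustration under two assumptions that the model card does not confirm: that `lr`, `betas`, and `eps` were passed to an Adam-style optimizer (the listed `betas` and `eps` match the defaults of PyTorch's Adam), and that `batches_qty` counts batches of `batch_size` examples. The class name `TrainParams` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainParams:
    """Training parameters as listed in the model card."""
    epochs: int = 5
    batch_size: int = 64
    max_length: int = 128
    lr: float = 5e-5
    batches_qty: int = 196465
    betas: tuple = (0.9, 0.999)
    eps: float = 1e-08

    def adam_kwargs(self):
        # Assumption: these three values configure an Adam-style optimizer.
        return {"lr": self.lr, "betas": self.betas, "eps": self.eps}

    def max_examples_seen(self):
        # Assumption: batches_qty counts batches of up to batch_size examples,
        # so this product is an upper bound on the examples processed.
        return self.batches_qty * self.batch_size


params = TrainParams()
print(params.adam_kwargs())
print(params.max_examples_seen())  # 196465 * 64 = 12573760
```

Whether `batches_qty` refers to one epoch or to the whole run is not stated, so the product should be read only as a rough bound.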

BibTeX entry and citation info

@inproceedings{chatgpt_paraphraser,
  author={Vladimir Vorobev and Maxim Kuznetsov},
  title={A paraphrasing model based on ChatGPT paraphrases},
  year={2023}
}

Model size: 0.2B params (Safetensors)
Tensor type: F32
Base model: google-t5/t5-base
