Instructions for using Jumtra/rinna-3.6b-tune-ep5 with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Jumtra/rinna-3.6b-tune-ep5 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Jumtra/rinna-3.6b-tune-ep5")

# Load model directly (this is a GPT-NeoX text-generation model, so use the causal-LM class)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Jumtra/rinna-3.6b-tune-ep5")
model = AutoModelForCausalLM.from_pretrained("Jumtra/rinna-3.6b-tune-ep5")

Local Apps

vLLM

How to use Jumtra/rinna-3.6b-tune-ep5 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Jumtra/rinna-3.6b-tune-ep5"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Jumtra/rinna-3.6b-tune-ep5",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
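
The same request can also be sent from Python. The following is a minimal sketch using only the standard library. The helper names `build_completion_request` and `complete` are hypothetical, and the sketch assumes the server started above is listening on `localhost:8000` and returns the usual OpenAI-style `choices[0].text` field.

```python
import json
import urllib.request

MODEL_ID = "Jumtra/rinna-3.6b-tune-ep5"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text.

    Assumption: the response body follows the OpenAI completions shape.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` should return the generated continuation. For a server on another port, such as the SGLang examples on port 30000, pass `base_url="http://localhost:30000"`.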

Use Docker

docker model run hf.co/Jumtra/rinna-3.6b-tune-ep5

SGLang

How to use Jumtra/rinna-3.6b-tune-ep5 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Jumtra/rinna-3.6b-tune-ep5" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Jumtra/rinna-3.6b-tune-ep5",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Jumtra/rinna-3.6b-tune-ep5" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Jumtra/rinna-3.6b-tune-ep5",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use Jumtra/rinna-3.6b-tune-ep5 with Docker Model Runner:
```
docker model run hf.co/Jumtra/rinna-3.6b-tune-ep5
```

rinna-3.6b-tune-ep5 / README.md

	---
	license: mit
	tags:
	- ja
	- gpt_neox
	- text-generation
	- lm
	- nlp
	datasets:
	- kunishou/databricks-dolly-15k-ja
	- kunishou/hh-rlhf-49k-ja
	- kunishou/cnn-dailymail-27k-ja
	- Jumtra/oasst1_ja
	- Jumtra/jglue_jnli
	- Jumtra/jglue_jsquad
	- Jumtra/jglue_jsquads_with_input
	inference: false
	language:
	- ja
	---

	# rinna-3.6b

	This model was fine-tuned from [rinna/japanese-gpt-neox-3.6b](https://huggingface.co/rinna/japanese-gpt-neox-3.6b) using MosaicML's llm-foundry repository.

	## Model Date

	June 28, 2023

	## Model License

	MIT


	## Evaluation

	Model accuracy was evaluated on [Jumtra/test_data_100QA](https://huggingface.co/datasets/Jumtra/test_data_100QA).
	Perplexity on the validation data used during training is also reported.

	| Model name | Accuracy | Perplexity |
	| ---- | ---- | ---- |
	| [Jumtra/rinna-3.6b-tune-ep5](https://huggingface.co/Jumtra/rinna-3.6b-tune-ep5) | 40/100 | 8.105 |
	| [Jumtra/rinna-v1-tune-ep1](https://huggingface.co/Jumtra/rinna-v1-tune-ep1) | 42/100 | 7.458 |
	| [Jumtra/rinna-v1-tune-ep3](https://huggingface.co/Jumtra/rinna-v1-tune-ep3) | 41/100 | 7.034 |
	| [Jumtra/calm-7b-tune-ep4](https://huggingface.co/Jumtra/calm-7b-tune-ep4) | 40/100 | 9.766 |
	| [Jumtra/calm-v3-ep1](https://huggingface.co/Jumtra/calm-v3-ep1) | 35/100 | 9.305 |
	| [Jumtra/calm-v3-ep3](https://huggingface.co/Jumtra/calm-v3-ep3) | 37/100 | 13.276 |
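
The README does not say how the perplexity values were computed. Under the standard definition, perplexity is the exponential of the mean per-token negative log-likelihood. The sketch below illustrates that definition only; the function name `perplexity` and the sample log-probabilities are illustrative and are not taken from the evaluation.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the mean negative log-likelihood (natural log) over tokens."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Illustrative values only: a model that assigns probability 1/8 to every
# token has a perplexity of 8 (up to floating-point rounding).
uniform_log_probs = [math.log(1 / 8)] * 10
print(perplexity(uniform_log_probs))
```

Lower is better. If the reported values follow this definition, the 8.105 for this model would correspond to a mean loss of about ln 8.105 ≈ 2.09 nats per token.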

	The following prompt was used:
	```python
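	# Section markers; the Japanese strings are part of the prompt and must stay as-is.
	INSTRUCTION_KEY = "### 入力:"  # "### Input:"
	RESPONSE_KEY = "### 回答:"  # "### Answer:"
	# "The following input contains an instruction describing a task and text with
	# context. Generate an answer that appropriately satisfies the request."
	INTRO_BLURB = "以下はタスクを説明する指示と文脈のある文章が含まれた入力です。要求を適切に満たす回答を生成しなさい。"
	# Fill in the fixed parts now and leave "{instruction}" as a placeholder
	# for a later .format(instruction=...) call.
	JP_PROMPT_FOR_GENERATION_FORMAT = """{intro}
	{instruction_key}
	{instruction}
	{response_key}
	""".format(
	    intro=INTRO_BLURB,
	    instruction_key=INSTRUCTION_KEY,
	    instruction="{instruction}",
	    response_key=RESPONSE_KEY,
	)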
	```
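
A minimal sketch of how this template might be filled in and passed to the model with Transformers. The names `build_prompt` and `generate_answer`, the `max_new_tokens` value, and the default generation settings are assumptions, since the README does not state the settings used for evaluation. `use_fast=False` follows the base rinna model's card and is likewise an assumption for this fine-tuned checkpoint.

```python
INSTRUCTION_KEY = "### 入力:"
RESPONSE_KEY = "### 回答:"
INTRO_BLURB = "以下はタスクを説明する指示と文脈のある文章が含まれた入力です。要求を適切に満たす回答を生成しなさい。"
# Same layout as the README template: intro, input marker, instruction, answer marker.
PROMPT_TEMPLATE = "{intro}\n{instruction_key}\n{instruction}\n{response_key}\n"


def build_prompt(instruction):
    """Fill the evaluation prompt template with one instruction."""
    return PROMPT_TEMPLATE.format(
        intro=INTRO_BLURB,
        instruction_key=INSTRUCTION_KEY,
        instruction=instruction,
        response_key=RESPONSE_KEY,
    )


def generate_answer(instruction, max_new_tokens=128):
    """Generate a reply with Transformers (downloads the model on first use)."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Jumtra/rinna-3.6b-tune-ep5"
    # Assumption: the slow tokenizer is needed, as on the base rinna model's card.
    tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

For example, `build_prompt("日本の首都はどこですか?")` returns the intro sentence, the `### 入力:` marker, the question, and a trailing `### 回答:` line, leaving the model to continue with the answer.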