---
license: mit
tags:
- ja
- gpt_neox
- text-generation
- lm
- nlp
datasets:
- kunishou/databricks-dolly-15k-ja
- kunishou/hh-rlhf-49k-ja
- kunishou/cnn-dailymail-27k-ja
- Jumtra/oasst1_ja
- Jumtra/jglue_jnli
- Jumtra/jglue_jsquad
- Jumtra/jglue_jsquads_with_input
inference: false
language:
- ja
---
# rinna-3.6b
This model was fine-tuned from [rinna/japanese-gpt-neox-3.6b](https://huggingface.co/rinna/japanese-gpt-neox-3.6b) using MosaicML's llm-foundry repository.
## Model Date
June 28, 2023
## Model License
MIT
## Evaluation
Model accuracy was evaluated on [Jumtra/test_data_100QA](https://huggingface.co/datasets/Jumtra/test_data_100QA).
Perplexity on the validation data used during training is also reported.
| model name | Accuracy | Perplexity |
| ---- | ---- | ---- |
| [Jumtra/rinna-3.6b-tune-ep5](https://huggingface.co/Jumtra/rinna-3.6b-tune-ep5)| 40/100 | 8.105 |
| [Jumtra/rinna-v1-tune-ep1](https://huggingface.co/Jumtra/rinna-v1-tune-ep1) | 42/100 | 7.458 |
| [Jumtra/rinna-v1-tune-ep3](https://huggingface.co/Jumtra/rinna-v1-tune-ep3) | 41/100 | 7.034 |
| [Jumtra/calm-7b-tune-ep4](https://huggingface.co/Jumtra/calm-7b-tune-ep4) | 40/100 | 9.766 |
| [Jumtra/calm-v3-ep1](https://huggingface.co/Jumtra/calm-v3-ep1) | 35/100 | 9.305 |
| [Jumtra/calm-v3-ep3](https://huggingface.co/Jumtra/calm-v3-ep3) | 37/100 | 13.276 |
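For readers unfamiliar with the Perplexity column, the snippet below is a minimal sketch of the standard definition: the exponential of the mean per-token negative log-likelihood. The card does not say how the values above were computed, so the function name `perplexity` and its input format are assumptions for illustration only.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean negative log-likelihood per token).

    `token_nlls` is a hypothetical list of per-token negative
    log-likelihoods (natural log) collected on a validation set.
    """
    if not token_nlls:
        raise ValueError("token_nlls must contain at least one value")
    return math.exp(sum(token_nlls) / len(token_nlls))


# A mean loss of ln(8.105) corresponds to a perplexity of 8.105.
example_losses = [math.log(8.105)] * 4
print(round(perplexity(example_losses), 3))
```

Lower perplexity means the model assigns higher probability to the validation text. As the table shows, it does not necessarily track accuracy on the QA set.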
The following prompt was used:
```python
INSTRUCTION_KEY = "### 入力:"
RESPONSE_KEY = "### 回答:"
INTRO_BLURB = "以下はタスクを説明する指示と文脈のある文章が含まれた入力です。要求を適切に満たす回答を生成しなさい。"
JP_PROMPT_FOR_GENERATION_FORMAT = """{intro}
{instruction_key}
{instruction}
{response_key}
""".format(
intro=INTRO_BLURB,
instruction_key=INSTRUCTION_KEY,
instruction="{instruction}",
response_key=RESPONSE_KEY,
)
```
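As a rough sketch of how the template above might be used for inference, the example below fills in an instruction and generates a completion with Transformers. The helper names `build_prompt` and `generate_answer`, the generation settings, and the choices of `use_fast=False` and `add_special_tokens=False` (borrowed from the usage shown for the base rinna models) are assumptions, not settings confirmed by this card.

```python
INSTRUCTION_KEY = "### 入力:"
RESPONSE_KEY = "### 回答:"
INTRO_BLURB = "以下はタスクを説明する指示と文脈のある文章が含まれた入力です。要求を適切に満たす回答を生成しなさい。"

# Same layout as the template above: intro, instruction key, instruction, response key.
PROMPT_TEMPLATE = f"{INTRO_BLURB}\n{INSTRUCTION_KEY}\n{{instruction}}\n{RESPONSE_KEY}\n"


def build_prompt(instruction: str) -> str:
    """Insert the user's instruction into the prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)


def generate_answer(
    instruction: str,
    model_name: str = "Jumtra/rinna-3.6b-tune-ep5",
    max_new_tokens: int = 128,  # assumed value, not from the card
) -> str:
    """Generate a response for one instruction (downloads the model on first use)."""
    # Imported lazily so that build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the slow tokenizer is required, as for the base rinna model.
    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    input_ids = tokenizer.encode(
        build_prompt(instruction), return_tensors="pt", add_special_tokens=False
    )
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens, do_sample=False)

    # Decode only the tokens generated after the prompt.
    new_tokens = output_ids[0][input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(build_prompt("日本の首都はどこですか?"))
```

Calling `generate_answer("日本の首都はどこですか?")` would load the weights and return the decoded completion. Greedy decoding is used here only to keep the sketch deterministic.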