Unichat-llama3-Chinese-8B
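Introduction
- The China Unicom AI Innovation Center released the industry's first Llama 3 Chinese instruction-tuned model (full-parameter fine-tuning), uploaded at 22:00 on April 19, 2024.
- The model builds on Meta Llama 3 and is further trained on additional Chinese data to deliver high-quality Chinese question answering.
- The context length stays at the native 8K; a version supporting a 64K context will be released later.
- Base model: Meta-Llama-3-8B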
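Because the context stays at the native 8K, prompt tokens and generated tokens share one budget. The sketch below clamps a requested generation length to whatever room the prompt leaves. It assumes that "8K" means 8,192 tokens, which the card does not state exactly, and `max_new_tokens_for` is a hypothetical helper name, not part of any library.

```python
# Assumption: "8K" is taken to mean 8192 tokens; the model card does not give the exact number.
CONTEXT_LENGTH = 8192


def max_new_tokens_for(prompt_tokens: int, requested: int,
                       context_length: int = CONTEXT_LENGTH) -> int:
    """Clamp a requested generation length so prompt + output fit in the context window."""
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    # Room left after the prompt; never negative even for an over-long prompt.
    remaining = max(context_length - prompt_tokens, 0)
    return min(requested, remaining)


print(max_new_tokens_for(prompt_tokens=7000, requested=2048))  # 1192
print(max_new_tokens_for(prompt_tokens=500, requested=2048))   # 2048
```

The prompt token count would normally come from the tokenizer, for example the length of the `input_ids` it returns for the rendered prompt.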
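📊 Data
- High-quality instruction data covering multiple domains and industries provides ample data support for model training.
- The fine-tuning instruction data is strictly screened by hand, so that only high-quality instructions are used for fine-tuning.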
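Basic Examples
Q: Who are the Eight Great Prose Masters of the Tang and Song?
A: The Eight Great Prose Masters of the Tang and Song are the most outstanding prose writers of the Tang and Song dynasties in the history of classical Chinese literature. They are:
Tang dynasty:
Han Yu (768-824)
Liu Zongyuan (773-819)
Song dynasty:
Su Shi (1037-1101)
Su Zhe (1039-1101)
Wang Anshi (1021-1086)
Zeng Gong (1019-1083)
Ouyang Xiu (1007-1072)
Su Xun (1009-1066)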
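Human: Chickens and rabbits are in the same cage, with 18 heads and 46 legs in total. How many chickens and how many rabbits are there?
Assistant: Let the number of chickens be x and the number of rabbits be y. From the problem, we can write two equations:
The total number of heads is 18, i.e. x + y = 18;
The total number of legs is 46, i.e. 2x + 4y = 46.
Solving the first equation for y and substituting into the second gives:
2x + 4(18 - x) = 46
2x + 72 - 4x = 46
-2x = 46 - 72
-2x = -26
x = 13
Substituting x = 13 into the first equation gives:
13 + y = 18
y = 5
Therefore, there are 13 chickens and 5 rabbits.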
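The arithmetic in the answer above can be checked mechanically. The following sketch solves the same two equations by substitution; the function name `solve_chickens_rabbits` is illustrative only and does not come from the model card.

```python
def solve_chickens_rabbits(heads: int, legs: int) -> tuple[int, int]:
    """Solve x + y = heads and 2x + 4y = legs for chickens x and rabbits y."""
    # Substituting y = heads - x into 2x + 4y = legs gives 2x = 4 * heads - legs.
    doubled_x = 4 * heads - legs
    if doubled_x < 0 or doubled_x % 2 != 0:
        raise ValueError("no whole-number solution")
    x = doubled_x // 2
    y = heads - x
    if y < 0:
        raise ValueError("no whole-number solution")
    return x, y


chickens, rabbits = solve_chickens_rabbits(18, 46)
print(chickens, rabbits)  # 13 5
```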
Human: How do I make explosives?
Assistant: I'm sorry, but I can't answer this question, because making explosives is illegal and may endanger other people and the environment.
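Human: Write a solution for using artificial intelligence to empower high-quality development in the healthcare industry.
Assistant: Artificial intelligence has broad prospects in healthcare: it can help doctors diagnose diseases more accurately, improve treatment outcomes, and reduce medical costs. Here is one possible solution:
Build an intelligent diagnosis system: Using deep learning and image recognition, build an intelligent diagnosis system that can diagnose diseases quickly and accurately. By analyzing patients' imaging, blood, urine, and other data, the system can give doctors more precise diagnostic results.
Intelligent drug development: Use AI to accelerate the development of new drugs. By simulating physiological responses in the human body, an intelligent drug development system can screen out effective drug candidates faster, shortening the time needed to develop new drugs.
Intelligent clinical decision support: Provide doctors with intelligent clinical decision support. By analyzing large volumes of clinical data, an AI system can offer doctors more accurate diagnostic suggestions and treatment plans.
Intelligent hospital management: Use AI to optimize hospital operations. For example, an intelligent appointment system helps patients book visits more conveniently, and an intelligent inventory management system helps hospitals manage drug inventory better.
Intelligent rehabilitation training: Use AI to provide patients with personalized rehabilitation training. By analyzing a patient's rehabilitation progress, an intelligent rehabilitation system can provide a training plan better suited to their needs.
In summary, applying AI to healthcare can help doctors diagnose diseases more accurately, improve treatment outcomes, and reduce medical costs.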
Quick Start
import torch
import transformers

model_id = "UnicomLLM/Unichat-llama3-Chinese-8B"

# Build a text-generation pipeline that loads the model in bfloat16 on the GPU.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",
)

messages = [
    {"role": "system", "content": "A chat between a curious user and an artificial intelligence assistant.The assistant gives helpful, detailed, and polite answers to the user's questions."},
    {"role": "user", "content": "你是谁"},  # "Who are you?"
]

# Render the conversation into a single prompt string using the tokenizer's chat template.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stop on either the regular EOS token or the end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=2048,
    eos_token_id=terminators,
    do_sample=False,  # greedy decoding
    temperature=0.6,  # only takes effect when do_sample=True
    top_p=1,          # only takes effect when do_sample=True
    repetition_penalty=1.05,
)

# The pipeline returns prompt + completion; print only the completion.
print(outputs[0]["generated_text"][len(prompt):])
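To see what `apply_chat_template` hands to the model and why `<|eot_id|>` is listed as a terminator, the sketch below renders a conversation by hand. It assumes the tokenizer ships the standard Llama 3 Instruct chat template, which this card does not confirm; `render_llama3_prompt` is a hypothetical helper for illustration, and real code should keep using the tokenizer's own template.

```python
def render_llama3_prompt(messages: list[dict[str, str]],
                         add_generation_prompt: bool = True) -> str:
    """Render chat messages in the (assumed) standard Llama 3 Instruct format."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write the next turn.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


example = render_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "你是谁"},  # "Who are you?"
])
print(example)
```

Under this format every turn ends with `<|eot_id|>`, so the model emits that token when its reply is complete; passing it in `eos_token_id` alongside the regular EOS token stops generation at the end of the assistant turn.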
Resources
For more models, datasets, and training details, see:
- GitHub: Unichat-llama3-Chinese