Instructions to use agentlans/Llama3-zhcn with libraries, inference providers, notebooks, and local apps.
- Libraries
- Transformers
How to use agentlans/Llama3-zhcn with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="agentlans/Llama3-zhcn")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("agentlans/Llama3-zhcn")
model = AutoModelForCausalLM.from_pretrained("agentlans/Llama3-zhcn")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use agentlans/Llama3-zhcn with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "agentlans/Llama3-zhcn"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "agentlans/Llama3-zhcn",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/agentlans/Llama3-zhcn
- SGLang
How to use agentlans/Llama3-zhcn with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "agentlans/Llama3-zhcn" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "agentlans/Llama3-zhcn",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "agentlans/Llama3-zhcn" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "agentlans/Llama3-zhcn",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Docker Model Runner
How to use agentlans/Llama3-zhcn with Docker Model Runner:
docker model run hf.co/agentlans/Llama3-zhcn
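The vLLM and SGLang servers above are described as exposing an OpenAI-compatible chat completions route, so the curl calls can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request`, `extract_reply`, and `send` are hypothetical, and the response layout is assumed to follow the usual OpenAI chat completion shape (`choices[0].message.content`).

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="agentlans/Llama3-zhcn"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style response body.

    Assumes the reply lives at choices[0].message.content.
    """
    return json.loads(response_body)["choices"][0]["message"]["content"]


def send(request, timeout=60):
    """Send the request to a running server and return the reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(response.read())


# Example, with a server already running:
#   print(send(build_chat_request("What is the capital of France?")))
# For the SGLang commands above, pass base_url="http://localhost:30000".
```

Keeping request construction separate from sending makes it easy to inspect the payload before any network traffic happens.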
Llama3-zhcn
English
A Merged Llama 3 Model for Enhanced Chinese Understanding
This model is a merge of several pre-trained Llama 3 8B language models specifically focused on Simplified Chinese (China), created using mergekit.
Purpose
Llama3-zhcn aims to deliver a Llama 3 model with deep Chinese cultural, historical, and linguistic comprehension. The model serves two purposes: handling diverse everyday tasks and providing a solid foundation for further merging and fine-tuning. As Chinese model development trends away from the Llama series, this merge strives to maintain and improve Llama 3's Chinese language capabilities.
Limitations
- No Vision Capabilities: This model is based on Llama 3 and does not include the vision capabilities found in some later releases such as the Llama 3.2 Vision models.
- Historical Accuracy: While the model possesses a good understanding of Chinese history, fact-checking is still recommended to ensure accuracy.
- Translation and Revision: The model can translate and revise text in English and Chinese. However, optimal results may require prompt engineering.
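As one illustration of the prompt engineering mentioned for translation, a system message can pin down the language pair and the output format. This is only a sketch: `build_translation_messages` is a hypothetical helper, and the instruction wording is an assumption that has not been tuned or evaluated against this model.

```python
def build_translation_messages(text, source_lang="English",
                               target_lang="Simplified Chinese"):
    """Build a chat message list asking for a translation only.

    The system prompt wording is illustrative, not a recommended template.
    """
    system_prompt = (
        f"You are a professional translator. Translate the user's text "
        f"from {source_lang} to {target_lang}. Reply with the translation "
        f"only, preserving names, numbers, and formatting."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": text},
    ]


messages = build_translation_messages("Knowledge is power.")
# These messages can be passed to a Transformers text-generation pipeline
# loaded with agentlans/Llama3-zhcn, for example: pipe(messages)
```

Swapping `source_lang` and `target_lang` gives the reverse direction, and the same pattern can be adapted for revision tasks by changing the system instruction.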
Merge Details
Merge Method
This model was created using the Linear merge method, utilizing the Meta-Llama-3-8B-Instruct tokenizer.
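Conceptually, a linear merge takes a weighted average of the corresponding parameters of each source model. The toy sketch below shows that idea on plain Python lists. It is not mergekit's implementation, which operates on full tensors; the function name, the `normalize` flag, and the example weights are assumptions for illustration, since the card does not state the weights used for this merge.

```python
def linear_merge(state_dicts, weights, normalize=True):
    """Weighted average of same-shaped parameter lists across models.

    state_dicts: list of {parameter_name: list of floats}
    weights: one weight per model
    normalize: if True, rescale the weights so they sum to 1
    """
    if not state_dicts or len(state_dicts) != len(weights):
        raise ValueError("need one weight per model")
    if normalize:
        total = sum(weights)
        if total == 0:
            raise ValueError("weights must not sum to zero")
        weights = [w / total for w in weights]

    merged = {}
    for name in state_dicts[0]:
        # Group the i-th value of this parameter from every model together.
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [
            sum(w * v for w, v in zip(weights, values)) for values in columns
        ]
    return merged


# Two toy "models" with a single two-element parameter each.
model_a = {"layer.weight": [1.0, 2.0]}
model_b = {"layer.weight": [3.0, 6.0]}

merged = linear_merge([model_a, model_b], weights=[1.0, 1.0])
print(merged)  # {'layer.weight': [2.0, 4.0]}
```

With equal weights the result is a plain average; unequal weights pull the merged parameters toward the more heavily weighted model.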
Models Merged
The following models were included in the merge:
Chinese
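A Merged Llama 3 Model with Enhanced Chinese Understanding
This model merges several pre-trained 8B language models targeting Simplified Chinese (China) and was created using mergekit.
Purpose:
Llama3-zhcn aims to provide a deep understanding of Chinese culture, history, and grammar. The model has two purposes: handling a variety of everyday tasks and laying a solid foundation for further merging or fine-tuning. As Chinese language model development shifts away from the Llama series, this merge attempts to maintain and improve Llama 3's Chinese language capabilities.
Limitations:
- No vision capabilities: The model is based on Llama 3 and does not include the image-processing capabilities found in some later releases such as the Llama 3.2 Vision models.
- Historical accuracy: Although the model has a good understanding of China, fact-checking is still recommended to ensure accuracy.
- Translation and revision: The model can translate and revise text between English and Chinese. However, the best results may require prompt engineering.
Merge details:
Merge method
This model was created using the Linear merge method, with the Meta-Llama-3-8B-Instruct tokenizer.
Models merged
The following models were included: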