Text Generation
PEFT
Safetensors
Transformers
lora
sft
trl
unsloth
text-generation-inference
conversational
Model Details
Model Description
- Model type: Text generation
- Language(s) (NLP): Vietnamese
- Finetuned from model: arcee-ai/Arcee-VyLinh
How to Get Started with the Model
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
from peft import PeftModel
import torch

# Load the base model and tokenizer, then attach the LoRA adapter
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
base_model = AutoModelForCausalLM.from_pretrained("arcee-ai/Arcee-VyLinh")
tokenizer = AutoTokenizer.from_pretrained("arcee-ai/Arcee-VyLinh")
adapter_model = PeftModel.from_pretrained(base_model, "meomeo163/medical_chatbot").to(device)

# English gloss: "I have lower abdominal pain and occasionally feel nauseous; what conditions might I have?"
prompt = "mình đang có hiện tượng bị đau bụng dưới, thì thoảng thấy buồn nôn, các chứng bệnh mình có thể gặp phải là gì"
messages = [
    # English gloss: "You are a professional medical assistant"
    {"role": "system", "content": "Bạn là trợ lý y tế chuyên nghiệp"},
    {"role": "user", "content": prompt},
]

# Render the conversation with the tokenizer's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

# Print tokens to stdout as they are generated
streamer = TextStreamer(
    tokenizer,
    skip_prompt=True,
    skip_special_tokens=True,
)

generated_ids = adapter_model.generate(
    model_inputs.input_ids,
    attention_mask=model_inputs.attention_mask,  # Pass the attention mask explicitly
    pad_token_id=tokenizer.eos_token_id,  # Reuse EOS as the padding token
    max_new_tokens=512,
    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,  # temperature only takes effect when sampling is enabled
    temperature=0.25,
    streamer=streamer,
    no_repeat_ngram_size=3,  # Block repeated 3-grams; increase to be stricter
)

# Drop the prompt tokens so only the newly generated answer is decoded
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
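Two steps in the snippet above are easy to get wrong: every chat message needs a `role` and a `content` key for the chat template to render it, and `generate` returns the prompt tokens followed by the new tokens, so the prompt has to be sliced off before decoding. The sketch below illustrates both with plain Python lists. The helper names `validate_messages` and `strip_prompt`, the allowed-role set, and the toy token ids are assumptions made for illustration; they are not part of Transformers or PEFT.

```python
from typing import Dict, List

REQUIRED_KEYS = ("role", "content")
# Assumption: the roles most chat templates accept; adjust for your template.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages: List[Dict[str, str]]) -> None:
    """Raise ValueError if a message lacks a required key or uses an unknown role."""
    for i, message in enumerate(messages):
        for key in REQUIRED_KEYS:
            if key not in message:
                raise ValueError(f"message {i} is missing the '{key}' key")
        if message["role"] not in ALLOWED_ROLES:
            raise ValueError(f"message {i} has unknown role {message['role']!r}")


def strip_prompt(input_ids: List[int], output_ids: List[int]) -> List[int]:
    """Return only the newly generated tokens, dropping the echoed prompt."""
    return output_ids[len(input_ids):]


messages = [
    {"role": "system", "content": "Bạn là trợ lý y tế chuyên nghiệp"},
    {"role": "user", "content": "..."},
]
validate_messages(messages)  # passes silently

# Toy token ids standing in for real tokenizer and model output (hypothetical).
prompt_ids = [101, 7, 8, 9]
full_output = [101, 7, 8, 9, 42, 43, 2]
new_tokens = strip_prompt(prompt_ids, full_output)
```

Running a check like `validate_messages` before `apply_chat_template` surfaces a misspelled key such as `role_type` as a clear error instead of a malformed prompt.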
Training Details
Training Data
A medical question-answer dataset: hungnm/vietnamese-medical-qa
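For supervised fine-tuning, question-answer pairs are typically mapped to the same chat message format used at inference time. The following is a minimal sketch of that mapping, not the author's preprocessing code. The field names `question` and `answer`, the helper `to_chat_example`, and the reuse of the inference-time system prompt are all assumptions; check the dataset's actual schema before relying on them.

```python
from typing import Dict, List

# Assumption: the same system prompt as in the inference example
# ("You are a professional medical assistant").
SYSTEM_PROMPT = "Bạn là trợ lý y tế chuyên nghiệp"


def to_chat_example(record: Dict[str, str]) -> List[Dict[str, str]]:
    """Map one QA record to a system/user/assistant message list."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": record["question"].strip()},
        {"role": "assistant", "content": record["answer"].strip()},
    ]


# Placeholder record with hypothetical field names, not real dataset content.
record = {"question": " <question text> ", "answer": "<answer text> "}
chat = to_chat_example(record)
```

The resulting list can be passed to `tokenizer.apply_chat_template` to produce the training text.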
Training Hyperparameters
Training regime: bf16 mixed precision, with the following arguments:
from transformers import TrainingArguments

training_args = TrainingArguments(
    per_device_train_batch_size = 4,
    gradient_accumulation_steps = 4,
    warmup_steps = 100,
    num_train_epochs = 3,
    learning_rate = 2e-4,
    bf16 = True,
    logging_steps = 25,
    output_dir = "finetuned_medical_qa_full",
    optim = "adamw_8bit",
    # Evaluate and checkpoint every 100 steps; keep the two most recent
    # checkpoints and reload the one with the lowest eval loss at the end
    eval_strategy = "steps",
    eval_steps = 100,
    save_strategy = "steps",
    save_steps = 100,
    load_best_model_at_end = True,
    metric_for_best_model = "eval_loss",
    greater_is_better = False,
    save_total_limit = 2,
)
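To make these numbers concrete, the sketch below works out the effective batch size and the learning rate at a few steps. It assumes the default `lr_scheduler_type` of `TrainingArguments` (linear warmup followed by linear decay to zero), since the arguments above do not override it. The device count and dataset size are hypothetical placeholders, not values from this model card, and `linear_warmup_decay_lr` is an illustrative helper rather than a library function.

```python
import math

# Values taken from the training arguments above.
PER_DEVICE_TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 100
NUM_TRAIN_EPOCHS = 3
LEARNING_RATE = 2e-4

# Hypothetical values, NOT stated in the model card.
NUM_DEVICES = 1
NUM_TRAIN_EXAMPLES = 8_000

# Examples consumed per optimizer update.
effective_batch = PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES

# Approximate number of optimizer updates over the whole run.
steps_per_epoch = math.ceil(NUM_TRAIN_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * NUM_TRAIN_EPOCHS


def linear_warmup_decay_lr(step: int, total: int,
                           warmup: int = WARMUP_STEPS,
                           peak_lr: float = LEARNING_RATE) -> float:
    """Learning rate under linear warmup, then linear decay to zero."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    remaining = max(0, total - step)
    return peak_lr * remaining / max(1, total - warmup)


print(f"effective batch size: {effective_batch}")
print(f"total optimizer steps: {total_steps}")
for step in (0, 50, 100, 800, total_steps):
    print(f"step {step:>4}: lr = {linear_warmup_decay_lr(step, total_steps):.2e}")
```

With one device, each optimizer update sees 4 × 4 = 16 examples. The learning rate climbs to 2e-4 over the first 100 steps and then decays linearly for the rest of training.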
Framework versions
- PEFT 0.17.1
Model tree for meomeo163/medical_chatbot
Base model
arcee-ai/Arcee-VyLinh