How to use HelpingAI/hai3.1-checkpoint-0001 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="HelpingAI/hai3.1-checkpoint-0001", trust_remote_code=True)
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("HelpingAI/hai3.1-checkpoint-0001", trust_remote_code=True, dtype="auto")
How to use HelpingAI/hai3.1-checkpoint-0001 with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "HelpingAI/hai3.1-checkpoint-0001"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "HelpingAI/hai3.1-checkpoint-0001",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, run the model with Docker:
docker model run hf.co/HelpingAI/hai3.1-checkpoint-0001
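The curl call to the vLLM server can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the default base URL assumes the vLLM server from this section is running locally on port 8000 (use port 30000 for the SGLang server).

```python
import json
import urllib.request


def build_chat_request(
    prompt,
    base_url="http://localhost:8000",  # assumption: local vLLM server
    model="HelpingAI/hai3.1-checkpoint-0001",
):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # OpenAI-compatible servers return the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Usage (requires a running server):
# request = build_chat_request("What is the capital of France?")
# print(send_chat_request(request))
```

Building the request separately from sending it makes it easy to inspect the payload before any network call is made.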
How to use HelpingAI/hai3.1-checkpoint-0001 with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "HelpingAI/hai3.1-checkpoint-0001" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "HelpingAI/hai3.1-checkpoint-0001",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "HelpingAI/hai3.1-checkpoint-0001" \
--host 0.0.0.0 \
--port 30000
How to use HelpingAI/hai3.1-checkpoint-0001 with Docker Model Runner:
docker model run hf.co/HelpingAI/hai3.1-checkpoint-0001
CURRENTLY IN TRAINING :)
Currently, only the LLM section of this model is fully ready.
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
import torch
# Load model and tokenizer
model_name = "Abhaykoul/hai3.1-pretrainedv3"
# Set device to CUDA if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True, torch_dtype="auto")
model.to(device)
print(model)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
# Message role format for chat
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": """hlo"""},
]
# Apply chat template to format prompt
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Tokenize input and move to device
inputs = tokenizer(prompt, return_tensors="pt")
inputs = {k: v.to(device) for k, v in inputs.items()}
# Set up text streamer for live output
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
# Generate text with streaming
model.generate(
    **inputs,
    max_new_tokens=4089,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    streamer=streamer,
)
Classification section (still in training)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
ckpt = "Abhaykoul/hai3.1-pretrainedv3"
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(ckpt, trust_remote_code=True).to(device).eval()
tok = AutoTokenizer.from_pretrained(ckpt, trust_remote_code=True)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
text = "I am thrilled about my new job!"
enc = tok([text], padding=True, truncation=True, max_length=2048, return_tensors="pt")
enc = {k: v.to(device) for k, v in enc.items()}
with torch.no_grad():
    out = model(
        input_ids=enc["input_ids"],
        attention_mask=enc.get("attention_mask"),
        output_hidden_states=True,
        return_dict=True,
        use_cache=False,
    )
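# Final-layer hidden states, shape (batch, seq_len, hidden_size)
last = out.hidden_states[-1]
# Index of the last non-padding token in each sequence (assumes right padding)
idx = (enc["attention_mask"].sum(dim=1) - 1).clamp(min=0)
pooled = last[torch.arange(last.size(0)), idx]
# Project the pooled state through the model's classification head
logits = model.structured_lm_head(pooled)
pred_id = logits.argmax(dim=-1).item()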
print("Predicted class id:", pred_id)
# Map id -> label using your dataset’s label list, e.g.:
id2label = ["sadness","joy","love","anger","fear","surprise"] # dair-ai/emotion
print("Predicted label:", id2label[pred_id] if pred_id < len(id2label) else "unknown")
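The snippet above reports only the top class. As a rough sketch, the logits can also be turned into a ranked list of label probabilities with a softmax. The helper `rank_labels` is hypothetical, and it assumes the classification head returns one logit per class in the same order as `id2label`; indices without a known label are named `class_<i>`, mirroring the "unknown" fallback above.

```python
import math


def rank_labels(logits, labels):
    """Return (label, probability) pairs sorted from most to least likely."""
    # Subtract the max logit before exponentiating for numerical stability
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    # Fall back to a generic name when an index has no known label
    names = [labels[i] if i < len(labels) else f"class_{i}" for i in range(len(logits))]
    pairs = [(name, e / total) for name, e in zip(names, exps)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Usage with the tensors from the snippet above (single input text):
# for label, prob in rank_labels(logits[0].tolist(), id2label):
#     print(f"{label}: {prob:.3f}")
```

Because the classification section is still in training, these probabilities should be read as provisional.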
TTS layers in training
NOTE: This model uses the Qwen2 tokenizer.
This model contains layers from several of our different models. To align the merged layers, we ran post-training after merging them.