How to use from SGLang
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "HelpingAI/hai3.1-checkpoint-0001" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "HelpingAI/hai3.1-checkpoint-0001",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
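The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server was started with the host and port shown above and that the response follows the OpenAI chat-completions shape. The helper names `build_chat_request` and `ask` are hypothetical and not part of SGLang.

```python
import json
import urllib.request

# Assumed defaults, taken from the launch command above.
BASE_URL = "http://localhost:30000"
MODEL_ID = "HelpingAI/hai3.1-checkpoint-0001"


def build_chat_request(user_message, model=MODEL_ID, base_url=BASE_URL):
    """Build the same POST request the curl command sends (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(user_message, timeout=60):
    """Send the request and return the assistant's reply text.

    Assumes the response body follows the OpenAI chat-completions shape.
    """
    request = build_chat_request(user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (requires the server to be running):
# print(ask("What is the capital of France?"))
```

Building the request separately from sending it makes it possible to inspect the payload without a running server.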
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "HelpingAI/hai3.1-checkpoint-0001" \
        --host 0.0.0.0 \
        --port 30000

CURRENTLY IN TRAINING :)

Currently, only the LLM section of this model is fully ready.

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
import torch

# Load model and tokenizer
model_name = "Abhaykoul/hai3.1-pretrainedv3"

# Set device to CUDA if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True, torch_dtype="auto")
model.to(device)
print(model)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Message role format for chat
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
]

# Apply chat template to format prompt
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize input and move to device
inputs = tokenizer(prompt, return_tensors="pt")
inputs = {k: v.to(device) for k, v in inputs.items()}

# Set up text streamer for live output
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# Generate text with streaming
model.generate(
    **inputs,
    max_new_tokens=4089,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    streamer=streamer
)

Classification section (under training)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "Abhaykoul/hai3.1-pretrainedv3"
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained(ckpt, trust_remote_code=True).to(device).eval()
tok = AutoTokenizer.from_pretrained(ckpt, trust_remote_code=True)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token

text = "I am thrilled about my new job!"
enc = tok([text], padding=True, truncation=True, max_length=2048, return_tensors="pt")
enc = {k: v.to(device) for k, v in enc.items()}

with torch.no_grad():
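    # Forward pass that also returns the hidden states of every layer
    out = model(
        input_ids=enc["input_ids"],
        attention_mask=enc.get("attention_mask"),
        output_hidden_states=True,
        return_dict=True,
        use_cache=False,
    )
    last = out.hidden_states[-1]  # final-layer hidden states: (batch, seq_len, hidden)
    # Index of the last non-padded token in each sequence (assumes right padding)
    idx = (enc["attention_mask"].sum(dim=1) - 1).clamp(min=0)
    pooled = last[torch.arange(last.size(0)), idx]  # one vector per sequence
    # Classification head exposed by the model's remote code
    logits = model.structured_lm_head(pooled)
    pred_id = logits.argmax(dim=-1).item()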

print("Predicted class id:", pred_id)
# Map id -> label using your dataset’s label list, e.g.:
id2label = ["sadness","joy","love","anger","fear","surprise"]  # dair-ai/emotion
print("Predicted label:", id2label[pred_id] if pred_id < len(id2label) else "unknown")
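To see how confident the prediction is, and not only the top class, the logits can be turned into probabilities with a softmax. This is a minimal sketch in plain Python, assuming the head returns one logit per class in the label order listed above. The example logits are made up for illustration, and `softmax` and `rank_labels` are hypothetical helpers, not part of the model's code.

```python
import math

# Label order assumed from the example above (dair-ai/emotion).
ID2LABEL = ["sadness", "joy", "love", "anger", "fear", "surprise"]


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, id2label=ID2LABEL):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    pairs = []
    for i, p in enumerate(probs):
        # Fall back to a placeholder name if the head has more classes than labels
        label = id2label[i] if i < len(id2label) else f"unknown_{i}"
        pairs.append((label, p))
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Made-up logits for illustration only.
example_logits = [0.1, 2.3, 0.4, -1.0, -0.5, 0.2]
for label, prob in rank_labels(example_logits):
    print(f"{label}: {prob:.3f}")
```

With the model loaded as in the snippet above, `rank_labels(logits[0].tolist())` would rank the labels for the input sentence.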

TTS layers in training

NOTE: This model uses the Qwen2 tokenizer.

This model contains layers from several of our different models. To align the layers, we ran post-training after merging them.

Model size: 16B params (Safetensors)
Tensor type: BF16