from transformers import AutoModelForCausalLM, AutoTokenizer

# Directory the generations are appended to (path from the original training cluster).
BASE_PATH = "/fsx/loubna/projects/alignment-handbook/recipes/cosmo2/sft/data"
# Sampling settings passed to generate().
TEMPERATURE = 0.2
TOP_P = 0.9
CHECKPOINT = "HuggingFaceTB/smollm-350M-instruct-add-basics"

print(f"💾 Loading the model and tokenizer: {CHECKPOINT}...")
device = "cuda"
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model_s = AutoModelForCausalLM.from_pretrained(CHECKPOINT).to(device)

print("🧪 Testing single-turn conversations...")
L = [
    # Writing and general knowledge prompts
    "Discuss the ethical implications of using AI in hiring processes.",
    "Give me some tips to improve my time management skills?",
    "Write a short dialogue between a customer and a waiter at a restaurant.",
    "wassup?",
    "Tell me a joke",
    "Hi, what are some popular dishes from Japan?",
    "What is the capital of Switzerland?",
    "What is the capital of France?",
    "What's the capital of Portugal?",
    "What is the capital of Morocco?",
    "How do I make pancakes?",
    "Write a poem about Helium",
    "Do you think it's important for a company to have a strong company culture? Why or why not?",
    "What is your favorite book?",
    "What is the most interesting fact you know?",
    "What is your favorite movie?",
    # Science prompts
    "Can you tell me what is gravity?",
    "Who discovered gravity?",
    "How does a rainbow form?",
    "What are the three states of matter?",
    "Why is the sky blue?",
    "What is the water cycle?",
    "How do magnets work?",
    "What is buoyancy?",
    "What is the speed of light?",
    "What's 2+2?",
    "what's the sum of 2 and 2?",
    "what's the sum of 2 and 3?",
    "What is the term for the process by which plants make their own food?",
    "If you have 8 apples and you give away 3, how many apples do you have left?",
    # Python prompts
    "How do I define a function in Python?",
    "Can you explain what a dictionary is in Python?",
    "Write a sort alrogithm in Python",
    "Write a fibonacci sequence in Python",
    "How do I read a file in Python?",
    "How do I make everything uppercase in Python?",
    "implement bubble sort in Python",
    # Creative prompts
    "Write a short story about a time traveler",
    "Describe a futuristic city in three sentences",
    "Describe a new color that doesn't exist",
    "Create a slogan for a time machine company",
    "Describe a world where plants can speak",
]

for prompt in L:
    print(f"🔮 {prompt}")
    # Wrap the prompt as a single-turn chat and render it with the model's chat template.
    messages = [{"role": "user", "content": prompt}]
    input_text = tokenizer.apply_chat_template(messages, tokenize=False)
    inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
    # Sample up to 200 new tokens with nucleus sampling.
    outputs = model_s.generate(
        inputs, max_new_tokens=200, top_p=TOP_P, do_sample=True, temperature=TEMPERATURE
    )
    # Append the full decoded sequence (prompt and completion) to the results file.
    with open(
        f"{BASE_PATH}/{CHECKPOINT.split('/')[-1]}_temp_{TEMPERATURE}_topp{TOP_P}.txt",
        "a",
    ) as f:
        f.write("=" * 50 + "\n")
        f.write(tokenizer.decode(outputs[0]))
        f.write("\n")
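The loop above rebuilds the output path and reopens the results file for every prompt. A minimal sketch of one way to factor that step out is given here. The helper names `results_path` and `append_generations` are hypothetical and not part of the original script, and the placeholder strings stand in for `tokenizer.decode(outputs[0])` so the snippet runs without a GPU or a model download.

```python
import os
import tempfile


def results_path(base_path, checkpoint, temperature, top_p):
    """Build the results filename with the same pattern the script uses."""
    name = checkpoint.split("/")[-1]  # drop the organization prefix
    return os.path.join(base_path, f"{name}_temp_{temperature}_topp{top_p}.txt")


def append_generations(path, generations):
    """Append each generation after a 50-character '=' separator line."""
    with open(path, "a", encoding="utf-8") as f:
        for text in generations:
            f.write("=" * 50 + "\n")
            f.write(text)
            f.write("\n")


if __name__ == "__main__":
    # Placeholder strings stand in for decoded model outputs (assumption for this demo).
    with tempfile.TemporaryDirectory() as tmp:
        path = results_path(
            tmp, "HuggingFaceTB/smollm-350M-instruct-add-basics", 0.2, 0.9
        )
        append_generations(path, ["first answer", "second answer"])
        with open(path, encoding="utf-8") as f:
            print(f.read())
```

Collecting the decoded outputs in a list and writing them in one call keeps the file format identical while opening the file only once. Appending inside the loop, as the script does, has the advantage that partial results survive an interrupted run.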