LLaMA-3.2-1B-Instruct Fine-Tuned Model
Model Card for Hugging Face Repository
Model Summary
This is a fine-tuned version of the LLaMA-3.2-1B-Instruct model. It was fine-tuned on the kanhatakeyama/wizardlm8x22b-logical-math-coding-sft dataset and targets logical reasoning, mathematical problem-solving, and coding tasks. Training was performed with Unsloth on Google Colab.
Model Details
- Model Name: LLaMA-3.2-1B-Instruct (Fine-tuned)
- Base Model: LLaMA-3.2-1B-Instruct
- Fine-Tuning Dataset: kanhatakeyama/wizardlm8x22b-logical-math-coding-sft
- Fine-Tuning Framework: Unsloth
- Parameters: 1 Billion
- Domain: Logical Reasoning, Mathematics, Coding
- Tags: llama, fine-tuning, instruction-following, math, coding, logical-reasoning, unsloth
Fine-Tuning Dataset
The fine-tuning dataset, kanhatakeyama/wizardlm8x22b-logical-math-coding-sft, is curated for advanced reasoning tasks. It contains:
- Logical reasoning scenarios
- Step-by-step mathematical solutions
- Complex code generation and debugging examples
Dataset Link: kanhatakeyama/wizardlm8x22b-logical-math-coding-sft
Intended Use
This model is ideal for tasks such as:
- Logical Problem Solving: Derive conclusions and explanations for logical questions.
- Mathematics: Solve algebra, calculus, and other mathematical problems.
- Coding: Generate, debug, and explain programming code in various languages.
- Instruction-Following: Handle user queries with clear and concise answers.
Example Applications:
- AI tutors
- Logical reasoning assistants
- Math-solving bots
- Code generation and debugging tools
Usage
Installation
To use this model, install the required dependencies:
pip install transformers datasets torch accelerate
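Before loading the model, it can help to confirm that the packages from the command above are actually installed. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is illustrative and not part of any of these libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the pip command above.
REQUIRED = ("transformers", "datasets", "torch", "accelerate")


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Any entry reported as not installed points to a package that the pip command still needs to provide.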
Loading the Model
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the fine-tuned model and tokenizer
model_name = "ai-nexuz/llama-3.2-1b-instruct-fine-tuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
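The call above loads the weights with default settings. On a GPU such as the T4 listed under Hardware, half precision usually reduces memory use. The sketch below is one way to choose loading options; it assumes `torch` is installed and that your Transformers version accepts the `torch_dtype` and `device_map` keyword arguments (`device_map="auto"` relies on `accelerate`). `pick_load_kwargs` is a hypothetical helper name.

```python
def pick_load_kwargs():
    """Choose from_pretrained keyword arguments based on the available hardware."""
    try:
        import torch
    except ImportError:
        # Without torch there is nothing to choose; fall back to library defaults.
        return {}
    if torch.cuda.is_available():
        # float16 is chosen because the T4 has no native bfloat16 support.
        return {"torch_dtype": torch.float16, "device_map": "auto"}
    # On CPU, stay in full precision.
    return {"torch_dtype": torch.float32}
```

The result can then be passed along when loading, for example `AutoModelForCausalLM.from_pretrained(model_name, **pick_load_kwargs())`.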
Generating Outputs
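prompt = "Solve this equation: 2x + 3 = 7. Find x."
inputs = tokenizer(prompt, return_tensors="pt")
# max_length counts the prompt tokens as well as the generated ones
outputs = model.generate(**inputs, max_length=100)
# The decoded text contains the prompt followed by the model's continuation
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)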
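Because the base model is an instruct model, prompts are usually wrapped in its chat template rather than passed as raw text. The sketch below assumes the tokenizer in this repository ships a chat template (this card does not confirm it) and that `model` and `tokenizer` were loaded as shown earlier. `build_chat_messages` and `generate_reply` are illustrative names, not library functions.

```python
def build_chat_messages(prompt, system_prompt=None):
    """Wrap a user prompt (and an optional system prompt) as chat messages."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    return messages


def generate_reply(model, tokenizer, prompt, max_new_tokens=128):
    """Generate a reply and return only the newly generated text."""
    inputs = tokenizer.apply_chat_template(
        build_chat_messages(prompt),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    # max_new_tokens limits only the generated part, not the prompt.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Slice off the prompt tokens so the reply does not echo the input.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `generate_reply(model, tokenizer, "Solve this equation: 2x + 3 = 7. Find x.")` would return the model's answer without the prompt text.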
Model Training
Hardware
- Platform: Google Colab Pro
- GPU: NVIDIA Tesla T4
Training Configuration
- Batch Size: 32
- Epochs: 1
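The card lists a batch size of 32 but not how it was reached on a single T4. One common approach is gradient accumulation, where the effective batch size is the per-device batch size multiplied by the number of accumulation steps. The numbers below (a per-device batch of 4, 8 accumulation steps, and a 10,000-example dataset) are assumptions for illustration only, not the author's actual settings.

```python
import math


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of examples that contribute to one optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def optimizer_steps_per_epoch(num_examples, batch_size):
    """Approximate optimizer updates in one pass over the dataset."""
    return math.ceil(num_examples / batch_size)


if __name__ == "__main__":
    # Hypothetical split: 4 examples per step, accumulated over 8 steps.
    batch = effective_batch_size(per_device_batch=4, grad_accum_steps=8)
    print("effective batch size:", batch)
    # Hypothetical dataset size; the real size is not stated in this card.
    print("steps per epoch:", optimizer_steps_per_epoch(10_000, batch))
```

With one epoch, the total number of updates is roughly the steps-per-epoch figure; the exact count depends on how the trainer handles the last partial batch.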
Frameworks Used
- Unsloth: For efficient training
- Hugging Face Transformers: For model and tokenizer handling
Limitations
Although this model is tuned for logical reasoning, mathematics, and coding tasks, it has some limitations:
- May produce inaccurate results for ambiguous or poorly-defined prompts.
- Performance may degrade for highly specialized or niche coding languages.
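Since answers can be inaccurate, numeric results are worth checking independently where possible. As a minimal sketch, the helpers below pull the last number out of a response and substitute it into a linear equation of the form a*x + b = c, such as the 2x + 3 = 7 prompt used earlier. Both function names are hypothetical, and the regular expression is a simple heuristic that will misread some responses.

```python
import re
from fractions import Fraction


def extract_last_number(text):
    """Return the last integer or decimal in the text as a Fraction, or None."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text)
    return Fraction(matches[-1]) if matches else None


def satisfies_linear_equation(a, b, c, x):
    """Check whether x solves a*x + b = c exactly."""
    return x is not None and a * x + b == c


if __name__ == "__main__":
    # Example response text written by hand, not actual model output.
    response = "Subtract 3 from both sides: 2x = 4. Divide by 2: x = 2."
    x = extract_last_number(response)
    print(x, satisfies_linear_equation(2, 3, 7, x))
```

Using `Fraction` avoids floating-point rounding when the claimed solution is a decimal.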
Deployment
Using Gradio for Web UI
pip install gradio
import gradio as gr
def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_length=200)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
gr.Interface(fn=generate_response, inputs="text", outputs="text").launch()
Hugging Face Inference API
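The model may also be available through the Hugging Face Inference API for hosted deployment. The snippet below, however, uses the Transformers pipeline helper, which downloads the weights and runs generation locally: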
from transformers import pipeline
pipe = pipeline("text-generation", model="ai-nexuz/llama-3.2-1b-instruct-fine-tuned")
result = pipe("Explain the concept of recursion in programming.")
print(result)
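The pipeline returns a list of dictionaries rather than a plain string. The sketch below extracts the generated text; it assumes the usual `text-generation` output shape, where `generated_text` is a string for a plain prompt and a list of chat messages when the input was a message list. `extract_generated_text` is an illustrative helper, not a Transformers function.

```python
def extract_generated_text(result):
    """Return the generated text from the first item of a pipeline result."""
    generated = result[0]["generated_text"]
    if isinstance(generated, str):
        # Plain prompt input: the text is returned directly.
        return generated
    # Chat-style input: the last message is assumed to be the assistant reply.
    return generated[-1]["content"]
```

For the call above, `extract_generated_text(result)` would give the text that `print(result)` shows nested inside the list.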
Acknowledgements
This fine-tuning work was made possible by:
- Hugging Face for their exceptional library and dataset hosting.
- Unsloth for providing an efficient fine-tuning framework.
- Google Colab for GPU resources.
Citation
If you use this model in your research or project, please cite it as:
@model{llama31b_instruct_finetuned,
title={Fine-Tuned LLaMA-3.2-1B-Instruct},
author={Your Name},
year={2024},
url={https://huggingface.co/ai-nexuz/llama-3.2-1b-instruct-fine-tuned},
}
Licensing
This model is released under the Apache 2.0 License. See LICENSE for details.
Tags: llama, fine-tuning, math, coding, logical-reasoning, instruction-following, transformers
Summary:
A fine-tuned version of LLaMA-3.2-1B-Instruct specializing in logical reasoning, math problem-solving, and code generation. Perfect for AI-driven tutoring, programming assistance, and logical problem-solving tasks.