Arduino + Edge AI Combined Qwen 0.5B LoRA Adapter
This adapter fine-tunes Qwen/Qwen2.5-Coder-0.5B-Instruct on a combined Arduino + Edge AI corpus to provide a single offline-capable model for both Arduino documentation guidance and Edge AI tooling knowledge.
Model Details
- Model type: LoRA adapter
- Base model: Qwen/Qwen2.5-Coder-0.5B-Instruct
- Adapter size: 0.5B
- Training corpora: subset/subset_arduino + subset/subset_edgeai
- Task: text generation / coding assistance
- Library: peft
- Pipeline: text-generation
Intended Use
This adapter is intended for users who need offline or local inference with both Arduino documentation knowledge and Edge AI reference material in one model.
Training Details
- Training strategy: LoRA fine-tuning on combined subset corpora
- Hyperparameters: 3 epochs, batch size 2, gradient accumulation 4, learning rate 1e-4
- Data processed: combined subset of Arduino docs and Mintlify Edge AI docs
- Output directory: adapter-combined-subset-0.5b
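The hyperparameters listed above imply an effective batch size of 8 sequences per optimizer update (batch size 2 × gradient accumulation 4). The sketch below works through that arithmetic. It assumes a single training device, and the corpus size used in the example is hypothetical, since the card does not state the number of training examples.

```python
import math

# Values taken from the Training Details list above.
HYPERPARAMS = {
    "epochs": 3,
    "batch_size": 2,
    "grad_accum": 4,
    "learning_rate": 1e-4,
}


def effective_batch_size(batch_size: int, grad_accum: int, num_devices: int = 1) -> int:
    """Number of sequences that contribute to each optimizer update."""
    return batch_size * grad_accum * num_devices


def optimizer_steps(num_examples: int, epochs: int, batch_size: int, grad_accum: int) -> int:
    """Approximate optimizer updates for a full run on one device.

    A partial final accumulation group is counted as one step.
    """
    micro_batches = math.ceil(num_examples / batch_size)
    steps_per_epoch = math.ceil(micro_batches / grad_accum)
    return steps_per_epoch * epochs


# HYPOTHETICAL corpus size, for illustration only (not from the model card).
num_examples = 1000

print(effective_batch_size(HYPERPARAMS["batch_size"], HYPERPARAMS["grad_accum"]))
print(optimizer_steps(num_examples, HYPERPARAMS["epochs"],
                      HYPERPARAMS["batch_size"], HYPERPARAMS["grad_accum"]))
```

Exact step counts depend on the training framework's handling of the last partial batch, so treat the second number as an estimate.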
Evaluation
QA Evaluation
- Dataset: 10 fixed Arduino-domain questions
- Base model avg response time: 6.41s
- Adapter avg response time: 6.57s
- Base keyword hits avg: 7.0
- Adapter keyword hits avg: 5.5
- Base code snippet presence: 9/10
- Adapter code snippet presence: 1/10
- Result file: eval_combined_0.5b_results.csv
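The card does not include the scoring script behind the keyword-hit and code-snippet numbers. As a rough illustration of how such metrics could be computed, the sketch below assumes case-insensitive substring matching for keywords and a simple heuristic for code detection (a fenced block or an Arduino `setup`/`loop` definition). The function names, the heuristic, and the sample data are all hypothetical.

```python
import re

FENCE = "`" * 3  # markdown code fence marker

# ASSUMED heuristic: a fenced block or a typical Arduino sketch entry point.
CODE_PATTERN = re.compile(re.escape(FENCE) + r"|void\s+(?:setup|loop)\s*\(")


def keyword_hits(response: str, keywords: list) -> int:
    """Count how many expected keywords appear in the response (case-insensitive)."""
    text = response.lower()
    return sum(1 for kw in keywords if kw.lower() in text)


def has_code_snippet(response: str) -> bool:
    """Return True if the response appears to contain code."""
    return CODE_PATTERN.search(response) is not None


def summarize(responses: list, keywords_per_question: list) -> dict:
    """Aggregate per-question scores into the two averages reported in the card."""
    hits = [keyword_hits(r, kws) for r, kws in zip(responses, keywords_per_question)]
    return {
        "avg_keyword_hits": sum(hits) / len(hits),
        "code_snippets": sum(has_code_snippet(r) for r in responses),
        "questions": len(responses),
    }


# HYPOTHETICAL responses and keyword lists, not taken from the evaluation set.
responses = [
    "Use pinMode(13, OUTPUT) inside void setup() { }",
    "Call digitalWrite to toggle the LED.",
]
keywords = [
    ["pinMode", "OUTPUT", "setup"],
    ["digitalWrite", "LED", "delay"],
]
print(summarize(responses, keywords))
```

Substring matching is crude (it rewards mentioning a keyword even in a wrong answer), which is one reason a 10-question set like this is best read as a sanity check.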
Perplexity Evaluation
- Test corpus: subset/subset_edgeai (121 docs)
- Base mean PPL: 9.46
- Adapter mean PPL: 6.65
- Mean delta: -2.81
- Adapter wins: 117 / 121 files
- Result file: ppl_combined_0.5b_subset_edgeai_results.csv
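The card does not say how the per-file perplexities were computed. The sketch below assumes the standard definition, the exponential of the mean negative log-likelihood per token, and shows how a mean delta and a win count like those above could be derived from per-file scores. The file names and values in the example are hypothetical.

```python
import math


def perplexity(token_logprobs: list) -> float:
    """PPL = exp(-mean(log p(token))), with natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def compare(base_ppl: dict, adapter_ppl: dict) -> dict:
    """Summarize per-file perplexities; a negative delta means the adapter is better."""
    files = sorted(base_ppl)
    deltas = [adapter_ppl[f] - base_ppl[f] for f in files]
    return {
        "base_mean": sum(base_ppl[f] for f in files) / len(files),
        "adapter_mean": sum(adapter_ppl[f] for f in files) / len(files),
        "mean_delta": sum(deltas) / len(deltas),
        "adapter_wins": sum(d < 0 for d in deltas),
        "files": len(files),
    }


# HYPOTHETICAL per-file scores, for illustration only.
base = {"a.md": 10.0, "b.md": 8.0, "c.md": 6.0}
adapter = {"a.md": 7.0, "b.md": 6.0, "c.md": 6.5}
print(compare(base, adapter))

# A model that gives every token probability 0.5 has perplexity 2.
print(perplexity([math.log(0.5)] * 4))
```

Because the mean of per-file deltas equals the difference of the per-file means, the reported mean delta of -2.81 matches 6.65 - 9.46.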
Notes
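- Lower perplexity means the adapter assigns higher probability to Edge AI domain text than the base model does.
- The QA evaluation uses only 10 fixed questions, so treat it as a rough check. On that set the adapter scored below the base model on keyword hits (5.5 vs 7.0) and code snippet presence (1/10 vs 9/10). The perplexity results, measured over 121 files, are the stronger numeric signal for domain adaptation.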
References
- Offline SLMs for Edge AI Development - Part 1: Qwen LoRA Adapter Fine-Tuned on Edge Impulse Docs
- Offline SLMs for Edge AI Development - Part 2: RAG as an Enhancement for Fine-Tuned Models with FAISS
- Offline SLMs for Edge AI Development - Part 3: Agentic Coding with an Arduino Fine-Tuned Adapter via llama.cpp and OpenCode
How to use
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from peft import PeftModel

# Load the base model and its tokenizer.
base = "Qwen/Qwen2.5-Coder-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Attach the LoRA adapter, then merge its weights into the base model for plain inference.
model = PeftModel.from_pretrained(model, "eoinedge/arduino-edgeai-qwen-combined-0.5b-lora")
model = model.merge_and_unload()

# Run text generation on CPU.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device="cpu")
print(pipe("How do I read a DHT11 sensor in Arduino?", max_new_tokens=200)[0]["generated_text"])
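Since the base model is an instruct model, it may respond better when the prompt goes through its chat template instead of being passed as a raw string. The variant below is a sketch under that assumption: the card does not say whether the adapter was trained on chat-formatted data, and `build_messages` and `generate` are hypothetical helper names. Recent versions of the Transformers text-generation pipeline accept a list of role/content messages and apply the tokenizer's chat template.

```python
from typing import Optional


def build_messages(question: str, system_prompt: Optional[str] = None) -> list:
    """Build a chat-style message list for the text-generation pipeline."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": question})
    return messages


def generate(question: str, max_new_tokens: int = 200) -> str:
    """Load base model + adapter and return only the assistant's reply."""
    # Heavy imports are kept local so build_messages works without them.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    base = "Qwen/Qwen2.5-Coder-0.5B-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(base)
    model = PeftModel.from_pretrained(
        model, "eoinedge/arduino-edgeai-qwen-combined-0.5b-lora"
    )
    model = model.merge_and_unload()

    pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device="cpu")
    out = pipe(build_messages(question), max_new_tokens=max_new_tokens)
    # With chat input, generated_text holds the whole conversation;
    # the last message is the newly generated assistant turn.
    return out[0]["generated_text"][-1]["content"]
```

Calling `generate("How do I read a DHT11 sensor in Arduino?")` downloads the base model and adapter on first use and returns the reply text without the echoed prompt.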
License
This adapter is derived from the base model and follows the licensing terms of the base model and related adapter resources.