---
license: openrail
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- startup
- business
- advisor
- unsloth
- qlora
- instruction-tuning
---

# 🚀 Startup Advisor AI (Qwen2.5-7B Fine-Tuned)

**Startup Advisor AI** is a fine-tuned language model designed to act as a startup consultant. Built by distilling knowledge from core business literature (e.g., *The Lean Startup*, *Zero to One*, *Blitzscaling*), it replaces the base model's generic responses with structured, actionable, framework-grounded advice for founders.

Developed by **Adam Abu Hamdan** for the **AMD Developer Hackathon**.

## 🧠 Model Details
- **Base Model:** `Qwen/Qwen2.5-7B-Instruct`
- **Fine-Tuning Framework:** Unsloth & TRL (SFTTrainer)
- **Fine-Tuning Method:** QLoRA (LoRA adapters trained on a 4-bit quantized base model)
- **Hardware Used:** NVIDIA T4 GPU (Google Colab)
- **Dataset:** 1,564 ChatML-formatted instruction examples written to simulate an expert human advisor.

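For readers unfamiliar with ChatML, the sketch below shows how one role/content record could be rendered into ChatML text. The `to_chatml` helper and the sample record are illustrative assumptions, not the actual training data or preprocessing code; the user prompt is borrowed from this card's comparison example and the assistant text is abridged.

```python
from typing import Dict, List


def to_chatml(messages: List[Dict[str, str]]) -> str:
    """Render role/content messages as ChatML text (hypothetical helper)."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


# Hypothetical record, for illustration only (not taken from the dataset)
record = [
    {"role": "system", "content": "You are an expert startup and business advisor."},
    {"role": "user", "content": "Give me actionable advice on blitzscaling speed."},
    {"role": "assistant", "content": "Prioritize speed over efficiency in blitzscaling."},
]

print(to_chatml(record))
```

In practice, `tokenizer.apply_chat_template` produces this layout from the same list of messages, so a hand-written renderer like this is mainly useful for inspecting records.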
## ✨ Before vs. After Fine-Tuning

The example below shows how fine-tuning shifts the model from generic, "wiki-style" lists to concise, structured advice with real-world examples.

**Prompt:**
> *I'm at the Scaling stage of building my startup. Give me actionable advice on blitzscaling speed.*

### ❌ Before (Raw Qwen2.5-7B-Instruct):
> "Blitzscaling is a strategy for rapid growth and scaling that emphasizes moving quickly to capture market share before competitors can respond. Here are some actionable steps...
> 1. Define Your North Star Metrics...
> 2. Build a High-Growth Team...
> 3. Prioritize Customer Feedback..." *(Generic, verbose)*

### ✅ After (Startup Advisor AI):
> "Prioritize speed over efficiency in blitzscaling—cut cycles, ship faster, and tolerate short-term waste for long-term dominance.
> 
> **Why this matters:** Efficiency can be rebuilt later; speed defines who survives hypergrowth.
> 
> **Real-world example (2025):** Vercel shipped AI preview deployments instantly even with unfinished edge layers.
> 
> **Avoid:** `slow_cycles`, `perfecting_features`, `overanalyzing_risks`" *(Actionable, structured, framework-aligned)*

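Because the fine-tuned answer follows a regular layout (a headline, bold-labeled sections, and backticked tags under **Avoid**), downstream code can split it into fields. The following is a minimal sketch that assumes responses keep this layout; `parse_advice` is a hypothetical helper, and the sample text is abridged from the example above.

```python
import re
from typing import Dict, List


def parse_advice(text: str) -> Dict[str, object]:
    """Split a response into headline, labeled sections, and avoid-tags."""
    sections: Dict[str, str] = {}
    headline = ""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Labeled sections look like "**Label:** body"
        match = re.match(r"\*\*(.+?):\*\*\s*(.*)", line)
        if match:
            sections[match.group(1)] = match.group(2)
        elif line and not headline:
            # The first non-empty, unlabeled line is treated as the headline
            headline = line
    # Tags under "Avoid" are wrapped in backticks
    avoid: List[str] = re.findall(r"`([^`]+)`", sections.get("Avoid", ""))
    return {"headline": headline, "sections": sections, "avoid": avoid}


sample = (
    "Prioritize speed over efficiency in blitzscaling.\n\n"
    "**Why this matters:** Efficiency can be rebuilt later.\n\n"
    "**Avoid:** `slow_cycles`, `perfecting_features`, `overanalyzing_risks`"
)
parsed = parse_advice(sample)
print(parsed["avoid"])
```

Model output is not guaranteed to follow this layout on every prompt, so callers should handle a missing section or an empty `avoid` list.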
## 💻 How to Use (Inference Code)

Load the base model in 4-bit with `transformers` and `bitsandbytes`, then apply the LoRA adapter with `peft`:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_model_id = "Qwen/Qwen2.5-7B-Instruct"
adapter_id = "adamabuhamdan/startup-advisor-qwen2.5-7b-lora"

# Load with 4-bit quantization for memory efficiency
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16
)

print("Loading base model...")
base_model = AutoModelForCausalLM.from_pretrained(base_model_id, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

print("Applying LoRA adapters...")
model = PeftModel.from_pretrained(base_model, adapter_id)

# Inference
messages = [
    {"role": "system", "content": "You are an expert startup and business advisor. Provide concise, actionable advice grounded in proven frameworks and real-world examples from leading companies."},
    {"role": "user", "content": "How do I validate my SaaS idea without writing code?"}
]

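# Tokenize with the chat template; return_dict=True also returns the attention mask
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True).strip())
```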