🚀 Startup Advisor AI (Qwen2.5-7B Fine-Tuned)

Startup Advisor AI is a fine-tuned language model that acts as a startup consultant. Trained on instruction data distilled from core business literature (e.g., The Lean Startup, Zero to One, Blitzscaling), it replaces generic LLM responses with structured, actionable, framework-grounded advice for founders.

Developed by Adam Abu Hamdan for the AMD Developer Hackathon.

🧠 Model Details

Base Model: Qwen/Qwen2.5-7B-Instruct
Fine-Tuning Framework: Unsloth & TRL (SFTTrainer)
Quantization: QLoRA (4-bit precision)
Hardware Used: NVIDIA T4 GPU (Google Colab)
Dataset: 1,564 ChatML-formatted instruction examples written to simulate an expert human advisor.
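
For readers unfamiliar with ChatML, the sketch below shows how one training example might be laid out in that format. The `to_chatml` helper and the sample record are hypothetical illustrations; the actual dataset entries are not reproduced in this card, and in practice the tokenizer's chat template normally performs this rendering.

```python
from typing import Dict, List


def to_chatml(messages: List[Dict[str, str]]) -> str:
    """Render {"role", "content"} messages as ChatML text (illustrative helper)."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


# Hypothetical record: the assistant text is a placeholder, not real dataset content.
example = [
    {"role": "system", "content": "You are an expert startup and business advisor."},
    {"role": "user", "content": "How do I validate my SaaS idea without writing code?"},
    {"role": "assistant", "content": "<advisor answer goes here>"},
]

print(to_chatml(example))
```

Each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers, which is the layout Qwen2.5 instruct models expect.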

✨ Before vs. After Fine-Tuning

Fine-tuning shifted the model from generic, "wiki-style" lists to sharp, structured insights with real-world examples. Compare the two responses to the same prompt below.

Prompt: I'm at the Scaling stage of building my startup. Give me actionable advice on blitzscaling speed.

❌ Before (Raw Qwen2.5-7B-Instruct):

"Blitzscaling is a strategy for rapid growth and scaling that emphasizes moving quickly to capture market share before competitors can respond. Here are some actionable steps...

Define Your North Star Metrics...

Build a High-Growth Team...

Prioritize Customer Feedback..." (Generic, verbose)

✅ After (Startup Advisor AI):

"Prioritize speed over efficiency in blitzscaling—cut cycles, ship faster, and tolerate short-term waste for long-term dominance.

Why this matters: Efficiency can be rebuilt later; speed defines who survives hypergrowth.

Real-world example (2025): Vercel shipped AI preview deployments instantly even with unfinished edge layers.

Avoid: slow_cycles, perfecting_features, overanalyzing_risks" (Actionable, structured, framework-aligned)
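
Because the fine-tuned output follows a recurring layout (advice, "Why this matters", "Real-world example", "Avoid"), it can be split into fields for downstream use. The sketch below is one possible way to do that; `parse_advice` and its field names are hypothetical, and it assumes every response uses exactly the labels seen in the example above, which the model does not guarantee.

```python
import re
from typing import Dict, List, Union

Advice = Dict[str, Union[str, List[str]]]


def parse_advice(text: str) -> Advice:
    """Split a response into labelled fields (assumes the labels shown above)."""
    fields: Advice = {"advice": "", "why": "", "example": "", "avoid": []}
    for line in (ln.strip() for ln in text.splitlines()):
        if not line:
            continue
        if line.startswith("Why this matters:"):
            fields["why"] = line.split(":", 1)[1].strip()
        elif re.match(r"Real-world example(\s*\(\d{4}\))?:", line):
            fields["example"] = line.split(":", 1)[1].strip()
        elif line.startswith("Avoid:"):
            items = line.split(":", 1)[1].split(",")
            fields["avoid"] = [item.strip() for item in items if item.strip()]
        elif not fields["advice"]:
            # The first unlabelled line is treated as the headline advice.
            fields["advice"] = line
    return fields


# Abridged copy of the "After" response shown above.
sample = """Prioritize speed over efficiency in blitzscaling.

Why this matters: Efficiency can be rebuilt later; speed defines who survives hypergrowth.

Real-world example (2025): Vercel shipped AI preview deployments instantly even with unfinished edge layers.

Avoid: slow_cycles, perfecting_features, overanalyzing_risks"""

parsed = parse_advice(sample)
print(parsed["avoid"])
```

Missing labels simply leave the corresponding field empty, so the caller can detect responses that drift from the expected layout.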

💻 How to Use (Inference Code)

Load the base model in 4-bit with transformers and bitsandbytes, then apply the LoRA adapter with peft:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_model_id = "Qwen/Qwen2.5-7B-Instruct"
adapter_id = "adamabuhamdan/startup-advisor-qwen2.5-7b-lora"

# Load with 4-bit quantization for memory efficiency
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16
)

print("Loading base model...")
base_model = AutoModelForCausalLM.from_pretrained(base_model_id, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

print("Applying LoRA adapters...")
model = PeftModel.from_pretrained(base_model, adapter_id)

# Inference
messages = [
    {"role": "system", "content": "You are an expert startup and business advisor. Provide concise, actionable advice grounded in proven frameworks and real-world examples from leading companies."},
    {"role": "user", "content": "How do I validate my SaaS idea without writing code?"}
]

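# Build the prompt with the chat template; return_dict=True also returns the attention mask
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)  # follow device_map="auto" instead of hard-coding "cuda"

# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens rather than splitting on "assistant\n"
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True).strip())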
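
As a rough check on why the inference code loads the base model in 4-bit, the sketch below estimates the memory taken by the weights alone. The parameter count of about 7.6 billion is an approximation, `weight_memory_gib` is a hypothetical helper, and the estimate ignores quantization constants, activations, the KV cache, and the LoRA adapter, so real usage will be higher.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


N_PARAMS = 7.6e9  # assumption: approximate parameter count of Qwen2.5-7B

for label, bits in [("fp16", 16), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions the fp16 weights alone nearly fill the 16 GB of a T4, while the 4-bit weights take roughly a quarter of that and leave headroom for generation.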