---
license: openrail
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- startup
- business
- advisor
- unsloth
- qlora
- instruction-tuning
---

# 🚀 Startup Advisor AI (Qwen2.5-7B Fine-Tuned)

**Startup Advisor AI** is a highly specialized, fine-tuned language model designed to act as an expert startup consultant. Built by distilling knowledge from core business literature (e.g., *The Lean Startup*, *Zero to One*, *Blitzscaling*), it transforms generic LLM responses into highly structured, actionable, and framework-grounded advice for founders.

Developed by **Adam Abu Hamdan** for the **AMD Developer Hackathon**.

## 🧠 Model Details

- **Base Model:** `Qwen/Qwen2.5-7B-Instruct`
- **Fine-Tuning Framework:** Unsloth & TRL (SFTTrainer)
- **Quantization:** QLoRA (4-bit precision)
- **Hardware Used:** NVIDIA T4 GPU (Google Colab)
- **Dataset:** 1,564 high-quality ChatML instructions meticulously crafted to simulate an expert human advisor.

## ✨ Before vs. After Fine-Tuning

To understand the impact of this fine-tuning, observe how the model shifted from giving generic, "wiki-style" lists to delivering sharp, structured insights with real-world examples.

**Prompt:**
> *I'm at the Scaling stage of building my startup. Give me actionable advice on blitzscaling speed.*

### ❌ Before (Raw Qwen2.5-7B-Instruct):

> "Blitzscaling is a strategy for rapid growth and scaling that emphasizes moving quickly to capture market share before competitors can respond. Here are some actionable steps...
> 1. Define Your North Star Metrics...
> 2. Build a High-Growth Team...
> 3. Prioritize Customer Feedback..."

*(Generic, verbose)*

### ✅ After (Startup Advisor AI):

> "Prioritize speed over efficiency in blitzscaling—cut cycles, ship faster, and tolerate short-term waste for long-term dominance.
>
> **Why this matters:** Efficiency can be rebuilt later; speed defines who survives hypergrowth.
>
> **Real-world example (2025):** Vercel shipped AI preview deployments instantly even with unfinished edge layers.
>
> **Avoid:** `slow_cycles`, `perfecting_features`, `overanalyzing_risks`"

*(Actionable, structured, framework-aligned)*

## 💻 How to Use (Inference Code)

You can run this model efficiently using `transformers` and `peft`:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_model_id = "Qwen/Qwen2.5-7B-Instruct"
adapter_id = "adamabuhamdan/startup-advisor-qwen2.5-7b-lora"

# Load with 4-bit quantization for memory efficiency
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

print("Loading base model...")
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

print("Applying LoRA adapters...")
model = PeftModel.from_pretrained(base_model, adapter_id)

# Inference
messages = [
    {"role": "system", "content": "You are an expert startup and business advisor. Provide concise, actionable advice grounded in proven frameworks and real-world examples from leading companies."},
    {"role": "user", "content": "How do I validate my SaaS idea without writing code?"},
]

# Render the chat template and move the tensors to the device the model was placed on
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Enable sampling explicitly so that `temperature` takes effect
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip())
```
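
The `apply_chat_template` call in the inference code renders the message list into the ChatML layout that Qwen2.5 models expect, which is also the format named for the training dataset. As a rough illustration of that layout, here is a minimal pure-Python sketch. The helper name `render_chatml` is hypothetical and not part of `transformers`, and the shortened system prompt is only for readability. The tokenizer's own template remains the authoritative source and may differ in details, for example by injecting a default system prompt when none is given.

```python
from typing import Dict, List

# ChatML delimiters used by Qwen2.5 chat models
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Hypothetical helper: approximate the ChatML text produced for a message list."""
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # An open assistant turn tells the model to start its reply here
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are an expert startup and business advisor."},
    {"role": "user", "content": "How do I validate my SaaS idea without writing code?"},
]

prompt = render_chatml(messages)
print(prompt)
```

Printing the rendered prompt this way can help when debugging unexpected outputs, since it shows where the system and user turns end and where the assistant turn begins. For actual inference, prefer `tokenizer.apply_chat_template` so the prompt matches the template shipped with the base model.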