
🧪 FrankenMoE: Proof of Concept (NOT production)

This is a technical experiment, not a useful model.

⚠️ Important Warning

This repository documents a proof-of-concept MoE pipeline. The model quality is NOT good. It produces incoherent or random outputs because:

  1. The router is untrained
  2. Experts were fine-tuned with only ~5K samples each
  3. The base model is only Qwen2.5-1.5B-Instruct

Do NOT use this model for anything serious. It exists purely to demonstrate that the FrankenMoE pipeline can be built end-to-end.

What We Actually Built

A working MoE pipeline from dense LoRA experts → GGUF:

Qwen2.5-1.5B-Instruct (base)
  ├── Expert 0: Coding (LoRA fine-tuned)
  ├── Expert 1: Math (LoRA fine-tuned)
  └── Shared Expert: Base model
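
To make the layout above concrete, here is a minimal pure-Python sketch of how a layer with two routed experts and one always-on shared expert could combine their outputs. Everything in it is an assumption for illustration: the experts are stand-in scalar functions, the gate logits are made up, and the sigmoid gate on the shared expert follows the general Qwen MoE design rather than any code from this repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, routed_experts, shared_expert, gate_logits,
                shared_gate_logit, top_k=1):
    """Combine the top-k routed experts with the shared expert.

    Returns (output, indices of the routed experts that were used).
    """
    probs = softmax(gate_logits)
    # Pick the k routed experts with the highest router probability.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    routed_out = sum(probs[i] * routed_experts[i](x) for i in chosen)
    # The shared expert always runs; a sigmoid gate scales its contribution.
    shared_weight = 1.0 / (1.0 + math.exp(-shared_gate_logit))
    return routed_out + shared_weight * shared_expert(x), chosen


# Hypothetical stand-ins for the coding, math, and shared (base) experts.
coding_expert = lambda x: 2.0 * x
math_expert = lambda x: x + 10.0
base_expert = lambda x: x

output, used = moe_forward(
    1.0, [coding_expert, math_expert], base_expert,
    gate_logits=[2.0, 0.0], shared_gate_logit=0.0,
)
```

With an untrained router the `gate_logits` carry no information about the input, so the choice of routed expert is effectively arbitrary, which is consistent with the incoherent output described in the warning.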

Key Technical Discoveries

| Discovery | Detail |
|---|---|
| mergekit 0.1.4 bug | A parameter is incompatible with transformers >= 4.40; must be patched |
| QwenMoE requirements | Exactly 1 shared expert + 2^n routed experts (2, 4, 8) |
| Tied embeddings fix | Qwen2.5 uses tied embeddings, so the embedding weight must be cloned into a separate output weight before GGUF conversion |
| LoRA must be merged | Adapters must be merged into the base weights before MoE assembly |
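
Two of the rows above can be sketched in a few lines of plain Python. The first function checks the expert-count rule (exactly one shared expert and a power-of-two number of routed experts). The second shows the standard LoRA merge arithmetic, W' = W + (alpha / r) · B · A, on nested lists. Both function names are hypothetical helpers written for this illustration; they are not part of mergekit, transformers, or this repository.

```python
def check_moe_layout(num_shared: int, num_routed: int) -> list:
    """Return a list of problems; an empty list means the layout is acceptable."""
    problems = []
    if num_shared != 1:
        problems.append(f"expected exactly 1 shared expert, got {num_shared}")
    # A power of two has exactly one bit set, so n & (n - 1) == 0.
    if num_routed < 2 or (num_routed & (num_routed - 1)) != 0:
        problems.append(f"routed experts must be 2, 4, 8, ...; got {num_routed}")
    return problems


def merge_lora(weight, lora_a, lora_b, alpha):
    """Fold a LoRA adapter into a dense weight matrix.

    weight: out_features x in_features
    lora_a: rank x in_features
    lora_b: out_features x rank
    """
    rank = len(lora_a)
    scale = alpha / rank
    merged = [row[:] for row in weight]  # copy so the input is not mutated
    for i in range(len(weight)):
        for j in range(len(weight[0])):
            delta = sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            merged[i][j] += scale * delta
    return merged


# The layout described in this README: 1 shared expert, 2 routed experts.
layout_problems = check_moe_layout(num_shared=1, num_routed=2)
```

Merging first matters because MoE assembly copies plain dense weights into expert slots; an unmerged adapter would simply be left behind.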

Repository Structure

📦 frankenmoe_moe_v2-F16.gguf      - MoE GGUF (fixed, has output.weight)
📁 moe_full/                       - Full safetensors model
📁 coding/ math/ chat/             - Individual dense experts (LoRA + GGUF)
📄 FrankenMoE_Academic_Paper.pdf   - Research paper
🐍 simple_router.py                - Keyword-based router (functional alternative)

Quick Test

wget https://huggingface.co/hotdogs/frankenmoe/resolve/main/frankenmoe_moe_v2-F16.gguf
llama-cli -m frankenmoe_moe_v2-F16.gguf -p "Write a Python function"
# Output: random/incoherent. This is expected! See the warning above.

Future: Real Model

The pipeline will be re-run with:

  • Larger base model (Qwen2.5-7B/14B)
  • Trained router (classification loss)
  • More training data per domain
  • 4 experts for proper 2^n routing

Stay tuned. The real model is coming.


Built by UKA 🇹🇭 | May 2026

GGUF | Model size: 4B params | Architecture: qwen2moe