Heritage Temple Damage Assessment – Mixture-of-Experts (MoE)
Model Description
This is a Mixture-of-Experts (MoE) ensemble for automatically assessing structural damage in heritage temple images. It combines four pre‑trained expert models:
- ResNet50 – texture‑sensitive, good for fine cracks and surface damage.
- EfficientNet‑B4 – balanced accuracy/speed, robust to varying image quality.
- ViT‑Base (patch16_224) – captures global context and structural deformations.
- YOLO fallback CNN – a lightweight custom CNN that acts as a robust fallback for heavily corrupted or low‑resolution images.
A learned gating network dynamically weights the experts’ contributions per image. The final output is one of three damage classes:
| Class | Criticality Grade |
|---|---|
| Undamaged | STABLE |
| Partial Damage | MINOR |
| Damaged | CRITICAL |
The model also outputs per‑expert predictions, gate weights, and a continuous confidence score. A fallback chain (gate → uniform ensemble → mock) guarantees robustness in production.
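The gate-weighted combination and the first two stages of the fallback chain can be sketched roughly as follows. This is an illustrative sketch, not the code shipped with the model: the function names, the softmax over gate logits, and the use of the top mixed probability as the confidence score are all assumptions, and the final "mock" stage is omitted because the card does not describe it.

```python
import math

CLASSES = ["Undamaged", "Partial Damage", "Damaged"]
CRITICALITY = {"Undamaged": "STABLE", "Partial Damage": "MINOR", "Damaged": "CRITICAL"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine(expert_probs, gate_logits=None):
    """Mix per-expert class probabilities (hypothetical helper).

    expert_probs: one list of class probabilities per expert.
    gate_logits:  one raw gate score per expert, or None if the gate failed.
    """
    n_experts = len(expert_probs)
    weights, source = None, "gate"
    if gate_logits is not None and len(gate_logits) == n_experts:
        candidate = softmax(gate_logits)
        if all(math.isfinite(w) for w in candidate):
            weights = candidate
    if weights is None:
        # Fallback stage 2: uniform ensemble when the gate output is unusable.
        weights, source = [1.0 / n_experts] * n_experts, "uniform"

    mixed = [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs))
        for c in range(len(CLASSES))
    ]
    best = max(range(len(CLASSES)), key=lambda c: mixed[c])
    label = CLASSES[best]
    return {
        "predicted_class": label,
        "criticality": CRITICALITY[label],
        "confidence": mixed[best],  # assumption: top mixed probability
        "gate_weights": weights,
        "source": source,
    }
```

Calling `combine(expert_probs, None)` exercises the uniform fallback; passing valid gate logits exercises the learned weighting.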
Intended Uses & Limitations
Intended use: Automated preliminary damage screening for heritage site managers, conservation architects, and NGOs. The model is designed for visible-spectrum images: drone and phone captures as well as archival photographs.
Limitations:
- The training set is moderately imbalanced (fewer “Damaged” samples). Performance on rare damage types (e.g., severe spalling) may be lower.
- The model was trained on a combination of publicly available damage datasets (concrete cracks, disaster infrastructure, surface cracks). It may not generalise equally to all temple architectures (e.g., brick vs. stone).
- Very low‑resolution (< 224×224) or heavily compressed images degrade accuracy.
- The model outputs only discrete classes. The confidence score reflects prediction certainty, not damage severity; a continuous severity score is left for future work.
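Given the low-resolution limitation above, a caller may want to flag undersized images before inference. The helper below is a hypothetical pre-check, not part of the model; it assumes "below 224×224" means that either side is shorter than 224 pixels.

```python
MIN_SIDE = 224  # the training resolution stated in this card


def resolution_warning(width, height, min_side=MIN_SIDE):
    """Return a warning string for undersized images, else None (hypothetical helper)."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    if min(width, height) < min_side:
        return f"{width}x{height} is below {min_side}x{min_side}; expect degraded accuracy"
    return None
```

With Pillow, the dimensions would come from `image.size`, which is a `(width, height)` tuple.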
Training Data
The model was fine‑tuned on a curated dataset of ~4,800 training images aggregated from:
- Concrete crack images (classification)
- Surface crack detection
- Disaster infrastructure damage (CDD)
- Building damage assessment datasets
- QuakeSet (limited, due to access restrictions)
Images were resized to 224×224, augmented (random crop, flip, rotate, colour jitter, coarse dropout), and split 70/15/15 for training/validation/test. Class‑weighted sampling and focal loss were used to handle imbalance.
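For readers unfamiliar with the imbalance handling mentioned above, here is a minimal pure-Python sketch of focal loss and of inverse-frequency weights that a class-weighted sampler could use. The `gamma` and `alpha` defaults and the exact weighting formula are assumptions; the card does not state the values used in training.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class weight n / (k * count): rarer classes get larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)^gamma * log(p_t).

    probs:  predicted class probabilities for the sample.
    target: index of the true class.
    gamma=2.0 is an assumed default, not a value taken from this card.
    """
    p_t = min(max(probs[target], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` the expression reduces to ordinary cross-entropy; larger values shrink the loss on well-classified samples so that training focuses on hard ones.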
Training Procedure
All experts were initialised with ImageNet‑1k weights and fine‑tuned for 25 epochs: 5 with the backbone frozen, then 20 with it unfrozen. The gating network was then trained for 15 epochs on top of the frozen experts, using cross‑entropy plus a load‑balancing loss weighted by 0.01. Gradient accumulation (effective batch size 64), EMA, and mixup were applied. Training ran on a single Tesla T4 GPU (Kaggle).
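The gate objective described above (cross-entropy plus 0.01 times a load-balancing term) might look like the sketch below. The card does not specify the form of the load-balancing loss; the squared coefficient of variation of per-expert importance is used here purely as an assumption, and all function names are hypothetical.

```python
import math


def cross_entropy(probs, target):
    """Negative log-probability of the true class for one sample."""
    return -math.log(max(probs[target], 1e-12))


def load_balancing_loss(gate_weights_batch):
    """Squared coefficient of variation of per-expert importance.

    Importance of an expert is the sum of its gate weights over the batch.
    The value is near 0 when experts are used evenly and grows as the gate
    collapses onto a few experts. (Assumed form, not confirmed by the card.)
    """
    n_experts = len(gate_weights_batch[0])
    importance = [sum(row[e] for row in gate_weights_batch) for e in range(n_experts)]
    mean = sum(importance) / n_experts
    variance = sum((v - mean) ** 2 for v in importance) / n_experts
    return variance / (mean ** 2 + 1e-10)


def gate_loss(mixed_probs_batch, targets, gate_weights_batch, lb_coef=0.01):
    """Mean cross-entropy of the mixed prediction plus the weighted balance term."""
    ce = sum(cross_entropy(p, t) for p, t in zip(mixed_probs_batch, targets)) / len(targets)
    return ce + lb_coef * load_balancing_loss(gate_weights_batch)
```

The small coefficient keeps the balance term from dominating classification accuracy while still discouraging the gate from routing everything to one expert.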
Evaluation Results
On the held‑out test set (1,028 images):
| Metric | Value |
|---|---|
| Accuracy | 0.9850 |
| Weighted F1 | 0.9853 |
| Per‑class F1 (Undamaged) | 0.99 |
| Per‑class F1 (Partial) | 1.00 |
| Per‑class F1 (Damaged) | 0.95 |
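For reference, per-class and weighted F1 can be derived from a confusion matrix as sketched below. This is a generic illustration of the metric definitions, assuming weighted F1 means the support-weighted mean of per-class F1; it is not the evaluation script behind the table, and no confusion matrix for this model is reproduced here.

```python
def per_class_f1(confusion):
    """F1 per class, where confusion[i][j] counts true class i predicted as j."""
    k = len(confusion)
    scores = []
    for c in range(k):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(k)) - tp
        fn = sum(confusion[c]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def weighted_f1(confusion):
    """Average of per-class F1 weighted by each class's number of true samples."""
    supports = [sum(row) for row in confusion]
    total = sum(supports)
    return sum(f * s for f, s in zip(per_class_f1(confusion), supports)) / total
```

Because the average is support-weighted, a lower score on a small class such as "Damaged" moves the weighted F1 less than the same drop on a large class would.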
Expert‑only performance (test F1):
- ResNet50: 0.9467
- EfficientNet‑B4: 0.9641
- ViT‑B16: 0.9792
- YOLO fallback: 0.6278
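The MoE ensemble (weighted F1 0.9853) outperforms every individual expert, including the strongest one, ViT‑B16 (0.9792), which suggests that adaptive weighting adds a modest gain over the best single model.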
How to Use
The model is hosted on Hugging Face Hub and requires trust_remote_code=True because it includes a custom MoE architecture.
from transformers import AutoModelForImageClassification
from PIL import Image
import requests
# Load model from Hub
model = AutoModelForImageClassification.from_pretrained(
    "monarch8661/moe",
    trust_remote_code=True,
)
# Load and preprocess an image
url = "https://example.com/temple_damage.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")
# Run inference (returns a dict with all details)
outputs = model(image)
print(outputs["predicted_class"]) # e.g., "Partial Damage"
print(outputs["criticality"]) # "MINOR"
print(outputs["confidence"]) # 0.92
print(outputs["gate_weights"]) # [0.21, 0.45, 0.30, 0.04]
print(outputs["per_expert"]) # list of expert predictions
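The returned dict can be post-processed in plain Python. The sketch below builds a one-line summary and names the expert with the largest gate weight. It assumes the output keys shown above and that `gate_weights` follows the order in which the experts are listed in the model description; that ordering, the helper name, and the review threshold are assumptions rather than documented behaviour.

```python
# Assumed order: the same as the expert list in the model description.
EXPERT_NAMES = ["ResNet50", "EfficientNet-B4", "ViT-Base", "YOLO fallback CNN"]


def summarize(outputs, review_below=0.80):
    """Return a one-line summary of a prediction dict (hypothetical helper).

    review_below is an illustrative threshold for flagging uncertain
    predictions for manual review; it is not a value from this card.
    """
    weights = outputs["gate_weights"]
    top = max(range(len(weights)), key=lambda i: weights[i])
    flag = " [manual review]" if outputs["confidence"] < review_below else ""
    return (
        f"{outputs['predicted_class']} ({outputs['criticality']}), "
        f"confidence {outputs['confidence']:.2f}, "
        f"dominant expert: {EXPERT_NAMES[top]}{flag}"
    )
```

For the example values in the comments above, the dominant expert under the assumed ordering would be EfficientNet-B4 (gate weight 0.45).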