AerialEye YOLOv11-Nano (Aerial & Disaster Response with SAHI Slicing)

Download Model Weights

Quick Start: How to Use this Model

You can download the model weights directly or load them programmatically in Python.

🛜 Direct Download Links

You can also download them from a terminal using wget (curl -L -O works with the same URLs):

# Download PyTorch weights
wget https://huggingface.co/kilanisainikhil/AerialEye/resolve/main/aerialEye.pt

# Download ONNX export
wget https://huggingface.co/kilanisainikhil/AerialEye/resolve/main/aerialEye.onnx

🐍 Load programmatically in Python (huggingface_hub)

Install dependencies:

pip install huggingface_hub ultralytics

Load and run inference in your Python script:

from huggingface_hub import hf_hub_download
from ultralytics import YOLO

# 1. Download the weights automatically from Hugging Face Hub
model_path = hf_hub_download(repo_id="kilanisainikhil/AerialEye", filename="aerialEye.pt")

# 2. Load the model using Ultralytics YOLO
model = YOLO(model_path)

# 3. Run inference on an image
results = model("sample_image.jpg")
results[0].show()

Model Details

  • Model Name: AerialEye (Fine-tuned SUTRA YOLOv11-Nano)
  • Architecture: YOLOv11-N (Nano) + SAHI (Slicing Aided Hyper Inference)
  • Task: Object Detection
  • Domain: High-altitude aerial and drone imagery
  • Deployment Target: Edge hardware (Google Coral Edge TPU, low-power drones) via INT8 Quantization.
  • Previous Architecture: YOLOv8n (Upgraded to YOLOv11-Nano for better performance, faster processing, and higher accuracy)
  • Slicing Strategy: SAHI (Slicing Aided Hyper Inference) integration to detect small objects (humans, vehicles) from drone altitudes.

Intended Use

The AerialEye model is designed to detect critical objects from an aerial perspective to assist in emergency response, infrastructure assessment, and disaster management.

It can rapidly identify 6 specific classes (class IDs 0 to 5):

  0. human (Search and Rescue)
  1. sos (Distress Signals)
  2. vehicle (Traffic / Evacuation)
  3. flood (Water Level Assessment)
  4. road_damage (Infrastructure Integrity)
  5. crack (Structural Integrity)
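As an illustration of how these class IDs might be consumed downstream, the sketch below tallies detections per class. The `CLASS_NAMES` dictionary simply mirrors the list above and `count_by_class` is a hypothetical helper, not part of this repository.

```python
from collections import Counter

# Class ids and names as listed in this model card.
CLASS_NAMES = {
    0: "human",
    1: "sos",
    2: "vehicle",
    3: "flood",
    4: "road_damage",
    5: "crack",
}


def count_by_class(class_ids):
    """Return {class_name: count} for an iterable of predicted class ids."""
    # int() lets the helper accept float ids such as 0.0 as well as ints.
    counts = Counter(CLASS_NAMES[int(c)] for c in class_ids)
    return dict(counts)


print(count_by_class([0, 0, 2, 5]))
```

With Ultralytics, the predicted IDs can typically be read from `results[0].boxes.cls.tolist()`, and `model.names` holds the mapping stored in the weights, which should take precedence if it differs from the list above.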

Dataset & Training

The model was trained on a curated, unified dataset of 6,327 images, which consists of:

  • High-altitude drone frames from VisDrone (vehicles and humans).
  • Custom-generated synthetic data covering the disaster-specific classes (SOS, Damage, Crack, Flood).
  • Curated FloodNet images serving as negative background samples to reduce false positives over flooded terrain.

Training Augmentations:

  • Simulated drone pitch/yaw (15-degree rotation).
  • Vertical & Horizontal flips to account for aerial orientation invariance.
  • Mosaic augmentations.
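The augmentations above map naturally onto Ultralytics training arguments. The following is a minimal sketch, not the author's training script: only the 15-degree rotation comes from this card, while the flip and mosaic probabilities, `data.yaml`, the `yolo11n.pt` base weights, the epoch count, and the image size are all assumptions for illustration.

```python
# Augmentation settings expressed as Ultralytics training arguments.
# Only the 15-degree rotation is stated in this card; the probabilities are assumed.
AUGMENTATION_ARGS = {
    "degrees": 15.0,  # random rotation within +/- 15 degrees
    "flipud": 0.5,    # vertical flip probability (assumed)
    "fliplr": 0.5,    # horizontal flip probability (assumed)
    "mosaic": 1.0,    # mosaic probability (assumed)
}


def train(data_yaml="data.yaml", base_weights="yolo11n.pt", epochs=100):
    """Fine-tune with the augmentations above (all defaults are hypothetical)."""
    # Imported lazily so the settings can be inspected without Ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(base_weights)
    return model.train(data=data_yaml, epochs=epochs, imgsz=640, **AUGMENTATION_ARGS)
```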

Performance & Optimization

This model serves as the Phase 1 Low-Res Pass in the perception pipeline. For inference on very small objects, the pipeline integrates SAHI (Slicing Aided Hyper Inference) to dynamically slice suspect regions into higher-resolution patches ($640 \times 640$).
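To make the slicing step concrete, here is a small sketch of how a frame can be divided into overlapping 640 x 640 windows. It illustrates the general idea only and is not taken from SAHI or from this repository; `slice_boxes` and the 0.25 default overlap are assumptions.

```python
def slice_boxes(width, height, slice_size=640, overlap=0.25):
    """Return (x0, y0, x1, y1) windows that cover a width x height frame."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    step = int(slice_size * (1 - overlap))  # stride between window origins
    boxes = []
    y = 0
    while True:
        # Shift the last row back so it ends exactly at the bottom edge.
        y0 = min(y, max(height - slice_size, 0))
        x = 0
        while True:
            # Shift the last column back so it ends exactly at the right edge.
            x0 = min(x, max(width - slice_size, 0))
            boxes.append((x0, y0, min(x0 + slice_size, width), min(y0 + slice_size, height)))
            if x0 + slice_size >= width:
                break
            x += step
        if y0 + slice_size >= height:
            break
        y += step
    return boxes


windows = slice_boxes(1920, 1080)
print(len(windows), windows[0], windows[-1])
```

Each window is then run through the detector at native resolution, and the resulting boxes are shifted by the window origin `(x0, y0)` to return them to full-frame coordinates.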

Export Format

The weights are exported in the following formats:

  • PyTorch (.pt): For standard inference.
  • ONNX (.onnx): For cross-platform deployment.
  • INT8 Quantization (TFLite): To maximize frames-per-second (FPS) on the Google Coral Edge TPU.
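The formats above can be produced with the Ultralytics `export` method. The sketch below is a hedged example rather than the exact commands used for this model: `export_args` and `export` are hypothetical helpers, and `data.yaml` stands in for whatever calibration dataset the INT8 export would need.

```python
def export_args(target):
    """Return keyword arguments for Ultralytics `model.export` for a given target."""
    if target == "onnx":
        return {"format": "onnx"}
    if target == "tflite_int8":
        # INT8 quantization needs calibration images; "data.yaml" is a hypothetical path.
        return {"format": "tflite", "int8": True, "data": "data.yaml"}
    if target == "edgetpu":
        # Edge TPU export depends on the Edge TPU compiler being available on the host.
        return {"format": "edgetpu"}
    raise ValueError(f"unknown export target: {target}")


def export(weights="aerialEye.pt", target="onnx"):
    """Export the given weights; returns whatever `model.export` returns."""
    # Imported lazily so the argument table can be inspected without Ultralytics.
    from ultralytics import YOLO

    return YOLO(weights).export(**export_args(target))
```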

Evaluation & Simulation

A side-by-side simulation comparing standard full-frame downscaling ($640 \times 640$) and Slicing Aided Hyper Inference (SAHI) was executed on validation frames.

| Metric | Standard (Downscaled) | SAHI (Sliced Window) | Delta / Change |
|---|---|---|---|
| Objects Detected | 65 | 50 | -15 (-23.1%) |
| Inference Latency | 733.0 ms | 293.6 ms | -439.3 ms |
| Resolution Processing | 640x640 (Downscaled) | Multi-Tile Slicing (Full Scale) | SAHI preserves pixel density |
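The delta column can be reproduced from the reported figures. This short calculation uses only the numbers in the comparison above; recomputing from the rounded latencies gives -439.4 ms, so the reported -439.3 ms presumably reflects unrounded timings.

```python
# Figures copied from the comparison above.
standard_objects, sahi_objects = 65, 50
standard_ms, sahi_ms = 733.0, 293.6

object_delta = sahi_objects - standard_objects          # change in detection count
object_pct = 100.0 * object_delta / standard_objects    # relative change in percent
latency_delta_ms = sahi_ms - standard_ms                # change in latency
speedup = standard_ms / sahi_ms                         # latency ratio

print(f"objects: {object_delta} ({object_pct:.1f}%)")
print(f"latency: {latency_delta_ms:.1f} ms ({speedup:.2f}x)")
```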

Visual Comparison Map

Figure: SUTRA Standard vs SAHI Comparison. Standard inference (left, blue) vs. SAHI sliced inference (right, green).

Tactical Impact

In this comparison, SAHI reported 15 fewer objects (23.1%) than standard inference. The reduction is attributed to the NMS step that merges detections from overlapping slices, which removes the duplicate detections and double-counts that standard downscaled inference commonly produces over complex aerial grids.
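The merging idea can be sketched as plain greedy non-maximum suppression over boxes already mapped to full-frame coordinates. This is a simplified, class-agnostic illustration; SAHI's own postprocessing may use a different matching rule, and `iou` and `nms` here are hypothetical helpers.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes kept after greedy suppression, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap strongly with a higher-scoring kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# The same vehicle seen in two overlapping slices, plus one distinct object.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))
```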

Getting Started & Usage

1. Installation

Clone this repository and install the dependencies in a virtual environment:

# Clone the repository
git clone https://huggingface.co/kilanisainikhil/AerialEye
cd AerialEye

# Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate

# Install required packages
pip install -r requirements.txt

2. Model Downloads

Since the model weights are stored via Git LFS on Hugging Face, cloning the repository without Git LFS will only download small pointer files. You can retrieve the full model weights using either of the following options:

Option A: Python Downloader (Recommended)

We provide a lightweight Python downloader script download_model.py which downloads the actual weights and samples directly from Hugging Face resolve servers:

# Download default model weights (aerialEye.pt, best.pt) and comparison graphics:
python download_model.py

# Download ALL assets (ONNX, TFLite models, and all sample images):
python download_model.py --all

# Download specific files:
python download_model.py --files aerialEye.onnx best.onnx

Option B: Shell Script Downloader

Alternatively, you can run the provided bash script to fetch the weights using wget:

chmod +x download_weights.sh
./download_weights.sh
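To check whether a cloned file is still a Git LFS pointer rather than the real weights, one option is to inspect its first bytes, since pointer files begin with the Git LFS spec line. `is_lfs_pointer` below is a hypothetical helper, not a script shipped with this repository.

```python
# Git LFS pointer files are small text files that start with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` looks like a Git LFS pointer file."""
    with open(path, "rb") as f:
        return f.read(len(LFS_HEADER)) == LFS_HEADER
```

For example, `is_lfs_pointer("aerialEye.pt")` returning True would suggest the weights still need to be fetched with one of the options above.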

3. Running ONNX Export

To export the PyTorch model to ONNX yourself:

python -c "from ultralytics import YOLO; model = YOLO('aerialEye.pt'); model.export(format='onnx')"

4. Running Simulation & Inference

Compare standard full-frame YOLO inference against Slicing Aided Hyper Inference (SAHI) using the simulation script:

# Run the simulation on the default sample image:
python simulate_sahi.py --image sample_aerial_street.jpg --model aerialEye.pt

# Run the simulation on other sample images:
python simulate_sahi.py --image sample_drone_roundabout.jpg

# Run the simulation with custom slicing parameters:
python simulate_sahi.py --image sample_aerial_street.jpg --slice-size 640 --overlap 0.25

This script generates side-by-side visualization maps:

  • result_standard.jpg (standard full-frame inference)
  • result_sahi.jpg (SAHI slicing inference with overlapping window merging)

5. Running the Gradio Web Interface

To launch the interactive web interface:

python app.py