---
library_name: stable-baselines3
tags:
  - reinforcement-learning
  - uav-navigation
  - landuse
  - dqn
  - gymnasium
license: mit
---

# HASP: Heuristic-Augmented Static Policy (Phase I)

## About this Model

This model is a supplementary artifact for the CTU FIT master's thesis
**"Information-Driven Visual Navigation for UAVs using Deep Reinforcement Learning"** (Vojtěch Bešťák, 2026).

It was trained with
[information-driven-uav-navigation](https://gitlab.ciirc.cvut.cz/bestavoj/information-driven-uav-navigation),
a research framework for visual UAV navigation using RL/IL over aerial orthophoto
and land-use grid environments.

| Key | Value |
|-----|-------|
| **Algorithm** | DQN |
| **Environment** | `landuse_grid` |
| **W&B group** | `exp1.3-dqn_2` |
| **Run ID** | `1nt3s4ta` |
| **Reward weights** | dist=0.862, view=2.201, time=-0.100, success=57.4 |
| **Max steps / episode** | 300 |
| **Frame stack** | 4 |
| **Seed** | 0 |
| **Python (training)** | 3.11.14 |
| **stable-baselines3** | 2.2.1 |
| **PyTorch** | 2.10.0 |
| **Gymnasium** | 0.29.1 |
| **Arena size** | 60×60 cells |
| **Camera size** | 11×11 cells |

## Quantitative Performance

The results below are means over 500 deterministic validation episodes, evaluated against a privileged Dijkstra oracle.

| Metric | Value |
|--------|-------|
| Success Rate | 99.7% |
| Optimality Score | 0.820 |
| Feature Capture | 0.821 |

## Observation and Action Spaces

| Key | Shape | dtype | Notes |
|-----|-------|-------|-------|
| `camera` | `(4, 11, 11)` | float32 | Land-use edge-map crop |
| `visited_mask` | `(4, 11, 11)` | float32 | Crop of the cells visited during the current episode |
| `goal_info` | `(24,)` | float32 | Telemetry and navigation state |
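
Before calling `model.predict`, it can help to check that an observation matches the table above. The following is a minimal sketch, not part of this repo: `check_observation` and `EXPECTED_SHAPES` are hypothetical names, the check is duck-typed on a `.shape` attribute (so NumPy arrays work), and it does not verify the dtype.

```python
# Expected shapes, copied from the observation table above.
EXPECTED_SHAPES = {
    "camera": (4, 11, 11),
    "visited_mask": (4, 11, 11),
    "goal_info": (24,),
}


def check_observation(obs):
    """Return a list of human-readable problems; an empty list means the shapes match.

    Hypothetical helper: works with any values exposing `.shape` (e.g. NumPy arrays).
    """
    problems = []
    for key, expected in EXPECTED_SHAPES.items():
        if key not in obs:
            problems.append(f"missing key '{key}'")
            continue
        # Values without a `.shape` attribute are reported as shape ().
        actual = tuple(getattr(obs[key], "shape", ()))
        if actual != expected:
            problems.append(f"'{key}' has shape {actual}, expected {expected}")
    return problems
```

An empty return value means the dictionary has all three keys with the documented shapes.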

**Action space**: `Discrete(4)` — `0=right, 1=down, 2=left, 3=up`
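
To make predicted actions easier to read in logs, the indices can be mapped back to direction names. This is a small sketch under stated assumptions: the names come from the mapping above, while the `(d_row, d_col)` deltas assume row 0 is at the top of the grid, which this card does not confirm. `describe_action` and `apply_action` are hypothetical helpers.

```python
# Index-to-name mapping, taken from the action space description above.
ACTION_NAMES = {0: "right", 1: "down", 2: "left", 3: "up"}

# ASSUMPTION: (d_row, d_col) deltas with row 0 at the top of the grid.
# The environment's actual coordinate convention may differ.
ACTION_DELTAS = {0: (0, 1), 1: (1, 0), 2: (0, -1), 3: (-1, 0)}


def describe_action(action):
    """Return the direction name for an action index (int or 0-d array)."""
    return ACTION_NAMES[int(action)]


def apply_action(row, col, action):
    """Return the cell reached from (row, col) under the assumed deltas."""
    d_row, d_col = ACTION_DELTAS[int(action)]
    return row + d_row, col + d_col
```

For example, `describe_action(action)` can be applied directly to the value returned by `model.predict`.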

## Standalone Inference (no information-driven-uav-navigation package required)

Install the minimal deps (versions match the training environment):

```bash
pip install "stable-baselines3==2.2.1" "torch==2.10.0" "gymnasium==0.29.1" huggingface-hub
```

```python
import importlib.util
import sys
import types

import numpy as np
from huggingface_hub import hf_hub_download
from stable_baselines3 import DQN

REPO_ID = "bestak/uav-navigation-hasp"

# 1. Download and load the feature extractor (pure PyTorch, no repo import needed)
fe_path = hf_hub_download(REPO_ID, "feature_extractor.py")
spec = importlib.util.spec_from_file_location("_fe", fe_path)
mod = importlib.util.module_from_spec(spec)
sys.modules["_fe"] = mod
spec.loader.exec_module(mod)

# 1b. Stub out drone_navigation so cloudpickle can resolve all saved class references
#     (lr_schedule, policy_kwargs, etc.) without the package being installed.
for name in (
    "drone_navigation",
    "drone_navigation.models",
    "drone_navigation.models.feature_extractor_aerial",
    "drone_navigation.models.feature_extractor_landuse",
):
    sys.modules.setdefault(name, types.ModuleType(name))
sys.modules["drone_navigation.models.feature_extractor_landuse"].LanduseFeaturesExtractor = (
    mod.LanduseFeaturesExtractor
)

# 2. Load the model -- inject the extractor class so cloudpickle can resolve it
model = DQN.load(
    hf_hub_download(REPO_ID, "best_model.zip"),
    custom_objects={
        "features_extractor_class": mod.LanduseFeaturesExtractor,
    },
    device="cpu",
)

# 3. Run a single forward pass with a dummy observation
obs = {
    "camera":       np.zeros((4, 11, 11), dtype=np.float32),
    "visited_mask": np.zeros((4, 11, 11), dtype=np.float32),
    "goal_info":    np.zeros(24, dtype=np.float32),
}
action, _ = model.predict(obs, deterministic=True)
print("Action:", action)
```

> **Note**: The `sys.modules` stubs from step 1b, together with the `custom_objects` entry, let the
> pickled class references resolve without the `information-driven-uav-navigation` package
> (imported as `drone_navigation`). See `inference.py` in this repo for the full example
> including environment rollouts.

## Full Inference with the information-driven-uav-navigation Package

```bash
pip install git+https://gitlab.ciirc.cvut.cz/bestavoj/information-driven-uav-navigation.git
# also requires map data -- see the repo README for data preparation
```

```python
from huggingface_hub import hf_hub_download
from stable_baselines3 import DQN
from drone_navigation.config.experiment_config import ExperimentConfig
from drone_navigation.envs.factory import create_env

REPO_ID = "bestak/uav-navigation-hasp"

# When drone_navigation is installed, the extractor class resolves automatically
model = DQN.load(hf_hub_download(REPO_ID, "best_model.zip"), device="cpu")

cfg = ExperimentConfig.from_json(hf_hub_download(REPO_ID, "config.json"))
cfg.n_envs = 1
env = create_env(cfg)

obs, _ = env.reset()
for _ in range(cfg.max_steps):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        print("Episode done. Target reached:", info.get("is_target_reached"))
        break

env.close()
```

See `inference.py` in this repo for a more complete example with multi-episode evaluation.
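
For orientation, a multi-episode evaluation loop could look roughly like the sketch below. It is not the code from `inference.py`: `evaluate` is a hypothetical helper that assumes a Gymnasium-style environment (`reset()` returning `(obs, info)`, `step()` returning a 5-tuple) and reads the `is_target_reached` key used in the rollout above.

```python
def evaluate(env, predict, n_episodes=10, max_steps=300):
    """Run `n_episodes` rollouts and return the fraction that reached the target.

    Hypothetical helper. `predict` maps an observation to an action.
    """
    successes = 0
    for _ in range(n_episodes):
        obs, _reset_info = env.reset()
        info = {}
        for _step in range(max_steps):
            action = predict(obs)
            obs, _reward, terminated, truncated, info = env.step(action)
            if terminated or truncated:
                break
        # Episodes that hit max_steps without success count as failures.
        successes += bool(info.get("is_target_reached", False))
    return successes / n_episodes
```

With the model and environment from the example above, this could be called as `evaluate(env, lambda o: model.predict(o, deterministic=True)[0], n_episodes=500, max_steps=cfg.max_steps)`.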

## Training

```bash
git clone https://gitlab.ciirc.cvut.cz/bestavoj/information-driven-uav-navigation.git
cd information-driven-uav-navigation
uv sync
drone-train-rl --env_type landuse_grid ...
```

## Citation

If you use this model, please cite the original repository:

```bibtex
@misc{information-driven-uav-navigation,
  author = {Bestak, Vojtech},
  title  = {Information-Driven Visual Navigation for UAVs using Deep Reinforcement Learning},
  year   = {2026},
  url    = {https://gitlab.ciirc.cvut.cz/bestavoj/information-driven-uav-navigation}
}
```