---
library_name: stable-baselines3
tags:
- reinforcement-learning
- uav-navigation
- landuse
- dqn
- gymnasium
license: mit
---

# HASP: Heuristic-Augmented Static Policy (Phase I)

## About this Model

This model is a supplementary artifact for the CTU FIT master's thesis **"Information-Driven Visual Navigation for UAVs using Deep Reinforcement Learning"** (Vojtěch Bešťák, 2026). It is trained with [information-driven-uav-navigation](https://gitlab.ciirc.cvut.cz/bestavoj/information-driven-uav-navigation), a research framework for visual UAV navigation using RL/IL over aerial orthophoto and land-use grid environments.

| Key | Value |
|-----|-------|
| **Algorithm** | DQN |
| **Environment** | `landuse_grid` |
| **W&B group** | `exp1.3-dqn_2` |
| **Run ID** | `1nt3s4ta` |
| **Reward weights** | dist=0.862, view=2.201, time=-0.100, success=57.4 |
| **Max steps / episode** | 300 |
| **Frame stack** | 4 |
| **Seed** | 0 |
| **Python (training)** | 3.11.14 |
| **stable-baselines3** | 2.2.1 |
| **PyTorch** | 2.10.0 |
| **Gymnasium** | 0.29.1 |
| **Arena size** | 60×60 cells |
| **Camera size** | 11×11 cells |

## Quantitative Performance

The following results represent the mean performance over 500 deterministic validation episodes against a privileged Dijkstra Oracle.

| Metric | Value |
|--------|-------|
| Success Rate | 99.7% |
| Optimality Score | 0.820 |
| Feature Capture | 0.821 |

## Observation and Action Spaces

| Key | Shape | dtype | Notes |
|-----|-------|-------|-------|
| `camera` | `(4, 11, 11)` | float32 | Land-use edge-map crop |
| `visited_mask` | `(4, 11, 11)` | float32 | Crop of the cells visited in the current episode |
| `goal_info` | `(24,)` | float32 | Telemetry and navigation state |

**Action space**: `Discrete(4)`, with `0=right, 1=down, 2=left, 3=up`

## Standalone Inference (no information-driven-uav-navigation package required)

Install the minimal dependencies (versions match the training environment):

```bash
pip install "stable-baselines3==2.2.1" "torch==2.10.0" "gymnasium==0.29.1" huggingface-hub
```

```python
import importlib.util
import sys
import types

import numpy as np
from huggingface_hub import hf_hub_download
from stable_baselines3 import DQN

REPO_ID = "bestak/uav-navigation-hasp"

# 1. Download and load the feature extractor (pure PyTorch, no repo import needed)
fe_path = hf_hub_download(REPO_ID, "feature_extractor.py")
spec = importlib.util.spec_from_file_location("_fe", fe_path)
mod = importlib.util.module_from_spec(spec)
sys.modules["_fe"] = mod
spec.loader.exec_module(mod)

# 1b. Stub out drone_navigation so cloudpickle can resolve ALL saved class references
#     (lr_schedule, policy_kwargs, etc.) without the package being installed.
for name in [
    "drone_navigation",
    "drone_navigation.models",
    "drone_navigation.models.feature_extractor_aerial",
    "drone_navigation.models.feature_extractor_landuse",
]:
    sys.modules.setdefault(name, types.ModuleType(name))
sys.modules["drone_navigation.models.feature_extractor_landuse"].LanduseFeaturesExtractor = mod.LanduseFeaturesExtractor

# 2. Load the model -- inject the extractor class so cloudpickle can resolve it
model = DQN.load(
    hf_hub_download(REPO_ID, "best_model.zip"),
    custom_objects={
        "features_extractor_class": mod.LanduseFeaturesExtractor,
    },
    device="cpu",
)

# 3. Run a single forward pass with a dummy observation
obs = {
    "camera": np.zeros((4, 11, 11), dtype=np.float32),
    "visited_mask": np.zeros((4, 11, 11), dtype=np.float32),
    "goal_info": np.zeros(24, dtype=np.float32),
}
action, _ = model.predict(obs, deterministic=True)
print("Action:", action)
```

> **Note**: `custom_objects` overrides the cloudpickled class reference, which is why
> the `information-driven-uav-navigation` package is not required for loading. See `inference.py`
> in this repo for the full example including environment rollouts.

## Full Inference with the information-driven-uav-navigation Package

```bash
pip install git+https://gitlab.ciirc.cvut.cz/bestavoj/information-driven-uav-navigation.git
# also requires map data -- see the repo README for data preparation
```

```python
from huggingface_hub import hf_hub_download
from stable_baselines3 import DQN
from drone_navigation.config.experiment_config import ExperimentConfig
from drone_navigation.envs.factory import create_env

REPO_ID = "bestak/uav-navigation-hasp"

# When drone_navigation is installed, the extractor class resolves automatically
model = DQN.load(hf_hub_download(REPO_ID, "best_model.zip"), device="cpu")

cfg = ExperimentConfig.from_json(hf_hub_download(REPO_ID, "config.json"))
cfg.n_envs = 1
env = create_env(cfg)

# Roll out one episode with the greedy (deterministic) policy
obs, _ = env.reset()
for _ in range(cfg.max_steps):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        print("Episode done. Target reached:", info.get("is_target_reached"))
        break
env.close()
```

See `inference.py` in this repo for a more complete example with multi-episode evaluation.

## Training

```bash
git clone https://gitlab.ciirc.cvut.cz/bestavoj/information-driven-uav-navigation.git
cd information-driven-uav-navigation
uv sync
drone-train-rl --env_type landuse_grid ...
```

## Citation

If you use this model, please cite the original repository:

```bibtex
@misc{information-driven-uav-navigation,
  author = {Bestak, Vojtech},
  title  = {Information-Driven Visual Navigation for UAVs using Deep Reinforcement Learning},
  year   = {2026},
  url    = {https://gitlab.ciirc.cvut.cz/bestavoj/information-driven-uav-navigation}
}
```
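
## Decoding Actions (illustrative sketch)

When inspecting the output of `model.predict` outside the environment, it can help to turn the integer action from the `Discrete(4)` space into a grid move. The sketch below uses the action encoding documented under "Observation and Action Spaces" and the 60×60 arena size from the table in "About this Model". The helper names (`ACTION_DELTAS`, `apply_action`, `replay`), the `(row, col)` convention with row 0 at the top, and the clamping at the arena border are assumptions made for illustration; the actual `landuse_grid` environment may use a different coordinate convention or handle boundary moves differently.

```python
# Action ids as documented in this model card: 0=right, 1=down, 2=left, 3=up.
ACTION_NAMES = {0: "right", 1: "down", 2: "left", 3: "up"}

# ASSUMPTION: positions are (row, col) with row 0 at the top, so "down" is row + 1.
ACTION_DELTAS = {0: (0, 1), 1: (1, 0), 2: (0, -1), 3: (-1, 0)}

ARENA_SIZE = 60  # the arena is 60x60 cells according to this model card


def apply_action(pos, action, arena_size=ARENA_SIZE):
    """Return the (row, col) reached from `pos` after one move.

    ASSUMPTION: moves that would leave the arena are clamped to the border.
    """
    d_row, d_col = ACTION_DELTAS[int(action)]
    row = min(max(pos[0] + d_row, 0), arena_size - 1)
    col = min(max(pos[1] + d_col, 0), arena_size - 1)
    return (row, col)


def replay(start, actions, arena_size=ARENA_SIZE):
    """Return the list of positions visited when applying `actions` from `start`."""
    path = [start]
    for action in actions:
        path.append(apply_action(path[-1], action, arena_size))
    return path


if __name__ == "__main__":
    # Right, down, left, up traces a unit square back to the start.
    for step, pos in enumerate(replay((10, 10), [0, 1, 2, 3])):
        print(step, pos)
```

`int(action)` is used so that both plain integers and the zero-dimensional NumPy array returned by `model.predict` for a single observation are accepted.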