---
license: mit
tags:
- reinforcement-learning
- stable-baselines3
- mujoco
- ant-v4
- ppo
pipeline_tag: reinforcement-learning
library_name: stable-baselines3
model_name: PPO-Ant-v4
---

# PPO - Ant-v4 🌟

A Proximal Policy Optimization (PPO) agent trained with **stable-baselines3** on the MuJoCo **`Ant-v4`** environment.

| | Details |
|---|---|
| Environment | `gymnasium==0.29` & `mujoco==2.3` (`Ant-v4`) |
| Algorithm | PPO (`stable-baselines3==2.3.0`) |
| Timesteps | **100 000** |
| Policy | `MlpPolicy` *(2 × 64 hidden, tanh)* |
| Return (mean ± std) | ~ *964* |
| Seed | `0` |

## Hyper-parameters

```jsonc
{
  "n_steps": 128,
  "batch_size": 64,
  "n_epochs": 20,
  "gamma": 0.99,
  "learning_rate": 3e-4,
  "ent_coef": 0.0,
  "clip_range": 0.2
}