---
license: apache-2.0
library_name: lerobot
pipeline_tag: robotics
tags:
- robotics
- imitation-learning
- mujoco
- lerobot
- act
- so101
---

# ACT SO101 PickCube (chunk40, 250k) - v1

This repository contains an **Action Chunking Transformer (ACT)** policy trained for the SO-101 MuJoCo simulation pick-and-place task: **pick up the red cube and place it in the blue bin**.

## Task

- **Environment**: `SO101PickCube-v0` (`lerobot.envs.so101_sim`)
- **Randomization**: randomized cube position + randomized drop-zone/bin position

## Inputs / Outputs

- **Observations**
  - `observation.images.front`: RGB image (3×128×128)
  - `observation.images.wrist`: RGB image (3×128×128)
  - `observation.state`: 10D state vector (`agent_pos` from the env)
- **Action**: 4D action `[dx, dy, dz, gripper]`
  - `dx, dy, dz` are end-effector delta commands in \([-1, 1]\)
  - `gripper` is in \([0, 2]\) where `0=open`, `1=stay`, `2=close`

## Normalization

This policy was trained with:

- **Images**: ImageNet mean/std
- **State & Action**: MEAN_STD using the included stats files:
  - `policy_preprocessor_step_3_normalizer_processor.safetensors`
  - `policy_postprocessor_step_0_unnormalizer_processor.safetensors`

## Model

- **Policy**: ACT (Action Chunking Transformer)
- **Vision backbone**: ResNet-18 (`ResNet18_Weights.IMAGENET1K_V1`)
- **Chunking**: `chunk_size=40`, `n_action_steps=40`
- **Transformer**: `dim_model=512`, `n_heads=8`, `n_encoder_layers=4`, `n_decoder_layers=1`
- **VAE**: enabled (`latent_dim=32`, `kl_weight=10.0`)

## Training

Key settings (see `train_config.json` for full config):

- **Steps**: 250,000
- **Batch size**: 8
- **Optimizer**: AdamW (`lr=1e-5`, `weight_decay=1e-4`)
- **Dataset**: recorded locally as `local/so101_safe_worker1` during training (not published on the Hub)

## Usage

### Load

```python
from lerobot.policies.act.modeling_act import ACTPolicy

policy = ACTPolicy.from_pretrained("gpudad/act_so101_chunk40_250k_v1")
policy.eval()
```

### Evaluate in the SO101 simulator

If you’re using the evaluation scripts in this repo (like `eval_so101.py`), download the snapshot locally first so the script can read `config.json` and the normalization stats files:

```python
from huggingface_hub import snapshot_download

local_dir = snapshot_download("gpudad/act_so101_chunk40_250k_v1")
print(local_dir)
```

Then run:

```bash
python eval_so101.py --model --episodes 10 --max-steps 250 --no-viewer
```

Note: `SO101PickCube-v0` reports success as `info["succeed"]`.

## Limitations

- This policy is intended for the specific observation layout + environment settings used by `SO101PickCube-v0`.
- Performance can vary with MuJoCo version, rendering settings, and random seeds.

## Citation

If you use ACT, please cite:

```bibtex
@article{zhao2023learning,
  title   = {Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware},
  author  = {Zhao, Tony Z. and others},
  journal = {arXiv preprint arXiv:2304.13705},
  year    = {2023}
}
```
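
## Example: reading the 4D action

The action layout described under Inputs / Outputs (`[dx, dy, dz, gripper]`, with deltas in \([-1, 1]\) and `gripper` in \([0, 2]\)) can be illustrated with a small helper. This is a minimal sketch, not code from this repository: `decode_action` and `clamp` are hypothetical names, and the clipping and nearest-integer rounding are assumptions about how one might sanitize a continuous policy output, not confirmed behaviour of `SO101PickCube-v0`.

```python
# Gripper command labels as documented in this card.
GRIPPER_LABELS = {0: "open", 1: "stay", 2: "close"}


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def decode_action(action):
    """Split a 4D action into clipped end-effector deltas and a gripper label.

    Assumption: clipping and rounding are illustrative only; the environment
    may handle out-of-range or fractional values differently.
    """
    if len(action) != 4:
        raise ValueError(f"expected a 4D action, got {len(action)} values")
    dx, dy, dz, gripper = (float(v) for v in action)
    # End-effector deltas are documented to lie in [-1, 1].
    delta = tuple(clamp(v, -1.0, 1.0) for v in (dx, dy, dz))
    # Snap the gripper value to the nearest documented command in {0, 1, 2}.
    gripper_id = int(clamp(round(gripper), 0, 2))
    return delta, GRIPPER_LABELS[gripper_id]


# Hypothetical raw action, e.g. a policy output converted with .tolist().
delta, gripper = decode_action([0.25, -1.5, 0.0, 1.8])
print(delta, gripper)
```

A helper like this is mainly useful for logging or debugging rollouts, where a readable gripper label is easier to scan than a raw float.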