Tags: Robotics, LeRobot, Safetensors, smolvla, dct, isaaclab, so101, phase2

IsaacLab SO101 Phase2 DCT SmolVLA - pick_place 80ep 10fps

This repository contains the call-aligned, DCT-tuned SmolVLA checkpoint prepared for Method3 Phase2 acquisition. It was fine-tuned on the 10fps, future-action-aligned Phase1 pick_place dataset.

Source Dataset

  • Dataset: CoRL2026-CSI/IsaacLab-SO101-Phase1-pick_place-80episode-10fps
  • Local training root: /data/vpraise-corl/workspace/SCRAPE-IsaacLab/results/derived_datasets/IsaacLab-SO101-Phase1-pick_place-80episode-10fps/dataset
  • Episodes: 80
  • Frames: 25,204
  • FPS: 10
  • Downsample alignment: for 10fps frame k, observations and timestamps use source frame 3k, and actions use source frame min(3k + 2, episode_end). The normal 30fps to 10fps mapping is therefore s0 -> a2 -> s3.
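
The downsample alignment rule can be illustrated with a minimal sketch. It assumes that episode_end is the last valid source-frame index of the episode and that one 10fps frame is emitted for every source index 3k that exists. The function name and defaults are hypothetical and are not taken from the dataset tooling.

```python
def downsample_indices(num_source_frames, stride=3, action_offset=2):
    """Return (observation_index, action_index) source-frame pairs per 10fps step."""
    # Assumption: episode_end is the last valid (inclusive) source index.
    episode_end = num_source_frames - 1
    pairs = []
    k = 0
    while stride * k <= episode_end:
        obs_idx = stride * k
        # Clamp the future action so it never runs past the episode.
        act_idx = min(stride * k + action_offset, episode_end)
        pairs.append((obs_idx, act_idx))
        k += 1
    return pairs


# A 10-frame source episode: s0 pairs with a2, s3 with a5, and so on.
pairs = downsample_indices(10)
print(pairs)
```

Only the final step is affected by the clamp: when 3k + 2 would pass the end of the episode, the action falls back to the last source frame.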

Checkpoint Layout

  • pretrained_model/: LeRobot SmolVLA checkpoint directory.
  • uvla_id_stats.json: U_VLA ID-distribution sidecar. Keep this file next to pretrained_model/; ADC resolves it as Path(vla_checkpoint).parent / "uvla_id_stats.json".
  • phase2_prepare/: simulator-free Method3 Phase2 prepare artifacts for Q1 local transport.
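
As a small sketch of the sidecar lookup described above, the following reproduces the Path(vla_checkpoint).parent / "uvla_id_stats.json" expression and adds a layout check. The helper names are hypothetical, and the layout check is an illustration rather than part of ADC.

```python
from pathlib import Path


def resolve_uvla_stats(vla_checkpoint):
    """Mirror the documented lookup: the sidecar sits next to pretrained_model/."""
    return Path(vla_checkpoint).parent / "uvla_id_stats.json"


def missing_layout_entries(repo_snapshot):
    """Return the expected top-level entries that are absent from a snapshot."""
    root = Path(repo_snapshot)
    expected = ["pretrained_model", "uvla_id_stats.json", "phase2_prepare"]
    return [name for name in expected if not (root / name).exists()]


stats_path = resolve_uvla_stats("/path/to/repo_snapshot/pretrained_model")
print(stats_path)
```

Because the lookup goes through the parent directory, moving pretrained_model/ without the JSON file would break the resolution.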

For ADC/Phase2, pass the checkpoint as:

--method3-phase2-vla-checkpoint /path/to/repo_snapshot/pretrained_model

or for prepare validation:

--vla-checkpoint /path/to/repo_snapshot/pretrained_model

Training Summary

  • Policy: SmolVLA
  • Base policy: lerobot/smolvla_base
  • Job name: smolvla_dct_pick_place_80ep_10fps_call_aligned_future_action
  • Steps: 2850
  • Batch size: 16
  • Seed: 1000
  • DCT action horizon: 50
  • DCT skill segments: 901
  • Segment distribution: 59 episodes with 11 skills, 21 episodes with 12 skills
  • Skill counts: skill_0..skill_10 each have 80 entries; skill_11 has 21 entries
  • DCT parquet: phase2_prepare/dct/CoRL2026-CSI__IsaacLab-SO101-Phase1-pick_place-80episode-10fps.skill_dct.parquet
  • Vector DB: phase2_prepare/dct/skill_wise_vector_db.npz with 901 entries
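
The counts in this summary can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers listed above. The epoch estimate assumes one dataset frame per training sample and no gradient accumulation, which the summary does not state.

```python
# Values copied from the training summary.
episodes_by_segment_count = {11: 59, 12: 21}  # segments per episode -> episodes
steps, batch_size, frames = 2850, 16, 25_204

total_episodes = sum(episodes_by_segment_count.values())
total_segments = sum(n * eps for n, eps in episodes_by_segment_count.items())

# skill_i appears once in every episode that has more than i segments.
max_segments = max(episodes_by_segment_count)
skill_counts = {
    f"skill_{i}": sum(eps for n, eps in episodes_by_segment_count.items() if n > i)
    for i in range(max_segments)
}

samples_seen = steps * batch_size
# Assumption: one frame per sample, no gradient accumulation.
approx_epochs = samples_seen / frames

print(total_episodes, total_segments, skill_counts["skill_11"], round(approx_epochs, 2))
```

Under these assumptions, the segment total, the per-skill counts, and the vector DB size all agree at 901, and training covers roughly 1.8 passes over the dataset.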

Training used the local LeRobot entrypoint:

/opt/isaaclab-env/bin/python -m lerobot.scripts.lerobot_train \
  --policy.path=lerobot/smolvla_base \
  --policy.push_to_hub=false \
  --policy.device=cuda \
  --dataset.repo_id=CoRL2026-CSI/IsaacLab-SO101-Phase1-pick_place-80episode-10fps \
  --dataset.root=/data/vpraise-corl/workspace/SCRAPE-IsaacLab/results/derived_datasets/IsaacLab-SO101-Phase1-pick_place-80episode-10fps/dataset \
  --dataset.revision=v3.0 \
  --dataset.video_backend=pyav \
  --dataset.skill_dct_parquet=/data/vpraise-corl/workspace/SCRAPE-IsaacLab/results/method3_phase2_prepare/pick_place_80ep_10fps/source_session/dct/CoRL2026-CSI__IsaacLab-SO101-Phase1-pick_place-80episode-10fps.skill_dct.parquet \
  --batch_size=16 \
  --steps=2850 \
  --seed=1000 \
  --wandb.enable=false \
  --rename_map='{"observation.images.left_wrist": "observation.images.camera1", "observation.images.top": "observation.images.camera2"}' \
  --dataset.image_transforms.enable=true
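
The effect of the --rename_map argument can be pictured with a short sketch. It assumes the flag simply renames matching top-level observation keys, so that the dataset's camera names line up with the names the base policy expects. This is an illustration of that assumed behaviour, not LeRobot's implementation, and rename_keys is a hypothetical helper.

```python
import json

# The JSON string passed to --rename_map in the command above.
RENAME_MAP_JSON = (
    '{"observation.images.left_wrist": "observation.images.camera1", '
    '"observation.images.top": "observation.images.camera2"}'
)


def rename_keys(batch, rename_map):
    """Return a copy of batch with keys renamed; unmapped keys are kept as-is."""
    return {rename_map.get(key, key): value for key, value in batch.items()}


rename_map = json.loads(RENAME_MAP_JSON)
batch = {
    "observation.images.left_wrist": "wrist_frame",
    "observation.images.top": "top_frame",
    "observation.state": [0.0] * 6,
}
renamed = rename_keys(batch, rename_map)
print(sorted(renamed))
```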

Call-Aligned DCT Segmentation

Method3 Phase2 replay is indexed by the ordinal of each set_skill_info() call. The DCT parquet in this package is therefore segmented by skill call stamps rather than by run-length of skill.natural_language alone. When goal pose columns are present, the boundary key includes:

  • skill.natural_language
  • skill.type
  • skill.goal_position.robot_xyzrpy
  • skill.goal_position.joint
  • skill.goal_position.gripper

This prevents consecutive skill calls with the same natural-language label from being merged. The included prepare validation requires exact agreement between the subgoal buffer, DCT parquet, and vector DB.
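
A minimal sketch of the idea: run-length segmentation over a composite boundary key built from the columns listed above. It assumes frames are available as plain dicts keyed by column name. The function names are hypothetical and this is not the packaging code itself.

```python
BOUNDARY_COLUMNS = (
    "skill.natural_language",
    "skill.type",
    "skill.goal_position.robot_xyzrpy",
    "skill.goal_position.joint",
    "skill.goal_position.gripper",
)


def boundary_key(frame, columns=BOUNDARY_COLUMNS):
    """Build a comparable key; list values become tuples, missing columns None."""
    def freeze(value):
        return tuple(value) if isinstance(value, (list, tuple)) else value
    return tuple(freeze(frame.get(col)) for col in columns)


def segment(frames, columns=BOUNDARY_COLUMNS):
    """Return (start, end_exclusive) spans of consecutive frames sharing a key."""
    spans, start = [], 0
    for i in range(1, len(frames) + 1):
        if i == len(frames) or boundary_key(frames[i], columns) != boundary_key(frames[start], columns):
            spans.append((start, i))
            start = i
    return spans


move_a = {"skill.natural_language": "move", "skill.type": "reach",
          "skill.goal_position.robot_xyzrpy": [0.1, 0.0, 0.2, 0, 0, 0]}
move_b = {**move_a, "skill.goal_position.robot_xyzrpy": [0.3, 0.0, 0.2, 0, 0, 0]}
grasp = {"skill.natural_language": "grasp", "skill.type": "grip"}
frames = [move_a, move_a, move_b, move_b, grasp]

label_only = segment(frames, columns=("skill.natural_language",))
call_aligned = segment(frames)
print(label_only, call_aligned)
```

In this toy episode the label-only key merges the two "move" calls into one span, while the composite key keeps them apart. Two consecutive calls with identical values in every column would still merge under this sketch, so it approximates call-stamp segmentation rather than reproducing it.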

U_VLA Stats

uvla_id_stats.json was computed from the 10fps DCT skill dataset without running the simulator.

  • R: 4
  • Aggregation: mean
  • Sigma: 0.5
  • Samples: 100
  • Mean: 255.6530
  • Std: 176.9175
  • P90: 535.7004
  • P95: 559.8982
  • P99: 619.3958
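
For orientation, a summary of this shape (sample count, mean, standard deviation, upper percentiles) could be produced from a list of per-sample scores as sketched below. The estimator choices here are assumptions: population standard deviation and linearly interpolated percentiles. The actual U_VLA computation and the JSON schema are not described in this card, and summarize_scores is a hypothetical name.

```python
import statistics


def summarize_scores(scores):
    """Summarize a list of scores; the estimator choices are assumptions."""
    # 99 cut points; cuts[i - 1] is the i-th percentile (linear interpolation).
    cuts = statistics.quantiles(scores, n=100, method="inclusive")
    return {
        "samples": len(scores),
        "mean": statistics.fmean(scores),
        "std": statistics.pstdev(scores),  # assumption: population std
        "p90": cuts[89],
        "p95": cuts[94],
        "p99": cuts[98],
    }


summary = summarize_scores([1.0, 2.0, 3.0, 4.0, 5.0])
print(summary)
```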

Notes

  • The IsaacLab simulator is not required to load this checkpoint or validate Phase2 prepare artifacts.
  • The simulator is only needed when actually running Phase2 collection.
  • Optimizer and scheduler training_state/ files are intentionally excluded because Phase2 inference does not need them.