Fix formatting and expand training hyperparameters
README.md
CHANGED
@@ -8,34 +8,33 @@ tags:
 - smolvla
 - robotics
 - ur7e
-- ur7e
 - code-as-policies
 - imitation-learning
 - CoRL2026
 ---
 
-
+# SmolVLA UR7e Arrange Block 100epi (10 epochs)
 
-
+This repository contains a SmolVLA policy checkpoint fine-tuned with LeRobot. The model card is intentionally detailed so the training run can be reproduced or debugged from the uploaded artifact.
 
-
+## Model Details
 
-
-
-
-
-
-
-
+- **Policy:** SmolVLA
+- **Base checkpoint:** [`lerobot/smolvla_base`](https://huggingface.co/lerobot/smolvla_base)
+- **Training dataset:** [`CoRL2026-CSI/UR7e-CaP_arrange_block_100epi`](https://huggingface.co/datasets/CoRL2026-CSI/UR7e-CaP_arrange_block_100epi)
+- **Training script:** `lerobot/scripts/train_smolvla_ur7e.sh`
+- **Checkpoint:** step `5520`, approximately `10.00` epochs
+- **Reported training loss at checkpoint:** `0.009`
+- **Resolved config:** [`train_config.json`](train_config.json)
 
-
+Related checkpoints from the same run:
 
-
+- [5ep checkpoint](https://huggingface.co/CoRL2026-CSI/smolvla_ur7e_arrange_block_100epi_5ep)
 - [10ep checkpoint](https://huggingface.co/CoRL2026-CSI/smolvla_ur7e_arrange_block_100epi_10ep)
 
-
+## Dataset
 
-
+| Key | Value |
 |---|---|
 | `Robot` | UR7e |
 | `Episodes` | 100 |
@@ -45,11 +44,11 @@ tags:
 | `Camera streams` | `observation.images.realsense_wrist`, `observation.images.realsense_topview` |
 | `Dataset state/action shape` | [7] / [7] |
 
-
+## Reproduction
 
-
+The uploaded [`train_config.json`](train_config.json) is the authoritative serialized LeRobot config for this checkpoint. The table below mirrors the key values for quick inspection.
 
-
+| Key | Value |
 |---|---|
 | `script` | lerobot/scripts/train_smolvla_ur7e.sh |
 | `job_name` | smolvla_ur7e_arrange_block_100epi_bs64_acc4_ep10_20260509_130552 |
@@ -63,18 +62,18 @@ tags:
 | `checkpoint_lr` | 2.5e-06 |
 | `effective_batch` | 64 x 1 x 4 = 256 |
 
-
+Approximate script invocation:
 
-
-
+```bash
+cd /home/work/hscho/corl_2026/AutoDataCollector/lerobot
 CONDA_ENV="lerobot" POLICY_TYPE="smolvla" POLICY_PATH="lerobot/smolvla_base" DATASET_REPO_ID="CoRL2026-CSI/UR7e-CaP_arrange_block_100epi" BATCH_SIZE="64" GRADIENT_ACCUMULATION_STEPS="4" STEPS="5520" NUM_WORKERS="4" DATALOADER_PREFETCH_FACTOR="1" CUDA_VISIBLE_DEVICES="0" NUM_GPUS="1" MIXED_PRECISION="bf16" SAVE_FREQ="2760" LOG_FREQ="10" EVAL_FREQ="0" WANDB_PROJECT="lerobot-smolvla-ur7e" OMP_NUM_THREADS="4" MKL_NUM_THREADS="4" PYTORCH_CUDA_ALLOC_CONF="expandable_segments:True" bash train_smolvla_ur7e.sh
-
+```
 
-
+## Detailed Hyperparameters
 
-
+### Script Defaults and Environment
 
-
+| Key | Value |
 |---|---|
 | `CONDA_ENV` | lerobot |
 | `POLICY_TYPE` | smolvla |
@@ -96,9 +95,9 @@ CONDA_ENV="lerobot" POLICY_TYPE="smolvla" POLICY_PATH="lerobot/smolvla_base" DAT
 | `MKL_NUM_THREADS` | 4 |
 | `PYTORCH_CUDA_ALLOC_CONF` | expandable_segments:True |
 
-
+### Training Loop and Dataloader
 
-
+| Key | Value |
 |---|---|
 | `steps` | 5520 |
 | `batch_size` | 64 |
@@ -115,9 +114,9 @@ CONDA_ENV="lerobot" POLICY_TYPE="smolvla" POLICY_PATH="lerobot/smolvla_base" DAT
 | `ddp_find_unused_parameters` | True |
 | `profile_timing` | False |
 
-
+### Dataset Pipeline
 
-
+| Key | Value |
 |---|---|
 | `dataset.repo_id` | CoRL2026-CSI/UR7e-CaP_arrange_block_100epi |
 | `dataset.root` | `null` |
@@ -127,9 +126,9 @@ CONDA_ENV="lerobot" POLICY_TYPE="smolvla" POLICY_PATH="lerobot/smolvla_base" DAT
 | `dataset.video_backend` | torchcodec |
 | `dataset.streaming` | False |
 
-
+Image augmentation settings:
 
-
+```json
 {
 "enable": true,
 "max_num_transforms": 2,
@@ -203,18 +202,18 @@ CONDA_ENV="lerobot" POLICY_TYPE="smolvla" POLICY_PATH="lerobot/smolvla_base" DAT
 }
 ```
 
-
+Camera rename map:
 
-
+```json
 {
 "observation.images.realsense_wrist": "observation.images.camera1",
 "observation.images.realsense_topview": "observation.images.camera2"
 }
 ```
 
-
+### Policy Configuration
 
-
+```json
 {
 "type": "smolvla",
 "pretrained_path": "lerobot/smolvla_base",
@@ -294,9 +293,9 @@ CONDA_ENV="lerobot" POLICY_TYPE="smolvla" POLICY_PATH="lerobot/smolvla_base" DAT
 }
 ```
 
-
+### Optimizer
 
-
+```json
 {
 "type": "adamw",
 "lr": 0.0001,
@@ -310,9 +309,9 @@ CONDA_ENV="lerobot" POLICY_TYPE="smolvla" POLICY_PATH="lerobot/smolvla_base" DAT
 }
 ```
 
-
+### Scheduler
 
-
+```json
 {
 "type": "cosine_decay_with_warmup",
 "num_warmup_steps": 1000,
@@ -322,9 +321,9 @@ CONDA_ENV="lerobot" POLICY_TYPE="smolvla" POLICY_PATH="lerobot/smolvla_base" DAT
 }
 ```
 
-
+### Logging
 
-
+```json
 {
 "enable": true,
 "disable_artifact": false,
@@ -336,25 +335,25 @@ CONDA_ENV="lerobot" POLICY_TYPE="smolvla" POLICY_PATH="lerobot/smolvla_base" DAT
 }
 ```
 
-
+## Usage
 
-
+Use this model as a LeRobot policy checkpoint:
 
-
-
-
-
+```bash
+python -m lerobot.scripts.lerobot_eval \
+--policy.path=CoRL2026-CSI/smolvla_ur7e_arrange_block_100epi_10ep
+```
 
-
+For Python loading inside LeRobot code, use the SmolVLA policy loader with this repository id as the pretrained path.
 
-
+## Evaluation and Limitations
 
-
+This model card reports training checkpoint information only. No rollout success rate or task-level evaluation metric is included in this repository.
 
-
+The checkpoint assumes a compatible observation/action schema and the camera remapping shown above. The optimizer/RNG `training_state` files are not included; only the loadable `pretrained_model` artifact is uploaded.
 
-
+## Provenance
 
-
-
-
+- VLM backbone: [`HuggingFaceTB/SmolVLM2-500M-Video-Instruct`](https://huggingface.co/HuggingFaceTB/SmolVLM2-500M-Video-Instruct)
+- Fine-tuning run: `smolvla_ur7e_arrange_block_100epi_bs64_acc4_ep10_20260509_130552`
+- Source training script: `lerobot/scripts/train_smolvla_ur7e.sh`
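
The `effective_batch` row in the Reproduction table (`64 x 1 x 4 = 256`) and the step counts in the script invocation can be cross-checked with a little arithmetic. The sketch below assumes the three factors are per-device batch size, number of GPUs, and gradient accumulation steps, in that order (this matches `BATCH_SIZE`, `NUM_GPUS`, and `GRADIENT_ACCUMULATION_STEPS` in the invocation, but the card does not spell the mapping out). The helper names are illustrative only and are not part of LeRobot.

```python
# Values copied from the model card; the factor ordering is an assumption.
BATCH_SIZE = 64
NUM_GPUS = 1
GRADIENT_ACCUMULATION_STEPS = 4
STEPS = 5520
SAVE_FREQ = 2760
EPOCHS_AT_FINAL_STEP = 10  # the card reports "approximately 10.00 epochs"


def effective_batch(batch_size: int, num_gpus: int, grad_accum_steps: int) -> int:
    """Samples contributing to one optimizer update (hypothetical helper)."""
    return batch_size * num_gpus * grad_accum_steps


def steps_per_epoch(total_steps: int, epochs: float) -> float:
    """Training steps per epoch implied by the reported step and epoch counts."""
    return total_steps / epochs


eff = effective_batch(BATCH_SIZE, NUM_GPUS, GRADIENT_ACCUMULATION_STEPS)
spe = steps_per_epoch(STEPS, EPOCHS_AT_FINAL_STEP)
epochs_at_first_save = SAVE_FREQ / spe

print(f"effective batch: {eff}")
print(f"steps per epoch: {spe}")
print(f"epochs at first save: {epochs_at_first_save}")
```

Under these assumptions the first save at step 2760 lands on epoch 5, which is consistent with the card listing a 5ep and a 10ep checkpoint from the same run.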
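
The camera rename map in the Dataset Pipeline section maps the dataset's RealSense stream names onto `observation.images.camera1` and `observation.images.camera2`. A minimal sketch of what such a remapping does to an observation dictionary follows. It uses plain Python dictionaries and a hypothetical helper, `rename_observation_keys`; it is not LeRobot's own implementation, and the string placeholders stand in for image tensors.

```python
# Rename map copied from the model card.
RENAME_MAP = {
    "observation.images.realsense_wrist": "observation.images.camera1",
    "observation.images.realsense_topview": "observation.images.camera2",
}


def rename_observation_keys(observation: dict, rename_map: dict) -> dict:
    """Return a new dict with mapped keys renamed; other keys pass through unchanged."""
    return {rename_map.get(key, key): value for key, value in observation.items()}


# Placeholder observation: strings stand in for image tensors,
# and the 7-element list matches the [7] state shape on the card.
sample = {
    "observation.images.realsense_wrist": "wrist_frame",
    "observation.images.realsense_topview": "top_frame",
    "observation.state": [0.0] * 7,
}

renamed = rename_observation_keys(sample, RENAME_MAP)
print(sorted(renamed))
```

When running the checkpoint on a robot whose cameras use different names, the same kind of mapping would need to be applied before observations reach the policy.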
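
The Scheduler block names `cosine_decay_with_warmup` with `num_warmup_steps` of 1000, the Optimizer block gives a peak `lr` of 0.0001, and the Reproduction table reports a `checkpoint_lr` of 2.5e-06. The sketch below is a generic linear-warmup plus cosine-decay curve, not LeRobot's scheduler code. The decay length and final learning rate are assumptions chosen so the curve ends at the reported `checkpoint_lr` on step 5520; the fields that would confirm them are not visible in this diff.

```python
import math

PEAK_LR = 1e-4            # "lr" from the Optimizer block
NUM_WARMUP_STEPS = 1000   # from the Scheduler block
ASSUMED_DECAY_STEPS = 5520   # assumption: decay ends at the final training step
ASSUMED_FINAL_LR = 2.5e-6    # assumption: equals the reported checkpoint_lr


def lr_at(step: int,
          peak_lr: float = PEAK_LR,
          warmup_steps: int = NUM_WARMUP_STEPS,
          decay_steps: int = ASSUMED_DECAY_STEPS,
          final_lr: float = ASSUMED_FINAL_LR) -> float:
    """Generic schedule: linear ramp to peak_lr, then cosine decay to final_lr."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Progress through the decay phase, clamped to [0, 1].
    clamped = min(step, decay_steps)
    progress = (clamped - warmup_steps) / (decay_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return final_lr + (peak_lr - final_lr) * cosine


for s in (0, 500, 1000, 2760, 5520):
    print(s, lr_at(s))
```

The actual run may differ in how warmup and decay are combined, so the uploaded `train_config.json` remains the reference for exact values.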
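
The Usage section says to load the checkpoint in Python with the SmolVLA policy loader, using this repository id as the pretrained path. A hedged sketch of that is shown below. It assumes LeRobot is installed with SmolVLA support and that the policy class is `SmolVLAPolicy` with a `from_pretrained` class method; the module path has moved between LeRobot releases, so both known locations are tried. The `load_policy` and `find_smolvla_policy_class` helpers are illustrative names, not LeRobot functions.

```python
import importlib

REPO_ID = "CoRL2026-CSI/smolvla_ur7e_arrange_block_100epi_10ep"

# Module locations used by different LeRobot releases (newer first).
CANDIDATE_MODULES = (
    "lerobot.policies.smolvla.modeling_smolvla",
    "lerobot.common.policies.smolvla.modeling_smolvla",
)


def find_smolvla_policy_class():
    """Import SmolVLAPolicy from whichever module path the installed LeRobot provides."""
    for module_name in CANDIDATE_MODULES:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        return getattr(module, "SmolVLAPolicy")
    raise ImportError("LeRobot with SmolVLA support does not appear to be installed")


def load_policy(repo_id: str = REPO_ID, device: str = "cuda"):
    """Download the checkpoint from the Hub and prepare it for inference."""
    policy_cls = find_smolvla_policy_class()
    policy = policy_cls.from_pretrained(repo_id)
    policy.to(device)
    policy.eval()
    return policy
```

Calling `load_policy()` downloads the weights on first use. Observations passed to the policy afterwards need the camera keys from the rename map and the 7-dimensional state described on the card.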