Clone with submodules to include LeRobot, LIBERO, and OpenVLA-OFT.
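For example (replace the placeholders with the actual repository URL and directory):

```bash
git clone --recurse-submodules <repo-url>
cd <repo-dir>

# If you already cloned without --recurse-submodules, fetch the submodules afterwards:
git submodule update --init --recursive
```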
Set up a conda environment with Python 3.12 and install PyTorch with CUDA support.
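A typical setup, assuming CUDA 12.1 (adjust the wheel index to your CUDA version; the environment name is a placeholder):

```bash
conda create -n vla-interp python=3.12 -y
conda activate vla-interp

# Install a CUDA-enabled PyTorch build; cu121 matches CUDA 12.1
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
```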
Install LeRobot (policy implementations for X-VLA, SmolVLA, GR00T, and Pi0.5), LIBERO (simulation environments), and the project packages.
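A sketch of the editable installs; the submodule paths are assumptions, so adjust them to the actual repo layout:

```bash
pip install -e lerobot/      # policies: X-VLA, SmolVLA, GR00T, Pi0.5
pip install -e LIBERO/       # simulation benchmark environments
pip install -e openvla-oft/  # OpenVLA-OFT policy
pip install -e .             # this project's packages
```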
Check that all model policies and LIBERO are available.
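A quick smoke test; the import paths below are these packages' usual top-level modules:

```bash
python -c "import lerobot; print('lerobot OK')"
python -c "from libero.libero import benchmark; print('LIBERO OK')"
```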
All experiments use a unified CLI. Swap --model to run on any supported VLA.
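A sketch of the pattern; the entry-point script and model identifiers here are illustrative, but --model is the real switch:

```bash
python scripts/run_experiment.py --model smolvla
python scripts/run_experiment.py --model pi05   # same experiment, different VLA
```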
Zero each layer one at a time and measure task success rate to find critical components.
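A hypothetical sweep (the --experiment and --layer flags are assumptions, and the layer count varies by model):

```bash
for layer in $(seq 0 23); do
  python scripts/run_experiment.py --model smolvla --experiment layer_ablation --layer "$layer"
done
```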
Apply 24 image corruptions (noise, blur, color, spatial, extreme) and measure robustness.
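For example, with hypothetical --experiment and --perturbation flags:

```bash
python scripts/run_experiment.py --model gr00t --experiment vision_perturbation --perturbation gaussian_noise
```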
Test how models respond to null, negated, or wrong language instructions.
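Again with hypothetical flags for the instruction variant:

```bash
python scripts/run_experiment.py --model xvla --experiment language --instruction-variant negated
```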
The full pipeline: collect activations, train SAEs, identify concepts, then verify causality.
Run baseline episodes and capture per-layer activations for SAE training.
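A hypothetical invocation (the script name, flags, and output path are assumptions; libero_spatial is a real LIBERO suite):

```bash
python scripts/collect_activations.py --model smolvla --suite libero_spatial --out data/activations/
```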
Train sparse autoencoders on collected activations (8x expansion, k=64).
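A hypothetical invocation; the 8x expansion factor and k=64 come from the description above:

```bash
python scripts/train_sae.py --activations data/activations/ --expansion 8 --k 64
```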
Set TORCH_COMPILE_DISABLE=1 for all experiment scripts.
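For example:

```bash
export TORCH_COMPILE_DISABLE=1   # real PyTorch env var; disables torch.compile globally
```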
Score each SAE feature with Cohen's d to identify task-selective features.
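For reference, the standard pooled-variance form of Cohen's d, here comparing a feature's mean activation on the target task against all other tasks (the repo's exact scoring may differ):

$$d = \frac{\bar{a}_{\text{task}} - \bar{a}_{\text{other}}}{s_{\text{pooled}}}, \qquad s_{\text{pooled}} = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}$$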
Remove or amplify specific concept features during live rollouts to verify causality.
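A hypothetical invocation for a live-rollout intervention (all flags are illustrative):

```bash
python scripts/run_experiment.py --model smolvla --experiment feature_intervention \
    --feature-id 1234 --mode ablate   # or --mode amplify
```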
Action Atlas is deployed at action-atlas.com, but you can also run it locally to explore your own experiment data.
The Flask backend serves experiment data through a REST API.
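A minimal sketch, assuming the backend lives in backend/ with a standard Flask layout:

```bash
cd backend
pip install -r requirements.txt
flask run --port 5000   # the port is an assumption; check the backend config
```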
The Next.js frontend provides the interactive visualization interface.
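Assuming the frontend lives in frontend/, with port 3002 matching the URL below:

```bash
cd frontend
npm install
npm run dev -- -p 3002
```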
Open http://localhost:3002 to access Action Atlas locally.
Point Action Atlas at your experiment outputs by setting environment variables; the exact names depend on the backend, so the sketch below uses illustrative ones:
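```bash
# Hypothetical variable names — use whatever the Flask backend actually reads
export ACTION_ATLAS_DATA_DIR=/path/to/experiment/outputs
export ACTION_ATLAS_SAE_DIR=/path/to/sae/checkpoints
```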
Browse UMAP scatter plots of SAE features, search by concept, and inspect individual feature activations across layers and suites.
Compare layer criticality, concept representations, and robustness across X-VLA, SmolVLA, GR00T, Pi0.5, and OpenVLA-OFT.
Compare baseline vs. ablated behavior, view success rate deltas per concept, and explore layer-phase ablation matrices.
Explore vision perturbation robustness results across 24 perturbation types and all supported models.