Avala + LeRobot - Avala Documentation

Convert an Avala sequence dataset directly into a Hugging Face LeRobot v3 dataset, the de facto standard format for open robot learning. The converter reads sequences and frames straight through the SDK (no exports.create call, no archive download) and writes a LeRobot dataset to disk, optionally pushing it to the Hub. One Avala sequence becomes one LeRobot episode. Each camera maps to an observation.images.<cam> feature. frame_index and episode_index follow frame and sequence order, and timestamp is synthesized from --fps as frame_index / fps.

Prerequisites

pip install "avala[lerobot]"

The lerobot library requires Python 3.12+, so this extra cannot be used on Python 3.9–3.11. Video encoding (the default, when --no-video is not passed) additionally pulls in av/torchcodec; pass --no-video to store frames as images and skip that stack.
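A small guard can fail early on an unsupported interpreter before you install the extra. This is a minimal sketch that assumes only the 3.12+ requirement stated above; the helper name python_supports_lerobot_extra is hypothetical and is not part of the Avala SDK.

```python
import sys

# Minimum version taken from the requirement stated above (assumption:
# only major.minor matters for this check).
MIN_PYTHON = (3, 12)


def python_supports_lerobot_extra(version_info=sys.version_info):
    """Return True when the interpreter meets the documented minimum.

    Hypothetical helper for illustration; not an Avala SDK function.
    """
    return tuple(version_info[:2]) >= MIN_PYTHON


# Example: report the result for the running interpreter.
print(python_supports_lerobot_extra())
```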

Convert from the CLI

avala lerobot export my-org/my-dataset \
  --repo-id my-hf-user/my-dataset \
  --output ./lerobot-out \
  --fps 30 \
  --task "pick up the cube"

This walks every sequence in my-org/my-dataset, writes a LeRobot v3 dataset to ./lerobot-out, and finalizes it (so the parquet footers are written and the dataset is readable).

| Flag | Purpose |
| --- | --- |
| `--repo-id` | Target dataset id, `<hf_user>/<name>` (required) |
| `--output` | Local output directory (required) |
| `--fps` | Frames per second; timestamps are synthesized as `frame_index / fps` (default 30) |
| `--task` | Instruction string attached to every frame |
| `--camera` | Restrict to specific camera(s) by positional name (`cam0`, `cam1`, …); repeatable. Default: all cameras on frame 0 |
| `--state-key` / `--action-key` | Dotted path into the raw frame for a numeric `observation.state` / `action` vector (see below) |
| `--ego-pose-state` | Use the 7-dim camera-rig ego pose as `observation.state` (rig pose, not proprioception) |
| `--no-video` | Store frames as PNG image features instead of encoded video |
| `--limit` | Convert at most N sequences |
| `--push` | Push the result to the Hugging Face Hub (requires HF auth) |
| `--tag` | Extra dataset-card tag(s) for the Hub; repeatable. `avala` and `LeRobot` are always added |
| `--license` | License for the pushed dataset card (default: lerobot’s `apache-2.0`) |
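To make the --fps and --camera rows concrete, the sketch below shows how per-frame indices and camera feature keys could be derived. It is an illustration under stated assumptions, not the converter's actual code: FrameIndex, synthesize_indices, and camera_feature_keys are hypothetical names, and the only facts taken from this page are that one sequence becomes one episode, timestamps are frame_index / fps, and cameras map to observation.images.<cam>.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameIndex:
    """Hypothetical record holding the synthesized per-frame indices."""
    episode_index: int
    frame_index: int
    timestamp: float  # seconds since the start of the episode


def synthesize_indices(frames_per_sequence, fps=30):
    """Yield one FrameIndex per frame; each sequence is treated as one episode."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    for episode_index, n_frames in enumerate(frames_per_sequence):
        # frame_index restarts at 0 for every episode
        for frame_index in range(n_frames):
            yield FrameIndex(episode_index, frame_index, frame_index / fps)


def camera_feature_keys(camera_names):
    """Map positional camera names such as cam0 to observation.images.<cam> keys."""
    return [f"observation.images.{name}" for name in camera_names]


# Two sequences with 3 and 2 frames become two episodes with 5 frames in total.
rows = list(synthesize_indices([3, 2], fps=30))
print(len(rows), rows[3])
print(camera_feature_keys(["cam0", "cam1"]))
```

Because timestamps are synthesized rather than read from the capture, choosing an --fps that matches the real capture rate matters for anything downstream that depends on timing.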

Convert from Python

from avala import Client
from avala.lerobot import export_dataset

export_dataset(
    Client(),
    "my-org",
    "my-dataset",
    repo_id="my-hf-user/my-dataset",
    output_dir="./lerobot-out",
    fps=30,
    task="pick up the cube",
)

The result is a standard LeRobot dataset:

from lerobot.datasets import LeRobotDataset

ds = LeRobotDataset("my-hf-user/my-dataset", root="./lerobot-out")
print(ds.num_episodes, ds.num_frames)

The output is standard LeRobot v3, so once pushed it also works with StreamingLeRobotDataset (train directly from the Hub with no full download). When you pass --push, the dataset card is tagged LeRobot and robotics (added by lerobot) plus avala, so it shows up in the LeRobot dataset viewer and filters.
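If you want a quick sanity check without importing lerobot, you can inspect the dataset metadata on disk. The sketch below assumes the usual LeRobot layout in which meta/info.json sits under the output directory and carries fields such as codebase_version, fps, total_episodes, total_frames, and features; summarize_lerobot_dataset is a hypothetical helper, and missing fields are returned as None rather than guessed.

```python
import json
from pathlib import Path


def summarize_lerobot_dataset(root):
    """Read meta/info.json under root and return a few headline fields.

    Hypothetical helper; assumes the standard LeRobot on-disk layout.
    """
    info_path = Path(root) / "meta" / "info.json"
    if not info_path.is_file():
        raise FileNotFoundError(f"{info_path} not found")
    info = json.loads(info_path.read_text(encoding="utf-8"))
    return {
        "codebase_version": info.get("codebase_version"),
        "fps": info.get("fps"),
        "total_episodes": info.get("total_episodes"),
        "total_frames": info.get("total_frames"),
        # Feature names only, sorted for stable output
        "features": sorted(info.get("features", {})),
    }


# Usage (after a conversion has written ./lerobot-out):
#   print(summarize_lerobot_dataset("./lerobot-out"))
```

For a perception-only export, the feature list should contain observation.images.<cam> entries but no action or observation.state.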

Perception vs. policy datasets (read this)

Avala sequence datasets are annotation-centric: they reliably provide camera frames and calibration, but not robot proprioception. By default this converter therefore produces a perception / vision-language dataset (cameras + timestamps + a task string) and prints a warning saying so. That is a valid LeRobot dataset, but it is not a policy-training dataset, because it has no action or observation.state features.

To produce robot observation.state / action, the source frames must actually carry that data, and you point the converter at it:

# When the raw frame dicts embed numeric vectors (customer-specific schema):
avala lerobot export my-org/my-dataset --repo-id u/d --output ./out \
  --state-key observation.joint_positions \
  --action-key action

# Or use the capture-rig ego pose as observation.state (labelled as rig pose):
avala lerobot export my-org/my-dataset --repo-id u/d --output ./out --ego-pose-state

State and action are all-or-nothing: a configured key that is missing or non-numeric on any frame is an error. The converter never fabricates zeros.
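The dotted-path lookup and the all-or-nothing rule can be sketched as follows. This is an illustration of the behavior described above, not the converter's implementation: resolve_dotted, as_numeric_vector, and extract_vectors are hypothetical names, and the frame dicts in the example stand in for a customer-specific schema.

```python
from numbers import Real


def resolve_dotted(frame, dotted_key):
    """Follow a dotted path such as 'observation.joint_positions' into nested dicts."""
    node = frame
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            raise ValueError(f"{dotted_key}: missing segment '{part}'")
        node = node[part]
    return node


def as_numeric_vector(value, dotted_key):
    """Return value as a list of floats, or raise if any entry is not a real number."""
    if not isinstance(value, (list, tuple)) or not value:
        raise ValueError(f"{dotted_key}: expected a non-empty list of numbers")
    for item in value:
        # bool is a subclass of int, so reject it explicitly
        if isinstance(item, bool) or not isinstance(item, Real):
            raise ValueError(f"{dotted_key}: non-numeric entry {item!r}")
    return [float(item) for item in value]


def extract_vectors(frames, dotted_key):
    """All-or-nothing: the first bad frame aborts, and nothing is zero-filled."""
    vectors = []
    for i, frame in enumerate(frames):
        try:
            value = resolve_dotted(frame, dotted_key)
            vectors.append(as_numeric_vector(value, dotted_key))
        except ValueError as exc:
            raise ValueError(f"frame {i}: {exc}") from exc
    return vectors


# Illustrative frames; real frame dicts depend on the dataset's schema.
frames = [
    {"observation": {"joint_positions": [0.1, 2]}},
    {"observation": {"joint_positions": [0.3, 0.4]}},
]
print(extract_vectors(frames, "observation.joint_positions"))
```

Failing on the first bad frame, rather than padding with zeros, keeps a policy from silently training on state or action values that were never recorded.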

Next Steps

Avala + PyTorch — stream Avala data into PyTorch with no export step
Python SDK reference
LeRobot — the robot-learning library and dataset format
