Convert an Avala sequence dataset directly into a Hugging Face LeRobot v3 dataset, the de facto standard format for open robot learning. The converter reads sequences and frames straight through the SDK (no exports.create, no archive download) and writes a LeRobot dataset on disk, optionally pushing it to the Hub. One Avala sequence becomes one LeRobot episode. Each camera maps to an observation.images.<cam> feature; timestamp, frame_index, and episode_index are derived from --fps.

Prerequisites

pip install "avala[lerobot]"
The lerobot library requires Python 3.12+, so this extra is unusable on Python 3.9–3.11. Video encoding (the default, used unless you pass --no-video) additionally pulls in av/torchcodec; pass --no-video to store frames as images and skip that stack.
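Because the extra only works on Python 3.12 or newer, a script that wraps the converter could check the interpreter version before importing anything from it. The following is a minimal sketch; the helper name supports_lerobot_extra is hypothetical and is not part of the avala SDK.

```python
import sys

# Minimum interpreter version stated above for the lerobot extra.
MIN_PYTHON = (3, 12)


def supports_lerobot_extra(version_info=None):
    """Return True if a (major, minor, ...) version meets the 3.12+ floor.

    Hypothetical helper for illustration; not part of the avala SDK.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); patch level does not matter here.
    return tuple(version_info[:2]) >= MIN_PYTHON
```

Calling supports_lerobot_extra() with no argument checks the running interpreter; passing a tuple such as (3, 11) lets you test the check itself.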

Convert from the CLI

avala lerobot export my-org/my-dataset \
  --repo-id my-hf-user/my-dataset \
  --output ./lerobot-out \
  --fps 30 \
  --task "pick up the cube"
This walks every sequence in my-org/my-dataset, writes a LeRobot v3 dataset to ./lerobot-out, and finalizes it (so the parquet footers are written and the dataset is readable).
| Flag | Purpose |
| --- | --- |
| --repo-id | Target dataset id, <hf_user>/<name> (required) |
| --output | Local output directory (required) |
| --fps | Frames per second; timestamps are synthesized as frame_index / fps (default 30) |
| --task | Instruction string attached to every frame |
| --camera | Restrict to specific camera(s) by positional name (cam0, cam1, …); repeatable. Default: all cameras on frame 0 |
| --state-key / --action-key | Dotted path into the raw frame for a numeric observation.state / action vector (see below) |
| --ego-pose-state | Use the 7-dim camera-rig ego pose as observation.state (rig pose, not proprioception) |
| --no-video | Store frames as PNG image features instead of encoded video |
| --limit | Convert at most N sequences |
| --push | Push the result to the Hugging Face Hub (requires HF auth) |
| --tag | Extra dataset-card tag(s) for the Hub; repeatable. avala and LeRobot are always added |
| --license | License for the pushed dataset card (default: lerobot’s apache-2.0) |
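To make the --fps and camera mapping concrete, here is a small sketch of the arithmetic: timestamps computed as frame_index / fps, and feature names following the observation.images.<cam> pattern. It only illustrates the rules stated on this page and is not the converter's actual code; the function names are hypothetical.

```python
def synthesize_frame_fields(num_frames, fps, episode_index):
    """Build per-frame index fields for one episode (timestamp = frame_index / fps).

    Hypothetical helper for illustration; not part of the avala SDK.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return [
        {
            "episode_index": episode_index,
            "frame_index": i,
            "timestamp": i / fps,  # seconds since the start of the episode
        }
        for i in range(num_frames)
    ]


def image_feature_key(camera):
    """Feature name for a camera, following the observation.images.<cam> pattern."""
    return f"observation.images.{camera}"


rows = synthesize_frame_fields(num_frames=4, fps=30, episode_index=0)
print([row["timestamp"] for row in rows])
print(image_feature_key("cam0"))
```

With fps=30, consecutive frames are spaced 1/30 of a second apart regardless of when the source frames were actually captured, so pick an --fps that matches the recording rate.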

Convert from Python

from avala import Client
from avala.lerobot import export_dataset

export_dataset(
    Client(),
    "my-org",
    "my-dataset",
    repo_id="my-hf-user/my-dataset",
    output_dir="./lerobot-out",
    fps=30,
    task="pick up the cube",
)
The result is a standard LeRobot dataset:
from lerobot.datasets import LeRobotDataset

ds = LeRobotDataset("my-hf-user/my-dataset", root="./lerobot-out")
print(ds.num_episodes, ds.num_frames)
Because the output is standard LeRobot v3, it also works with StreamingLeRobotDataset (training directly from the Hub with no full download) once pushed. When you pass --push, the dataset card is tagged LeRobot and robotics (added by lerobot) plus avala, so it shows up in the LeRobot dataset viewer and filters.

Perception vs. policy datasets (read this)

Avala sequence datasets are annotation-centric: they reliably provide camera frames and calibration, but not robot proprioception. By default, this converter therefore produces a perception / vision-language dataset (cameras + timestamps + a task string) and prints a warning saying so. That is a valid LeRobot dataset, but it is not a policy-training dataset: it has no action or observation.state feature.
To produce robot observation.state / action, the source frames must actually carry that data, and you must point the converter at it:
# When the raw frame dicts embed numeric vectors (customer-specific schema):
avala lerobot export my-org/my-dataset --repo-id u/d --output ./out \
  --state-key observation.joint_positions \
  --action-key action

# Or use the capture-rig ego pose as observation.state (labelled as rig pose):
avala lerobot export my-org/my-dataset --repo-id u/d --output ./out --ego-pose-state
State and action are all-or-nothing: a configured key that is missing or non-numeric on any frame is an error, and the converter never fabricates zeros.
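The dotted-path lookup and the all-or-nothing rule can be sketched as follows. This is an illustration under stated assumptions, not the converter's implementation: it assumes raw frames are nested dicts and that the vector is a flat, non-empty list of numbers, and every function name here is hypothetical.

```python
from numbers import Real


def resolve_dotted(frame, dotted_key):
    """Follow a dotted path such as 'observation.joint_positions' through nested dicts."""
    node = frame
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"{dotted_key} is missing at '{part}'")
        node = node[part]
    return node


def extract_vector(frame, dotted_key):
    """Return the value at dotted_key as a list of floats, or raise if it is not numeric."""
    value = resolve_dotted(frame, dotted_key)
    # Assumption: a valid vector is a flat, non-empty list or tuple of numbers.
    if not isinstance(value, (list, tuple)) or not value:
        raise TypeError(f"{dotted_key} is not a non-empty numeric vector")
    for item in value:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(item, bool) or not isinstance(item, Real):
            raise TypeError(f"{dotted_key} contains a non-numeric value: {item!r}")
    return [float(item) for item in value]


def extract_all(frames, dotted_key):
    """All-or-nothing: one bad frame aborts the run; nothing is zero-filled."""
    vectors = []
    for index, frame in enumerate(frames):
        try:
            vectors.append(extract_vector(frame, dotted_key))
        except (KeyError, TypeError) as exc:
            raise ValueError(f"frame {index}: {exc}") from exc
    return vectors
```

Failing on the first bad frame, rather than padding with zeros, keeps a policy from silently training on fabricated state or action values.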

Next Steps