Convert an Avala sequence dataset directly into a Hugging Face LeRobot v3 dataset, the de facto standard format for open robot learning. The converter reads sequences and frames straight through the SDK (no exports.create call, no archive download) and writes a LeRobot dataset to disk, optionally pushing it to the Hub.
One Avala sequence becomes one LeRobot episode. Each camera maps to an observation.images.<cam> feature; timestamp, frame_index, and episode_index are generated by the converter, with timestamps computed from --fps.
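The mapping described above can be sketched in a few lines of plain Python. This is only an illustration, not the converter's actual code: the helper names camera_feature_keys and frame_rows are hypothetical, and the timestamp rule (frame_index / fps) is the one stated in the flag reference for --fps.

```python
from typing import Dict, List


def camera_feature_keys(camera_names: List[str]) -> Dict[str, str]:
    """Map each camera name to its LeRobot image feature key (hypothetical helper)."""
    return {cam: f"observation.images.{cam}" for cam in camera_names}


def frame_rows(num_frames: int, fps: int, episode_index: int) -> List[dict]:
    """Synthesize the per-frame bookkeeping fields for one episode (hypothetical helper)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return [
        {
            "episode_index": episode_index,  # one Avala sequence == one episode
            "frame_index": i,
            "timestamp": i / fps,  # seconds since the start of the episode
        }
        for i in range(num_frames)
    ]


keys = camera_feature_keys(["cam0", "cam1"])
rows = frame_rows(num_frames=3, fps=30, episode_index=0)
print(keys)
print(rows)
```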
Prerequisites
pip install "avala[lerobot]"
The lerobot library requires Python 3.12+, so this extra is unusable on Python 3.9–3.11. Video encoding (the default, when --no-video is not set) additionally pulls in av/torchcodec; pass --no-video to store frames as images and skip that stack.
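If a script may run on mixed interpreters, a small guard before importing the extra can give a clearer message than an install or import failure. This is a minimal sketch; the helper name lerobot_extra_supported is hypothetical, and the 3.12 threshold is taken from the requirement stated above.

```python
import sys


def lerobot_extra_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter meets the Python 3.12+ requirement."""
    return tuple(version_info[:2]) >= (3, 12)


if not lerobot_extra_supported():
    # Report the mismatch instead of failing later on import.
    print(
        "avala[lerobot] needs Python 3.12+; found "
        f"{sys.version_info.major}.{sys.version_info.minor}"
    )
```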
Convert from the CLI
avala lerobot export my-org/my-dataset \
  --repo-id my-hf-user/my-dataset \
  --output ./lerobot-out \
  --fps 30 \
  --task "pick up the cube"
This walks every sequence in my-org/my-dataset, writes a LeRobot v3 dataset to ./lerobot-out, and finalizes it (so the parquet footers are written and the dataset is readable).
| Flag | Purpose |
|---|---|
| --repo-id | Target dataset id, <hf_user>/<name> (required) |
| --output | Local output directory (required) |
| --fps | Frames per second; timestamps are synthesized as frame_index / fps (default 30) |
| --task | Instruction string attached to every frame |
| --camera | Restrict to specific camera(s) by positional name (cam0, cam1, …); repeatable. Default: all cameras on frame 0 |
| --state-key / --action-key | Dotted path into the raw frame for a numeric observation.state / action vector (see below) |
| --ego-pose-state | Use the 7-dim camera-rig ego pose as observation.state (rig pose, not proprioception) |
| --no-video | Store frames as PNG image features instead of encoded video |
| --limit | Convert at most N sequences |
| --push | Push the result to the Hugging Face Hub (requires HF auth) |
| --tag | Extra dataset-card tag(s) for the Hub; repeatable. avala and LeRobot are always added |
| --license | License for the pushed dataset card (default: lerobot’s apache-2.0) |
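When the CLI is driven from a script, it can help to assemble the argument list programmatically, especially for repeatable flags such as --camera. The following is a sketch only: build_export_command is a hypothetical helper, and it uses just a subset of the flags listed in the table above.

```python
import shlex
from typing import List, Optional, Sequence


def build_export_command(
    dataset: str,
    repo_id: str,
    output: str,
    fps: int = 30,
    task: Optional[str] = None,
    cameras: Sequence[str] = (),
    no_video: bool = False,
    limit: Optional[int] = None,
    push: bool = False,
) -> List[str]:
    """Build an argv list for `avala lerobot export` (hypothetical helper)."""
    cmd = [
        "avala", "lerobot", "export", dataset,
        "--repo-id", repo_id,
        "--output", output,
        "--fps", str(fps),
    ]
    if task is not None:
        cmd += ["--task", task]
    for cam in cameras:  # --camera is repeatable: one flag per camera
        cmd += ["--camera", cam]
    if no_video:
        cmd.append("--no-video")
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if push:
        cmd.append("--push")
    return cmd


cmd = build_export_command(
    "my-org/my-dataset",
    repo_id="my-hf-user/my-dataset",
    output="./lerobot-out",
    task="pick up the cube",
    cameras=["cam0", "cam1"],
    no_video=True,
    limit=5,
)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

Passing the list to subprocess.run(cmd, check=True) would execute it without going through a shell, which avoids quoting problems with task strings that contain spaces.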
Convert from Python
from avala import Client
from avala.lerobot import export_dataset

export_dataset(
    Client(),
    "my-org",
    "my-dataset",
    repo_id="my-hf-user/my-dataset",
    output_dir="./lerobot-out",
    fps=30,
    task="pick up the cube",
)
The result is a standard LeRobot dataset:
from lerobot.datasets import LeRobotDataset
ds = LeRobotDataset("my-hf-user/my-dataset", root="./lerobot-out")
print(ds.num_episodes, ds.num_frames)
Because the output is standard LeRobot v3, it also works with StreamingLeRobotDataset (train directly from the Hub with no full download) once pushed. When you pass --push, the dataset card is tagged LeRobot and robotics (added by lerobot) plus avala, so it shows up in the LeRobot dataset viewer and filters.
Perception vs. policy datasets (read this)
Avala sequence datasets are annotation-centric: they reliably provide camera frames and calibration, but not robot proprioception. By default this converter therefore produces a perception / vision-language dataset (cameras + timestamps + a task string) and prints a warning saying so. That is a valid LeRobot dataset, but it is not a policy-training dataset — it has no action/observation.state.
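A quick way to tell the two kinds of dataset apart is to check whether both policy features are present. The sketch below works on a plain list of feature names; missing_policy_features is a hypothetical helper, and obtaining the names from a loaded dataset (for example from a features mapping on the dataset object) is an assumption about the lerobot API rather than something this page confirms.

```python
from typing import Iterable, List

# The two features a policy-training dataset needs, per the paragraph above.
POLICY_FEATURES = ("observation.state", "action")


def missing_policy_features(feature_names: Iterable[str]) -> List[str]:
    """Return the policy features that are absent (empty list means policy-ready)."""
    present = set(feature_names)
    return [name for name in POLICY_FEATURES if name not in present]


# Feature names of a default (perception-only) conversion with one camera.
perception_only = [
    "observation.images.cam0",
    "timestamp",
    "frame_index",
    "episode_index",
]
missing = missing_policy_features(perception_only)
print(missing)  # → ['observation.state', 'action']
```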
To produce robot observation.state / action, the source frames must actually carry that data, and you point the converter at it:
# When the raw frame dicts embed numeric vectors (customer-specific schema):
avala lerobot export my-org/my-dataset --repo-id u/d --output ./out \
  --state-key observation.joint_positions \
  --action-key action

# Or use the capture-rig ego pose as observation.state (labelled as rig pose):
avala lerobot export my-org/my-dataset --repo-id u/d --output ./out --ego-pose-state
State/action are all-or-nothing: a configured key that is missing or non-numeric on any frame is an error — the converter never fabricates zeros.
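To make the dotted-path lookup and the all-or-nothing rule concrete, here is a minimal sketch of how such an extraction could behave. It is not the converter's implementation: resolve_dotted, extract_vector, and extract_all are hypothetical names, the frame layout is invented for illustration, and treating empty or scalar values as errors is an assumption.

```python
from collections.abc import Sequence
from numbers import Real
from typing import Any, List


def resolve_dotted(frame: dict, path: str) -> Any:
    """Walk a dotted path such as 'observation.joint_positions' through nested dicts."""
    node: Any = frame
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"{path!r} is missing (failed at {part!r})")
        node = node[part]
    return node


def extract_vector(frame: dict, path: str) -> List[float]:
    """Return the numeric vector at `path`, or raise; never substitute zeros."""
    value = resolve_dotted(frame, path)
    is_list_like = isinstance(value, Sequence) and not isinstance(value, (str, bytes))
    if not is_list_like or len(value) == 0:
        raise ValueError(f"{path!r} is not a non-empty numeric vector")
    for x in value:
        # bool is a Real subclass in Python, so reject it explicitly.
        if isinstance(x, bool) or not isinstance(x, Real):
            raise ValueError(f"{path!r} contains a non-numeric entry: {x!r}")
    return [float(x) for x in value]


def extract_all(frames: List[dict], path: str) -> List[List[float]]:
    """All-or-nothing: a single bad frame aborts the whole extraction."""
    return [extract_vector(frame, path) for frame in frames]


# Invented example frames; real frame schemas are customer-specific.
frames = [
    {"observation": {"joint_positions": [0.1, 0.2]}, "action": [1, 0]},
    {"observation": {"joint_positions": [0.3, 0.4]}, "action": [0, 1]},
]
states = extract_all(frames, "observation.joint_positions")
actions = extract_all(frames, "action")
print(states, actions)
```

Raising on the first bad frame, rather than padding with zeros, keeps a silently corrupted state or action column from reaching policy training.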
Next Steps