Create, list, and retrieve datasets and their sequences.
Create Dataset
POST /api/v1/datasets/
Creates a new dataset for annotation. You can optionally attach a cloud storage provider configuration to back the dataset with S3 or GCS data.
Request Body
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Display name for the dataset |
| slug | string | Yes | URL-friendly identifier |
| data_type | string | Yes | Type of data: image, video, lidar, mcap, splat, or image_3d |
| is_sequence | boolean | No | Whether the dataset contains sequences (default: false) |
| visibility | string | No | private or public (default: private) |
| create_metadata | boolean | No | Whether to create dataset metadata (default: true) |
| provider_config | object | No | Cloud storage provider configuration (see below) |
| owner_name | string | No | Dataset owner username or email |
Provider Config (S3)
| Field | Type | Description |
| --- | --- | --- |
| provider | string | aws_s3 |
| s3_bucket_name | string | S3 bucket name |
| s3_bucket_region | string | AWS region |
| s3_bucket_prefix | string | Key prefix for dataset files |
| s3_access_key_id | string | AWS access key ID |
| s3_secret_access_key | string | AWS secret access key |
| s3_is_accelerated | boolean | Enable S3 Transfer Acceleration |
Request
curl -X POST "https://api.avala.ai/api/v1/datasets/" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "LiDAR Captures Q1",
    "slug": "lidar-captures-q1",
    "data_type": "lidar",
    "is_sequence": true,
    "visibility": "private",
    "provider_config": {
      "provider": "aws_s3",
      "s3_bucket_name": "my-datasets",
      "s3_bucket_region": "us-east-1",
      "s3_bucket_prefix": "captures/q1/",
      "s3_access_key_id": "AKIA...",
      "s3_secret_access_key": "your-secret-key"
    }
  }'
Response
{
  "uid": "550e8400-e29b-41d4-a716-446655440000",
  "name": "LiDAR Captures Q1",
  "slug": "lidar-captures-q1",
  "data_type": "lidar",
  "is_sequence": true,
  "visibility": "private",
  "status": "creating",
  "item_count": 0,
  "project_count": 0,
  "owner_name": "johndoe",
  "size_bytes": 0,
  "annotations_count": 0
}
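The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper name `build_create_dataset_request` is hypothetical, and the base URL, header name, and field names are taken from the cURL example above. The function only builds the request object; sending it is left to the caller.

```python
import json
import urllib.request

# Base URL as it appears in the cURL example above.
API_BASE = "https://api.avala.ai/api/v1"

# Data types listed in the request body table.
DATA_TYPES = {"image", "video", "lidar", "mcap", "splat", "image_3d"}


def build_create_dataset_request(api_key, name, slug, data_type,
                                 is_sequence=False, visibility="private",
                                 provider_config=None):
    """Build (but do not send) the POST request that creates a dataset.

    Hypothetical helper for illustration only.
    """
    if data_type not in DATA_TYPES:
        raise ValueError(f"unsupported data_type: {data_type!r}")
    payload = {
        "name": name,
        "slug": slug,
        "data_type": data_type,
        "is_sequence": is_sequence,
        "visibility": visibility,
    }
    # provider_config is optional, so include it only when supplied.
    if provider_config is not None:
        payload["provider_config"] = provider_config
    return urllib.request.Request(
        f"{API_BASE}/datasets/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-Avala-Api-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the response.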
List Datasets
GET /api/v1/datasets/{owner_name}/list/
Returns all user-owned datasets belonging to a specific owner that are visible to the authenticated user.
This endpoint returns only datasets owned directly by the user. Organization-owned datasets are not included; use List Organization Datasets instead.
Parameters
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| owner_name | string | Yes | Username of the dataset owner (path parameter) |
| data_type | string | No | Filter by data type: image, video, lidar, mcap, splat, or image_3d (query parameter) |
| name | string | No | Filter by name (case-insensitive substring match) (query parameter) |
| status | string | No | Filter by dataset status (query parameter) |
| visibility | string | No | Filter by visibility: public or private (query parameter) |
| ordering | string | No | Field to order results by (query parameter) |
| page | integer | No | Page number for pagination (query parameter) |
| limit | integer | No | Number of results per page (query parameter) |
Request
curl "https://api.avala.ai/api/v1/datasets/johndoe/list/" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"
Response
{
  "count": 25,
  "next": "https://api.avala.ai/api/v1/datasets/johndoe/list/?page=2",
  "previous": null,
  "results": [
    {
      "uid": "550e8400-e29b-41d4-a716-446655440000",
      "name": "Training Images",
      "slug": "training-images",
      "data_type": "image",
      "is_sequence": false,
      "visibility": "private",
      "status": "created",
      "item_count": 1000,
      "project_count": 2,
      "owner_name": "johndoe",
      "size_bytes": 5368709120,
      "annotations_count": 4500
    }
  ]
}
Fields
| Field | Type | Description |
| --- | --- | --- |
| uid | string (UUID) | Unique identifier for the dataset |
| name | string | Display name of the dataset |
| slug | string | URL-friendly identifier |
| data_type | string | Type of data in the dataset (image, video, lidar, mcap, splat, or image_3d) |
| is_sequence | boolean | Whether the dataset contains sequences |
| visibility | string | public or private |
| status | string | Current dataset status (creating or created) |
| item_count | integer | Number of items in the dataset |
| project_count | integer | Number of projects associated with the dataset |
| owner_name | string | Username of the dataset owner |
| size_bytes | integer | Total size of the dataset in bytes |
| annotations_count | integer | Total number of annotations across all items |
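Because the list response is paginated, a client usually needs to follow the `next` link until it is null. The sketch below shows one way to do that in Python. It assumes `next` is always either a full URL or null, as in the sample response; the names `fetch_json` and `iter_results` are hypothetical and not part of any official SDK.

```python
import json
import urllib.request
from typing import Callable, Iterator


def fetch_json(url: str, api_key: str) -> dict:
    """GET a URL with the API key header and decode the JSON body."""
    req = urllib.request.Request(url, headers={"X-Avala-Api-Key": api_key})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_results(first_url: str, get_page: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every item in "results", following "next" until it is null.

    get_page is injected so the loop can be exercised without network access.
    """
    url = first_url
    while url:
        page = get_page(url)
        yield from page["results"]
        url = page.get("next")
```

A typical call would be `iter_results(first_url, lambda u: fetch_json(u, api_key))`, where `first_url` is the list endpoint for the owner. Injecting `get_page` keeps the pagination logic separate from the HTTP call.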
List Organization Datasets
GET /api/v1/organizations/{org_slug}/datasets/
Returns datasets owned by an organization. Only available to organization members and staff.
If your datasets belong to an organization, use this endpoint instead of the user-scoped List Datasets endpoint.
Parameters
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| org_slug | string | Yes | Slug identifier of the organization (path parameter) |
| data_type | string | No | Filter by data type: image, video, lidar, mcap, splat, or image_3d (query parameter) |
| name | string | No | Filter by name (case-insensitive substring match) (query parameter) |
| status | string | No | Filter by dataset status (query parameter) |
| visibility | string | No | Filter by visibility: public or private (query parameter) |
| search | string | No | Search by name or slug (query parameter) |
| ordering | string | No | Field to order results by (query parameter) |
| page | integer | No | Page number for pagination (query parameter) |
Request
curl "https://api.avala.ai/api/v1/organizations/my-org/datasets/?data_type=lidar" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"
Response
Same format as List Datasets.
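When several query filters are combined, it can help to build the URL programmatically so that values are encoded correctly. The following Python sketch assumes the query parameters listed in the table above; the function name `org_datasets_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

# Base URL as it appears in the cURL examples.
API_BASE = "https://api.avala.ai/api/v1"

# Query parameters documented for this endpoint.
ORG_DATASET_FILTERS = {
    "data_type", "name", "status", "visibility", "search", "ordering", "page",
}


def org_datasets_url(org_slug: str, **filters) -> str:
    """Build the List Organization Datasets URL with optional query filters."""
    unknown = set(filters) - ORG_DATASET_FILTERS
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    # Drop filters that were passed as None, then percent-encode the rest.
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = f"{API_BASE}/organizations/{quote(org_slug, safe='')}/datasets/"
    return f"{url}?{query}" if query else url
```

Rejecting unknown filter names early avoids silently sending a misspelled parameter that the server might ignore.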
List Sequences (Single Dataset)
GET /api/v1/datasets/{owner_name}/{dataset_slug}/sequences/
Returns sequences within a single dataset. Used for video and LiDAR datasets that contain frame sequences.
Parameters
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| owner_name | string | Yes | Username of the dataset owner (path parameter) |
| dataset_slug | string | Yes | Slug identifier of the dataset (path parameter) |
| ordering | string | No | Field to order results by (query parameter) |
| cursor | string | No | Cursor for pagination (query parameter) |
| limit | integer | No | Number of results per page (query parameter) |
Request
curl "https://api.avala.ai/api/v1/datasets/johndoe/lidar-captures/sequences/" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"
Response
{
  "next": null,
  "previous": null,
  "results": [
    {
      "uid": "660f9500-f39c-52e5-b827-557766550000",
      "key": "sequence_001",
      "status": "completed",
      "featured_image": "https://storage.avala.ai/sequences/seq_001/featured.jpg",
      "number_of_frames": 150,
      "views": [
        {
          "key": "camera_front",
          "load": "https://storage.avala.ai/sequences/seq_001/camera_front/",
          "metrics": null
        }
      ]
    }
  ]
}
Fields
| Field | Type | Description |
| --- | --- | --- |
| uid | string (UUID) | Unique identifier for the sequence |
| key | string | Sequence key name |
| status | string | Current workflow status of the sequence |
| featured_image | string | URL to the featured preview image |
| number_of_frames | integer | Total number of frames in the sequence |
| views | array | Array of view objects, each containing key, load, and metrics |
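To illustrate the nested shape of the response, here is a small Python sketch that collects the `load` URL of every view, grouped by sequence key. It assumes each view object carries the `key` and `load` fields shown in the sample response; the function name `view_urls` is hypothetical.

```python
def view_urls(page: dict) -> dict:
    """Map each sequence key to a {view key: load URL} dictionary.

    `page` is one decoded response from the sequences endpoint.
    A missing or null "views" field is treated as an empty list.
    """
    return {
        seq["key"]: {view["key"]: view["load"] for view in seq.get("views") or []}
        for seq in page["results"]
    }
```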
List Sequences (Cross-Dataset)
GET /api/v1/datasets/{owner_name}/sequences/
Returns sequences across all datasets belonging to an owner. Supports filtering by dataset slug(s) and status, making it ideal for bulk QC status checks without per-dataset API calls.
Parameters
| Name | Type | Required | Description |
| --- | --- | --- | --- |
| owner_name | string | Yes | Username of the dataset owner (path parameter) |
| status | string | No | Filter by sequence status (query parameter) |
| status__in | string | No | Comma-separated list of statuses to filter by (query parameter) |
| dataset__slug | string | No | Filter sequences by a single dataset slug (query parameter) |
| dataset__slug__in | string | No | Comma-separated list of dataset slugs to filter by (query parameter) |
| page | integer | No | Page number for pagination (query parameter) |
Request
# Get sequences for specific batches
curl "https://api.avala.ai/api/v1/datasets/johndoe/sequences/?dataset__slug__in=batch-001,batch-002,batch-003" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"
# Filter by status
curl "https://api.avala.ai/api/v1/datasets/johndoe/sequences/?status__in=customer_review,rework_requested" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"
# Combine filters
curl "https://api.avala.ai/api/v1/datasets/johndoe/sequences/?dataset__slug__in=batch-001,batch-002&status=customer_approved" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"
Use this endpoint instead of making separate calls to /datasets/{owner}/{slug}/sequences/ for each dataset. This avoids rate limiting when checking QC status across many batches.
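The comma-separated filters can be assembled from Python lists, as in the sketch below. It assumes the `dataset__slug__in` and `status__in` parameters behave as documented above and that literal commas are acceptable in the query string, as in the cURL examples. The function name `cross_dataset_sequences_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

# Base URL as it appears in the cURL examples.
API_BASE = "https://api.avala.ai/api/v1"


def cross_dataset_sequences_url(owner_name, dataset_slugs=(), statuses=(), page=None):
    """Build the cross-dataset sequences URL from lists of slugs and statuses."""
    params = {}
    if dataset_slugs:
        params["dataset__slug__in"] = ",".join(dataset_slugs)
    if statuses:
        params["status__in"] = ",".join(statuses)
    if page is not None:
        params["page"] = page
    url = f"{API_BASE}/datasets/{quote(owner_name, safe='')}/sequences/"
    # safe="," keeps commas literal, matching the cURL examples.
    query = urlencode(params, safe=",")
    return f"{url}?{query}" if query else url
```

One call with every batch slug replaces a loop of per-dataset requests, which is the point of this endpoint.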
Data Types
| Type | Description |
| --- | --- |
| image | Single images (JPEG, PNG, WebP, BMP) |
| video | Video files converted to frame sequences |
| lidar | Point cloud data (PCD, PLY) |
| mcap | MCAP files with sensor data |
| splat | Gaussian Splat 3D scene reconstructions |
| image_3d | 3D image data |
Dataset Status
| Status | Description |
| --- | --- |
| creating | Dataset is being created and is not yet ready |
| created | Dataset has been created and is ready for use |
Sequence Status Values
Sequences progress through various workflow statuses during the annotation lifecycle.
| Status | Description |
| --- | --- |
| unattempted | Not yet started |
| pending | Awaiting processing |
| completed | Fully annotated and reviewed |
| rework_required | Needs corrections |
| ready_for_annotation | Ready to be annotated |
| labeling_4d | 3D/4D annotation in progress |
| review_4d | 3D/4D annotation review in progress |
| ready_for_2d | Ready for 2D annotation |
| labeling_2d | 2D annotation in progress |
| review_2d | 2D annotation review in progress |
| final_review | Final quality control review |
| customer_approved | Approved by the customer |
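For a quick QC overview, sequence objects returned by either sequences endpoint can be tallied by status. The Python sketch below assumes each sequence carries the `key` and `status` fields described earlier; the helper names are hypothetical.

```python
from collections import Counter


def status_counts(sequences) -> Counter:
    """Count sequences per workflow status."""
    return Counter(seq["status"] for seq in sequences)


def not_yet_approved(sequences) -> list:
    """Return the keys of sequences whose status is not customer_approved."""
    return [seq["key"] for seq in sequences if seq["status"] != "customer_approved"]
```

Both helpers take any iterable of decoded sequence objects, so they work on a single page of results or on the combined results of several pages.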
Error Responses
Not Found (404)
{
  "detail": "Not found."
}
Returned when the specified owner or dataset does not exist.
Permission Denied (403)
{
  "detail": "You do not have permission to perform this action."
}
Returned when the authenticated user does not have access to the requested dataset.
Unauthorized (401)
{
  "detail": "Invalid API key."
}
Returned when the X-Avala-Api-Key header is missing or contains an invalid key.
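A client can surface these errors by reading the `detail` field from the response body. The sketch below uses only the Python standard library. The exception class `AvalaAPIError` and the helpers `parse_detail` and `get_json` are hypothetical names introduced for illustration, and the hint strings paraphrase the descriptions above.

```python
import json
import urllib.error
import urllib.request

# Hints paraphrase the error descriptions documented above.
HINTS = {
    401: "missing or invalid X-Avala-Api-Key header",
    403: "no access to the requested dataset",
    404: "owner or dataset does not exist",
}


class AvalaAPIError(Exception):
    """Hypothetical exception type; not part of any official SDK."""

    def __init__(self, status: int, detail: str):
        self.status = status
        self.detail = detail
        hint = HINTS.get(status, "unexpected error")
        super().__init__(f"HTTP {status}: {detail} ({hint})")


def parse_detail(body: bytes) -> str:
    """Extract "detail" from a JSON error body, or "" if it is absent."""
    try:
        return str(json.loads(body)["detail"])
    except (ValueError, KeyError, TypeError):
        return ""


def get_json(url: str, api_key: str) -> dict:
    """GET a URL and raise AvalaAPIError on an HTTP error status."""
    req = urllib.request.Request(url, headers={"X-Avala-Api-Key": api_key})
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        raise AvalaAPIError(err.code, parse_detail(err.read())) from err
```

Keeping the status code on the exception lets callers branch on it, for example retrying after fixing credentials on 401 but skipping the dataset on 404.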