Create, list, and retrieve datasets and their sequences.

Create Dataset

POST /api/v1/datasets/
Creates a new dataset for annotation. You can optionally attach a cloud storage provider configuration to back the dataset with S3 or GCS data.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Display name for the dataset |
| slug | string | Yes | URL-friendly identifier |
| data_type | string | Yes | Type of data: image, video, lidar, mcap, splat, or image_3d |
| is_sequence | boolean | No | Whether the dataset contains sequences (default: false) |
| visibility | string | No | private or public (default: private) |
| create_metadata | boolean | No | Whether to create dataset metadata (default: true) |
| provider_config | object | No | Cloud storage provider configuration (see below) |
| owner_name | string | No | Dataset owner username or email |

Provider Config (S3)

| Field | Type | Description |
| --- | --- | --- |
| provider | string | aws_s3 |
| s3_bucket_name | string | S3 bucket name |
| s3_bucket_region | string | AWS region |
| s3_bucket_prefix | string | Key prefix for dataset files |
| s3_access_key_id | string | AWS access key ID |
| s3_secret_access_key | string | AWS secret access key |
| s3_is_accelerated | boolean | Enable S3 Transfer Acceleration |

Request

curl -X POST "https://api.avala.ai/api/v1/datasets/" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "LiDAR Captures Q1",
    "slug": "lidar-captures-q1",
    "data_type": "lidar",
    "is_sequence": true,
    "visibility": "private",
    "provider_config": {
      "provider": "aws_s3",
      "s3_bucket_name": "my-datasets",
      "s3_bucket_region": "us-east-1",
      "s3_bucket_prefix": "captures/q1/",
      "s3_access_key_id": "AKIA...",
      "s3_secret_access_key": "your-secret-key"
    }
  }'

Response

{
  "uid": "550e8400-e29b-41d4-a716-446655440000",
  "name": "LiDAR Captures Q1",
  "slug": "lidar-captures-q1",
  "data_type": "lidar",
  "is_sequence": true,
  "visibility": "private",
  "status": "creating",
  "item_count": 0,
  "project_count": 0,
  "owner_name": "johndoe",
  "size_bytes": 0,
  "annotations_count": 0
}
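
As a rough Python counterpart to the curl request above, the sketch below builds the same POST request with the standard library only. The helper name `build_create_dataset_request` and the placeholder key are illustrative assumptions, not part of any official client. The request is constructed but not sent.

```python
import json
import urllib.request

API_BASE = "https://api.avala.ai/api/v1"  # base URL taken from the curl example


def build_create_dataset_request(api_key, name, slug, data_type,
                                 is_sequence=False, visibility="private",
                                 provider_config=None):
    """Build (but do not send) a POST /datasets/ request.

    Hypothetical helper: field names mirror the Request Body table.
    """
    payload = {
        "name": name,
        "slug": slug,
        "data_type": data_type,
        "is_sequence": is_sequence,
        "visibility": visibility,
    }
    # provider_config is optional, so include it only when supplied.
    if provider_config is not None:
        payload["provider_config"] = provider_config
    return urllib.request.Request(
        f"{API_BASE}/datasets/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-Avala-Api-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_dataset_request(
    api_key="example-key",  # placeholder; load the real key from the environment
    name="LiDAR Captures Q1",
    slug="lidar-captures-q1",
    data_type="lidar",
    is_sequence=True,
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect before any credentials leave the machine.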

List Datasets

GET /api/v1/datasets/{owner_name}/list/
Returns the datasets owned directly by the specified user that are visible to the authenticated user.
Organization-owned datasets are not included; use List Organization Datasets for those.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| owner_name | string | Yes | Username of the dataset owner (path parameter) |
| data_type | string | No | Filter by data type: image, video, lidar, mcap, splat, or image_3d (query parameter) |
| name | string | No | Filter by name (case-insensitive substring match) (query parameter) |
| status | string | No | Filter by dataset status (query parameter) |
| visibility | string | No | Filter by visibility: public or private (query parameter) |
| ordering | string | No | Field to order results by (query parameter) |
| page | integer | No | Page number for pagination (query parameter) |
| limit | integer | No | Number of results per page (query parameter) |

Request

curl "https://api.avala.ai/api/v1/datasets/johndoe/list/" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"

Response

{
  "count": 25,
  "next": "https://api.avala.ai/api/v1/datasets/johndoe/list/?page=2",
  "previous": null,
  "results": [
    {
      "uid": "550e8400-e29b-41d4-a716-446655440000",
      "name": "Training Images",
      "slug": "training-images",
      "data_type": "image",
      "is_sequence": false,
      "visibility": "private",
      "status": "created",
      "item_count": 1000,
      "project_count": 2,
      "owner_name": "johndoe",
      "size_bytes": 5368709120,
      "annotations_count": 4500
    }
  ]
}

Fields

| Field | Type | Description |
| --- | --- | --- |
| uid | string (UUID) | Unique identifier for the dataset |
| name | string | Display name of the dataset |
| slug | string | URL-friendly identifier |
| data_type | string | Type of data in the dataset (image, video, lidar, mcap, splat, or image_3d) |
| is_sequence | boolean | Whether the dataset contains sequences |
| visibility | string | public or private |
| status | string | Current dataset status (creating or created) |
| item_count | integer | Number of items in the dataset |
| project_count | integer | Number of projects associated with the dataset |
| owner_name | string | Username of the dataset owner |
| size_bytes | integer | Total size of the dataset in bytes |
| annotations_count | integer | Total number of annotations across all items |
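
The response is paginated, with `next` holding the URL of the following page or null on the last one. A minimal sketch of walking every page is shown below. The function names `fetch_json` and `iter_datasets` are hypothetical, and the demonstration runs against canned pages rather than the live API.

```python
import json
import urllib.request


def fetch_json(url, api_key):
    """GET a URL with the API key header and decode the JSON body."""
    req = urllib.request.Request(url, headers={"X-Avala-Api-Key": api_key})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_datasets(first_url, fetch):
    """Yield every dataset by following the `next` link until it is null.

    `fetch` is any callable mapping a URL to a decoded page, which keeps
    the loop testable without network access.
    """
    url = first_url
    while url:
        page = fetch(url)
        yield from page["results"]
        url = page["next"]


# Offline demonstration with two canned pages shaped like the response above.
pages = {
    "p1": {"count": 3, "next": "p2", "previous": None,
           "results": [{"slug": "a"}, {"slug": "b"}]},
    "p2": {"count": 3, "next": None, "previous": "p1",
           "results": [{"slug": "c"}]},
}
slugs = [d["slug"] for d in iter_datasets("p1", pages.__getitem__)]
print(slugs)  # ['a', 'b', 'c']
```

Against the real endpoint, the second argument would be something like `lambda url: fetch_json(url, api_key)`.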

List Organization Datasets

GET /api/v1/organizations/{org_slug}/datasets/
Returns datasets owned by an organization. Only available to organization members and staff.
If your datasets belong to an organization, use this endpoint instead of the user-scoped List Datasets endpoint.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| org_slug | string | Yes | Slug identifier of the organization (path parameter) |
| data_type | string | No | Filter by data type: image, video, lidar, mcap, splat, or image_3d (query parameter) |
| name | string | No | Filter by name (case-insensitive substring match) (query parameter) |
| status | string | No | Filter by dataset status (query parameter) |
| visibility | string | No | Filter by visibility: public or private (query parameter) |
| search | string | No | Search by name or slug (query parameter) |
| ordering | string | No | Field to order results by (query parameter) |
| page | integer | No | Page number for pagination (query parameter) |

Request

curl "https://api.avala.ai/api/v1/organizations/my-org/datasets/?data_type=lidar" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"

Response

Same format as List Datasets.

List Sequences (Single Dataset)

GET /api/v1/datasets/{owner_name}/{dataset_slug}/sequences/
Returns sequences within a single dataset. Used for video and LiDAR datasets that contain frame sequences.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| owner_name | string | Yes | Username of the dataset owner (path parameter) |
| dataset_slug | string | Yes | Slug identifier of the dataset (path parameter) |
| ordering | string | No | Field to order results by (query parameter) |
| cursor | string | No | Cursor for pagination (query parameter) |
| limit | integer | No | Number of results per page (query parameter) |

Request

curl "https://api.avala.ai/api/v1/datasets/johndoe/lidar-captures/sequences/" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"

Response

{
  "next": null,
  "previous": null,
  "results": [
    {
      "uid": "660f9500-f39c-52e5-b827-557766550000",
      "key": "sequence_001",
      "status": "completed",
      "featured_image": "https://storage.avala.ai/sequences/seq_001/featured.jpg",
      "number_of_frames": 150,
      "views": [
        {
          "key": "camera_front",
          "load": "https://storage.avala.ai/sequences/seq_001/camera_front/",
          "metrics": null
        }
      ]
    }
  ]
}

Fields

| Field | Type | Description |
| --- | --- | --- |
| uid | string (UUID) | Unique identifier for the sequence |
| key | string | Sequence key name |
| status | string | Current workflow status of the sequence |
| featured_image | string | URL to the featured preview image |
| number_of_frames | integer | Total number of frames in the sequence |
| views | array | Array of view objects, each containing key, load, and metrics |
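
To make the shape of a sequence entry concrete, the sketch below maps one item of `results` onto typed Python objects. The class names `View` and `Sequence` are illustrative. The type of `metrics` is left open because the sample response only shows it as null.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class View:
    key: str
    load: str
    metrics: Optional[Any]  # null in the sample response; shape not documented


@dataclass
class Sequence:
    uid: str
    key: str
    status: str
    featured_image: str
    number_of_frames: int
    views: List[View]

    @classmethod
    def from_dict(cls, raw: dict) -> "Sequence":
        """Convert one entry of `results` into a typed object.

        Only the documented fields are read, so extra keys are ignored.
        """
        return cls(
            uid=raw["uid"],
            key=raw["key"],
            status=raw["status"],
            featured_image=raw["featured_image"],
            number_of_frames=raw["number_of_frames"],
            views=[View(v["key"], v["load"], v.get("metrics"))
                   for v in raw["views"]],
        )


sample = {
    "uid": "660f9500-f39c-52e5-b827-557766550000",
    "key": "sequence_001",
    "status": "completed",
    "featured_image": "https://storage.avala.ai/sequences/seq_001/featured.jpg",
    "number_of_frames": 150,
    "views": [{"key": "camera_front",
               "load": "https://storage.avala.ai/sequences/seq_001/camera_front/",
               "metrics": None}],
}
seq = Sequence.from_dict(sample)
print(seq.key, seq.number_of_frames, [v.key for v in seq.views])
```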

List Sequences (Cross-Dataset)

GET /api/v1/datasets/{owner_name}/sequences/
Returns sequences across all datasets belonging to an owner. Supports filtering by dataset slug(s) and status, making it ideal for bulk QC status checks without per-dataset API calls.

Parameters

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| owner_name | string | Yes | Username of the dataset owner (path parameter) |
| status | string | No | Filter by sequence status (query parameter) |
| status__in | string | No | Comma-separated list of statuses to filter by (query parameter) |
| dataset__slug | string | No | Filter sequences by a single dataset slug (query parameter) |
| dataset__slug__in | string | No | Comma-separated list of dataset slugs to filter by (query parameter) |
| page | integer | No | Page number for pagination (query parameter) |

Request

# Get sequences for specific batches
curl "https://api.avala.ai/api/v1/datasets/johndoe/sequences/?dataset__slug__in=batch-001,batch-002,batch-003" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"

# Filter by status
curl "https://api.avala.ai/api/v1/datasets/johndoe/sequences/?status__in=final_review,rework_required" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"

# Combine filters
curl "https://api.avala.ai/api/v1/datasets/johndoe/sequences/?dataset__slug__in=batch-001,batch-002&status=customer_approved" \
  -H "X-Avala-Api-Key: $AVALA_API_KEY"

Use this endpoint instead of making separate calls to /datasets/{owner}/{slug}/sequences/ for each dataset. This avoids rate limiting when checking QC status across many batches.
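
The bulk QC check described here can be sketched in Python as two small steps: build the filtered URL, then tally the returned sequences by status. The helper names are hypothetical. The response shape of this endpoint is not shown in this section, so the sketch assumes each result carries a `status` field as in the single-dataset response.

```python
from collections import Counter
from urllib.parse import urlencode

API_BASE = "https://api.avala.ai/api/v1"


def cross_dataset_sequences_url(owner_name, dataset_slugs=(), statuses=()):
    """Build the cross-dataset URL with comma-separated `__in` filters."""
    params = {}
    if dataset_slugs:
        params["dataset__slug__in"] = ",".join(dataset_slugs)
    if statuses:
        params["status__in"] = ",".join(statuses)
    # safe="," keeps the commas literal, matching the curl examples.
    query = urlencode(params, safe=",")
    base = f"{API_BASE}/datasets/{owner_name}/sequences/"
    return f"{base}?{query}" if query else base


def count_by_status(sequences):
    """Tally sequences (dicts with a `status` key) per status."""
    return Counter(seq["status"] for seq in sequences)


url = cross_dataset_sequences_url("johndoe", ["batch-001", "batch-002"])
print(url)

# Canned results standing in for one page of the response (assumed shape).
results = [{"status": "completed"}, {"status": "final_review"},
           {"status": "completed"}]
print(count_by_status(results))
```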

Data Types

| Type | Description |
| --- | --- |
| image | Single images (JPEG, PNG, WebP, BMP) |
| video | Video files converted to frame sequences |
| lidar | Point cloud data (PCD, PLY) |
| mcap | MCAP files with sensor data |
| splat | Gaussian Splat 3D scene reconstructions |
| image_3d | 3D image data |

Dataset Status

| Status | Description |
| --- | --- |
| creating | Dataset is being created and is not yet ready |
| created | Dataset has been created and is ready for use |
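
Because a new dataset starts in `creating`, a client may want to wait for `created` before uploading or annotating. The sketch below shows one possible polling loop. The `get_status` callable is a hypothetical caller-supplied lookup (for example, one that finds the dataset through List Datasets and returns its `status`). The timeout and interval defaults are arbitrary choices, not documented limits.

```python
import time


def wait_until_created(get_status, timeout=300.0, interval=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll `get_status()` until it returns "created" or the timeout expires.

    Returns True once the dataset is ready, False if the deadline passes.
    """
    deadline = clock() + timeout
    while True:
        if get_status() == "created":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Offline demonstration: the status flips to "created" on the third check.
statuses = iter(["creating", "creating", "created"])
ready = wait_until_created(lambda: next(statuses), sleep=lambda _: None)
print(ready)  # True
```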

Sequence Status Values

Sequences progress through various workflow statuses during the annotation lifecycle.

| Status | Description |
| --- | --- |
| unattempted | Not yet started |
| pending | Awaiting processing |
| completed | Fully annotated and reviewed |
| rework_required | Needs corrections |
| ready_for_annotation | Ready to be annotated |
| labeling_4d | 3D/4D annotation in progress |
| review_4d | 3D/4D annotation review in progress |
| ready_for_2d | Ready for 2D annotation |
| labeling_2d | 2D annotation in progress |
| review_2d | 2D annotation review in progress |
| final_review | Final quality control review |
| customer_approved | Approved by the customer |
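
One way to avoid typos in `status` and `status__in` filters is to validate values on the client before sending them. The sketch below assumes the table above is the complete list of statuses. The names `SequenceStatus` and `status_in_param` are illustrative.

```python
from enum import Enum


class SequenceStatus(str, Enum):
    """Sequence statuses as listed in the table (assumed complete)."""
    UNATTEMPTED = "unattempted"
    PENDING = "pending"
    COMPLETED = "completed"
    REWORK_REQUIRED = "rework_required"
    READY_FOR_ANNOTATION = "ready_for_annotation"
    LABELING_4D = "labeling_4d"
    REVIEW_4D = "review_4d"
    READY_FOR_2D = "ready_for_2d"
    LABELING_2D = "labeling_2d"
    REVIEW_2D = "review_2d"
    FINAL_REVIEW = "final_review"
    CUSTOMER_APPROVED = "customer_approved"


def status_in_param(statuses):
    """Validate status strings and join them for a `status__in` filter.

    Raises ValueError for any value that is not a known status.
    """
    return ",".join(SequenceStatus(s).value for s in statuses)


print(status_in_param(["final_review", "customer_approved"]))
```

An unknown value such as `"done"` raises `ValueError` locally rather than being sent to the API as a filter.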

Error Responses

Not Found (404)

{
  "detail": "Not found."
}

Returned when the specified owner or dataset does not exist.

Permission Denied (403)

{
  "detail": "You do not have permission to perform this action."
}

Returned when the authenticated user does not have access to the requested dataset.

Unauthorized (401)

{
  "detail": "Invalid API key."
}

Returned when the X-Avala-Api-Key header is missing or contains an invalid key.
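
A client can turn these three responses into typed exceptions so callers can handle each case separately. The following is a minimal sketch. The exception classes and `raise_for_error` are hypothetical names, and any other 4xx or 5xx status falls back to the generic base class.

```python
class AvalaAPIError(Exception):
    """Base class for the hypothetical client-side error types below."""

    def __init__(self, status_code, detail):
        super().__init__(f"{status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail


class Unauthorized(AvalaAPIError):
    """401: missing or invalid API key."""


class PermissionDenied(AvalaAPIError):
    """403: authenticated, but no access to the dataset."""


class NotFound(AvalaAPIError):
    """404: owner or dataset does not exist."""


_ERRORS = {401: Unauthorized, 403: PermissionDenied, 404: NotFound}


def raise_for_error(status_code, body):
    """Raise a typed exception for 4xx/5xx responses; do nothing otherwise."""
    if status_code < 400:
        return
    # Error bodies carry a `detail` message, as in the examples above.
    detail = body.get("detail", "") if isinstance(body, dict) else ""
    raise _ERRORS.get(status_code, AvalaAPIError)(status_code, detail)


try:
    raise_for_error(404, {"detail": "Not found."})
except NotFound as exc:
    print(exc)  # 404: Not found.
```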