The Ground-Truth Loop
Ingest, visualize, annotate, curate, train, deploy, then route every failure back to the data that caused it. The loop is the product.
Physical AI improves the way a self-driving stack improved: not by a cleverer algorithm alone, but by a flywheel. Every mile driven, every grasp attempted, every failure observed becomes training signal for the next model, which collects better data, which trains a better model. The team with the fastest loop wins.
Most data tooling covers only one arc of that loop. A viewer shows you the data. A storage layer holds it. A training framework consumes it. Between those pieces sits the work that actually determines model quality, producing trustworthy labels, and it's left to you and a pile of glue scripts.
Avala is built around the entire loop, including the stage other tools leave out: the one in the middle.
The six stages
1. Ingest. Raw fleet data lands — MCAP, ROS 1/2 bags, video, images, LiDAR. Avala converts and indexes it so it's queryable, not just stored.
2. Visualize. The recording becomes a synchronized 4D scene you can scrub in the browser: GPU-accelerated point clouds, multi-camera playback, Gaussian-splat reconstructions — every modality on one timeline.
3. Annotate. Models trained on your data auto-label the bulk of it; human experts verify the hard cases through consensus. The output is deterministic 4D ground truth, not best-effort guesses. This is the stage that makes the loop a loop.
4. Curate. Datasets are versioned, queryable, and traceable. Slice by scenario, by edge case, by confidence. Every label carries lineage back to who made it, why, and under which policy.
5. Train. Verified datasets flow into your training pipeline. You train on ground truth, not raw sensor dumps.
6. Deploy & close the loop. The model ships. The fleet runs. Anomalies and failures are captured and routed straight back to stage 3 — auto-labeled, verified, and folded into the next training run.
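To make the control flow of the six stages concrete, here is a minimal sketch in Python. It is an illustration only, not Avala's API: the `Stage` enum, `next_stage`, and `trace` are hypothetical names introduced here. The one behavior it encodes comes from the list above: stages run in order, and a failure captured after deployment re-enters the loop at stage 3 (Annotate).

```python
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    # Hypothetical names mirroring the six stages listed above.
    INGEST = 1
    VISUALIZE = 2
    ANNOTATE = 3
    CURATE = 4
    TRAIN = 5
    DEPLOY = 6


def next_stage(stage: Stage, failure_observed: bool = False) -> Optional[Stage]:
    """Return the stage that follows `stage`.

    After DEPLOY, a captured failure is routed back to ANNOTATE.
    With no failure there is nothing to route, so None is returned.
    """
    if stage is Stage.DEPLOY:
        return Stage.ANNOTATE if failure_observed else None
    return Stage(stage.value + 1)


def trace(start: Stage, failures: int) -> List[str]:
    """Walk the loop from `start`, re-entering at ANNOTATE once per failure."""
    path = [start.name]
    stage = start
    remaining = failures
    while True:
        nxt = next_stage(stage, failure_observed=remaining > 0)
        if nxt is None:
            break
        if stage is Stage.DEPLOY:
            remaining -= 1  # one failure consumed per trip back to ANNOTATE
        path.append(nxt.name)
        stage = nxt
    return path


if __name__ == "__main__":
    print(" -> ".join(trace(Stage.INGEST, failures=1)))
```

Note that the re-entry point is Annotate, not Ingest: in this sketch a field failure skips straight to labeling, which is the shortcut the list describes.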
Why the middle stage is the moat
Strip annotation and verification out of the loop and you have a pipeline that can move data quickly but can't improve a model on its own — because the labels still have to come from somewhere. That "somewhere" is usually six to twelve months of building an internal labeling operation, or a generalist vendor whose 2D boxes fall apart in a 3D embodied context.
Avala collapses that into the loop itself. A visual anomaly caught in the field becomes a labeling task automatically, gets resolved by an agentic auto-labeler, is verified by domain experts through consensus, and lands back in your training set within hours, with the provenance to prove it. Looking at your data and trusting your data are different problems. The loop is built around the second one.
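One way to picture "verified through consensus, with provenance" is a simple majority-vote check. The sketch below is an assumption-laden illustration, not how Avala implements verification: `Vote`, `VerifiedLabel`, `resolve_by_consensus`, and the two-thirds agreement threshold are all hypothetical. It shows a label being accepted only when enough reviewers agree, and carrying a record of who proposed and who confirmed it.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Vote:
    reviewer: str  # hypothetical reviewer identifier
    label: str     # the label this reviewer chose


@dataclass
class VerifiedLabel:
    label: str
    agreement: float  # fraction of reviewers who chose the winning label
    provenance: List[str] = field(default_factory=list)


def resolve_by_consensus(
    proposed: str,
    votes: List[Vote],
    min_agreement: float = 2 / 3,  # assumed threshold, for illustration only
) -> Optional[VerifiedLabel]:
    """Accept the majority label if agreement meets the threshold.

    Returns None when there are no votes or agreement is too low,
    signalling that the task needs further review.
    """
    if not votes:
        return None
    counts = Counter(v.label for v in votes)
    winner, n = counts.most_common(1)[0]
    agreement = n / len(votes)
    if agreement < min_agreement:
        return None
    # Record who proposed and who voted, so the label is traceable.
    provenance = [f"auto-labeler proposed {proposed!r}"]
    provenance += [f"{v.reviewer} voted {v.label!r}" for v in votes]
    return VerifiedLabel(label=winner, agreement=agreement, provenance=provenance)


if __name__ == "__main__":
    votes = [Vote("r1", "pallet"), Vote("r2", "pallet"), Vote("r3", "box")]
    print(resolve_by_consensus("pallet", votes))
```

A real system would likely weight reviewers, handle geometric labels rather than class strings, and record the policy in force; the point here is only that the verified label and its lineage are produced together.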
The mental model for the rest of the course
Hold onto this shape:
raw recording → 4D scene → verified ground truth → curated dataset → trained model → deployed fleet → failures → (back to verified ground truth)
Each remaining lesson takes one arrow and shows you exactly how it works, in the UI and in code. Next, we start at the beginning: getting a real recording in and seeing it in 4D.