
XDOF builds data pipelines, collection tools, and annotation systems for robotics foundation models.
These products form infrastructure that lets frontier AI labs and robotics companies generate, clean, and label the physical interaction data needed to train capable robots.
The company operates across a three-tier data pyramid. The top, most valuable tier is teleoperation data collected on deployed robots. The middle tier is general teleoperated collection through GELLO-style frameworks. The third tier is egocentric data captured from humans performing everyday tasks.
The robotics industry is racing toward general-purpose physical intelligence, but a critical data bottleneck remains. Generative models could draw on vast amounts of public text, whereas robots need high-fidelity physical interaction data that cannot be scraped from the internet. That gap creates demand for specialized data infrastructure.
XDOF is betting that data collection, cleaning, and annotation will become a foundational layer of the physical AI stack. Frontier AI labs and robotics companies appear willing to outsource this capital-intensive, operationally complex work rather than build warehouses of robots and train armies of teleoperators themselves.
XDOF differentiates through its full-stack expertise spanning hardware, operations, and policy training, rooted in Berkeley AI research. The founding team co-authored GELLO, a widely adopted low-cost teleoperation system, giving the company deep credibility in data collection tooling.
The company also pairs data provision with cleaning, tooling, and annotation, aiming to create a self-reinforcing feedback loop for teams that train robots. Its open-source ABC-130K dataset, released with UC Berkeley, contains more than 130,000 episodes across 195 bimanual manipulation tasks, demonstrating operational scale and research credibility.