A Review of Robbyant’s Early-2026 Work

Mon, 16 Mar 2026 20:21:33 +0000

Robbyant is a company under Ant Group, dedicated to building the foundational platform for Embodied AI, bridging the gap between digital intelligence and the physical world.

Since the company is still relatively new, I want to quickly review its recent work. In particular, I will study four embodied intelligence models: a spatial perception model, a VLA (vision-language-action) model, a world model, and a video action model.

The diagram on Robbyant's homepage reflects the company's vision for embodied intelligence: starting from sensory input, the system first builds spatial intelligence to understand the physical world, then relies on an action model to make decisions and interact with the environment, and finally improves through environmental reward.

