Repo Reading Notes for OpenPI
After reading the paper *π0: A Vision-Language-Action Flow Model for General Robot Control*, I decided to spend a few days walking through the official implementation, openpi, to understand how everything works in practice. There are several questions I wanted to answer. On the big side: how does this repo turn VLM features into robot actions, and how are training and inference actually wired together? On the smaller side: how is the two-expert MoE implemented, and how do observations influence the final action output? ...