Moto: Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos