Figure: Multi-View Video Diffusion Policy: A 3D Spatio-Temporal-Aware Video Action Model