ST-VLA: Enabling 4D-Aware Spatiotemporal Understanding for General Robot Manipulation