
ST-VLA: Enabling 4D-Aware Spatiotemporal Understanding for General Robot Manipulation

Authors: You Wu, Zixuan Chen, Cunxu Ou, Wenxuan Wang, Wenbo Huang, Lin Cao, Yangtao Chen, Weichao Qiu, Xingyue Quan, Jieqi Shi, Jing Huo, Yang Gao · Affiliations: School of Computer Science, Nanjing University; School of Intelligence Science and Technology, Nanjing University · Venue: arXiv · Date: 2026-03-14 · Source: Low-Level Learning-Based Action Modelling / Input Modelling / 3D Vision Language Action Models

Tags: 3D Representation · Vision-Language-Action · Robot Learning · Manipulation
