UP-VLA: A Unified Understanding and Prediction Model for Embodied Agent