
InternVLA-M1/ST4VLA: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy

Authors: InternVLA-M : A Spatially Guided Vision-Language-Action · Affiliation: Collaboration et al. (2023), © Shanghai Artificial Intelligence Laboratory. All rights reserved · Venue: ICLR 2026 · Date: 2025-10-15 · Source: Low-Level Learning-Based Action Modelling / Input Modelling / 2D LLM-based Vision Language Action Models

Vision-Language-Action · Foundation Models · Language-Conditioned Robot Learning · Manipulation
