Awesome Robotics Manipulation · full_paper

Clutter-Resistant Vision-Language-Action Models through Object-Centric and Geometry Grounding

作者：Taisei Hanyu, Yuki Ikebe, Trong Thang Pham, Nhat Chung, Minh Nhat Vu, Duy Nguyen Ho Minh Anh Nguyen, Anthony Gunderman, Chase Rainwater, Ngan Le · 单位：Max Planck Research School for Intelligent, Systems and the University of Stuttgart, Stuttgart, Germany · 会议/期刊：arXiv · 日期：2025-12-27 · 来源：Low-Level Learning-Based Action Modelling / Input Modelling / 2D Vision Language Action Models with Auxiliary Tasks - Visual Goal Extraction

辅助任务视觉语言动作对象中心感知机器人学习

Clutter-Resistant Vision-Language-Action Models through Object-Centric and Geometry Grounding figure — AlphaXiv 中文论文页面（可滚动查看）