Awesome Robotics Manipulation · full_paper

Green-VLA: Staged Vision-Language-Action Model for Generalist Robots

Authors: Manipulation Team · Affiliation: Sber Robotics Center · Venue: arXiv · Date: 2026-01-31 · Source: Low-Level Learning-Based Action Modelling / Input Modelling / 2D LLM-based Vision Language Action Models

Vision-Language-Action · Foundation Models · Language-Conditioned · Robot Learning
