Awesome Robotics Manipulation · full_paper

PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance

Authors: Yupeng Zheng, Xiang Li, Songen Gu, Yuhang Zheng, Shuai Tian, Weize Li, Linbo Wang, Senyu Fei, Pengfei Li, Yinfeng Gao, Zebin Xing, Yilun Chen, Qichao Zhang, Haoran Li, Wenchao Ding · Affiliations: Fudan University, National University of Singapore, Tongji University · Venue: arXiv · Date: 2026-04-22 · Source: Low-Level Learning-Based Action Modelling / Input Modelling / 2D Vision Language Action Models with Efficiency / Small Model

Vision-language-action robot learning

Figure: PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance (AlphaXiv Chinese-language paper page)