BagelVLA: Enhancing Long-Horizon Manipulation via Interleaved Vision-Language-Action Generation