Figure: CogVLA: Cognition-Aligned Vision-Language-Action Models via Instruction-Driven Routing & Sparsification
AlphaXiv Chinese-language paper page (scroll to view)