ATA: Bridging Implicit Reasoning with Attention-Guided and Action-Guided Inference for Vision-Language Action Models