Bridging Perception and Action: Spatially-Grounded Mid-Level Representations for Robot Generalization
AlphaXiv overview (Chinese)