Move to Understand a 3D Scene: Bridging Visual Grounding and Exploration for Efficient and Versatile Embodied Navigation