Imagine, Verify, Execute: Memory-guided Agentic Exploration with Vision-Language Models