RealVLG-R1: A Large-Scale Real-World Visual-Language Grounding Benchmark for Robotic Perception and Manipulation