EndoVLA: Dual-Phase Vision-Language-Action for Precise Autonomous Tracking in Endoscopy