Figure: InternVLA-M1/ST4VLA: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
AlphaXiv Chinese-language paper page (scroll to view)