MiVLA: Towards Generalizable Vision-Language-Action Model with Human-Robot Mutual Imitation Pre-training