Figure: mimic-video: Video-Action Models for Generalizable Robot Control Beyond VLAs
AlphaXiv Chinese-language paper page (scroll to view)