MM-ACT: Learn from Multimodal Parallel Generation to Act