Continuous Vision-Language-Action Co-Learning with Semantic-Physical Alignment for Behavioral Cloning