Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone