LangForce: Bayesian Decomposition of Vision Language Action Models via Latent Action Queries