Figure: AlphaXiv Chinese-language overview of "FAST: Efficient Action Tokenization for Vision-Language-Action Models" (scrollable in the original page).