Figure: Universal Visuo-Tactile Video Understanding for Embodied Interaction
AlphaXiv Chinese-language paper page (scroll to view)