HALO: Human Preference Aligned Offline Reward Learning for Robot Navigation