Learning to Learn Faster from Human Feedback with Language Model Predictive Control
AlphaXiv overview (in Chinese)