Figure: POLICEd RL: Learning Closed-Loop Robot Control Policies with Provable Satisfaction of Hard Constraints
AlphaXiv Chinese-language overview (scroll to view)