Abstract
This paper studies how to adapt a predesigned policy from its original objective to a trade-off objective while explicitly constraining how far its performance on the original objective may deviate. The method formulates closed-loop dynamics for the policy parameters and uses control barrier functions to keep the constraint satisfied throughout adaptation. Experiments on Cartpole, Lunar Lander, and a quadruped robot demonstrate that the resulting policy adaptation is both practical and safe.
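To make the mechanism in the abstract concrete, here is a minimal toy sketch of a control-barrier-function filter applied to parameter dynamics. It is not the paper's algorithm. Everything in it is an assumption made for illustration: the two-dimensional parameter vector, the quadratic losses, the constants `A`, `B`, `EPS`, and `ALPHA`, the barrier `h(theta) = EPS - (J0(theta) - J0(theta0))`, and the forward-Euler integration. The parameters follow `theta_dot = u`, where `u` is the descent direction of the trade-off loss, minimally corrected so that `grad_h . u + ALPHA * h >= 0`.

```python
import numpy as np

# All constants below are hypothetical and chosen only for illustration.
A = np.array([0.0, 0.0])   # minimizer of the original loss J0
B = np.array([2.0, 0.0])   # minimizer of the secondary loss
EPS = 0.25                 # allowed degradation of the original loss
ALPHA = 5.0                # gain of the linear class-K term in the CBF condition


def original_loss(theta):
    """Original objective J0(theta) = ||theta - A||^2."""
    return float(np.sum((theta - A) ** 2))


def tradeoff_grad(theta):
    """Gradient of the trade-off loss 0.5*||theta - A||^2 + 0.5*||theta - B||^2."""
    return (theta - A) + (theta - B)


def barrier(theta, theta0):
    """h >= 0 means the original loss has degraded by at most EPS."""
    return EPS - (original_loss(theta) - original_loss(theta0))


def barrier_grad(theta):
    """Gradient of the barrier with respect to theta."""
    return -2.0 * (theta - A)


def cbf_filter(u_nom, h, grad_h):
    """Closed-form solution of min ||u - u_nom||^2 s.t. grad_h.u + ALPHA*h >= 0."""
    slack = grad_h @ u_nom + ALPHA * h
    if slack >= 0.0:
        return u_nom  # nominal direction already satisfies the CBF condition
    # Project onto the constraint boundary along grad_h.
    return u_nom - slack * grad_h / max(grad_h @ grad_h, 1e-12)


def adapt(theta0, dt=0.01, steps=2000):
    """Forward-Euler integration of theta_dot = cbf_filter(-grad J_tradeoff)."""
    theta = theta0.copy()
    h_min = barrier(theta, theta0)
    for _ in range(steps):
        u = cbf_filter(-tradeoff_grad(theta), barrier(theta, theta0), barrier_grad(theta))
        theta = theta + dt * u
        h_min = min(h_min, barrier(theta, theta0))
    return theta, h_min


theta_final, h_min = adapt(A.copy())
print("final parameters:", theta_final)
print("smallest barrier value seen:", h_min)
```

In this toy setting the unconstrained trade-off minimizer is (1, 0), where the original loss would rise to 1. The filtered flow instead settles on the constraint boundary near (0.5, 0), where the original loss has degraded by `EPS`. A discrete-time update only approximates the continuous-time guarantee, so the step size `dt` has to be small relative to `1 / ALPHA`.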
Citation
Hao, Wenjian, Zehui Lu, Nicolas Miguel, and Shaoshuai Mou. 2025. "A Control-Barrier-Function-Based Algorithm for Policy Adaptation in Reinforcement Learning." arXiv preprint arXiv:2510.02720.
@techreport{WHao2025cbfpolic,
author = {Hao, Wenjian and Lu, Zehui and Miguel, Nicolas and Mou, Shaoshuai},
year = {2025},
title = {A Control-Barrier-Function-Based Algorithm for Policy Adaptation in Reinforcement Learning},
number = {arXiv:2510.02720},
url = {https://arxiv.org/pdf/2510.02720}
}