Titan AI

safe-rlhf

1,529
123
Python

Project Description

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Project Information

Created on 5/15/2023
Updated on 9/23/2025