DearFishi/cleansrl


Safety-RL

Benchmarks

| Env | PPO Return | PPO Cost | PPO_Lagrangian Return | PPO_Lagrangian Cost | PPO_CRPO Return | PPO_CRPO Cost | PPO_CRPO_SR Return | PPO_CRPO_SR Cost |
|---|---|---|---|---|---|---|---|---|
| SafetyHalfCheetahVelocity-v1 | 4927.870±1265.715 | 823.125±219.088 | 2074.117±748.714 | 123.192±69.233 | 608.081±425.514 | 14.582±17.990 | 611.925±259.232 | 0.032±0.203 |
| SafetySwimmerVelocity-v1 | 43.901±18.263 | 36.776±29.344 | -4.965±5.780 | 0.321±2.787 | 15.633±3.435 | 34.992±14.742 | 17.970±6.626 | 2.469±5.474 |
| SafetyPointGoal1-v0 | 11.829±6.338 | 95.632±112.685 | -9.743±8.966 | 10.093±34.616 | 23.726±3.107 | 64.762±43.774 | 7.047±4.763 | 32.575±45.225 |

Supported algorithms

  1. PPO: Proximal Policy Optimization Algorithms
  2. PPO_Lagrangian: Benchmarking Safe Exploration in Deep Reinforcement Learning
  3. PPO_CRPO: CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
  4. PPO_CRPO_SR: Safety Representations for Safer Policy Learning
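To give a feel for how the Lagrangian variant differs from plain PPO, here is a minimal sketch of the core idea: the policy optimizes a reward advantage penalized by a cost advantage, while a Lagrange multiplier is adjusted by dual ascent on the measured constraint violation. The function names, learning rate, and cost limit below are illustrative assumptions, not this repository's actual API.

```python
def lagrangian_penalized_advantage(reward_adv, cost_adv, lam):
    """Combine reward and cost advantages into one PPO advantage.

    PPO-Lagrangian maximizes E[reward] - lam * E[cost]; dividing by
    (1 + lam) is a common normalization to keep the scale stable.
    """
    return (reward_adv - lam * cost_adv) / (1.0 + lam)


def update_multiplier(lam, ep_cost, cost_limit, lr=0.05):
    """Dual ascent on the Lagrange multiplier: increase lam when the
    measured episode cost exceeds the limit, decrease it otherwise,
    and project back onto [0, inf)."""
    return max(0.0, lam + lr * (ep_cost - cost_limit))


# Toy loop: the multiplier grows while episode costs sit above the limit.
lam = 0.0
for ep_cost in [30.0, 28.0, 26.0, 24.0]:  # hypothetical measured costs
    lam = update_multiplier(lam, ep_cost, cost_limit=25.0)
```

CRPO takes a different route: instead of a penalty weight, it alternates between a reward-improvement step and a cost-reduction step depending on whether the constraint is currently violated.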
