Multi-CALF: Critic as a Lyapunov Function

This repository implements Multi-CALF, a method for combining two policies into a new, more effective one.

Demonstration of a Multi-CALF agent combining PPO and TD3 policies. Both policies were tuned to near-optimal performance on the Hopper-v4 environment with a 10-dimensional observation space, where the Hopper is rewarded for forward movement. The combined MCALF agent outperforms each of the individual near-optimal policies, as shown in the animation above.

Overview

Multi-CALF is a technique that leverages the critic function as a Lyapunov function to guarantee stability in reinforcement learning agents. The approach facilitates safe exploration and robust policy development by using multiple critics to validate actions before execution.
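
To make the idea of validating actions with a critic more concrete, here is a minimal sketch of a critic-gated switch between two policies. It is not the repository's implementation: the class name `CriticGatedSelector`, the roles of the two policies, and the exact acceptance rule (the critic value must improve by at least `change_rate` over the last accepted value, with a decaying probability of skipping the check) are all assumptions made for illustration.

```python
import random
from typing import Any, Callable, Optional


class CriticGatedSelector:
    """Hypothetical critic-gated switch between two policies (illustrative only)."""

    def __init__(
        self,
        base_policy: Callable[[Any], Any],
        alt_policy: Callable[[Any], Any],
        critic: Callable[[Any], float],
        change_rate: float = 0.01,
        relax_prob: float = 1.0,
        relax_factor: float = 0.999,
        rng: Optional[random.Random] = None,
    ) -> None:
        self.base_policy = base_policy
        self.alt_policy = alt_policy
        self.critic = critic
        self.change_rate = change_rate
        self.relax_prob = relax_prob
        self.relax_factor = relax_factor
        self.rng = rng or random.Random(0)
        # Last critic value that passed the check (None until the first step).
        self.best_value: Optional[float] = None

    def act(self, state: Any) -> Any:
        value = self.critic(state)
        # Assumed Lyapunov-like condition: the critic value must improve by
        # at least change_rate relative to the last accepted value.
        certified = (
            self.best_value is None
            or value - self.best_value >= self.change_rate
        )
        if certified:
            self.best_value = value
        # Assumed relaxation: occasionally skip the check, less often over time.
        relaxed = self.rng.random() < self.relax_prob
        self.relax_prob *= self.relax_factor
        if certified or relaxed:
            return self.alt_policy(state)
        # Otherwise fall back to the base policy.
        return self.base_policy(state)
```

In this sketch the alternative policy acts whenever the critic certifies progress, and the base policy serves as the fallback; the actual assignment of roles in Multi-CALF may differ.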

Key features:

Safety-aware training with Lyapunov stability guarantees
Environment-aware action selection
Support for visual observations in complex control tasks
Compatible with standard RL algorithms (PPO, TD3)
Performance tracking via MLflow

Repository Structure

multi-calf/
├── run/                      # Experiment scripts
│   ├── train_ppo.py          # Training script for PPO
│   ├── train_td3.py          # Training script for TD3
│   ├── eval.py               # Standard evaluation script
│   ├── eval_mcalf.py         # Multi-CALF evaluation script
│   ├── artifacts/            # Saved model artifacts
│   └── mlruns/               # MLflow experiment tracking data
├── src
├── pyproject.toml            # Dependencies and configuration
└── uv.lock                   # Lock file for uv package manager

Installation

This project uses uv, a fast Python package and project manager, for dependency management.

Prerequisites

For rendering functionality, install the following dependencies:

sudo apt-get install -y libosmesa6-dev libgl1-mesa-dev libglfw3

Setting up with UV

Install UV:

# Linux/macOS
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

Clone the repository:

git clone https://github.com/yourusername/multi-calf.git
cd multi-calf

Create and activate the environment:

# Create a virtual environment with specific Python version
uv venv --python=3.12

# Activate the environment
# On Linux/macOS:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate

# Install dependencies from lock file
uv sync

Running Experiments with UV

All experiments can be run using UV's run command to ensure proper environment resolution.

Training

To train a PPO agent:

uv run run/train_ppo.py

To train a TD3 agent:

uv run run/train_td3.py

Command-line parameters provide extensive customization:

# View all available options
uv run run/train_ppo.py --help

Multi-CALF Evaluation

To evaluate using the Multi-CALF approach (requires two trained models):

uv run run/eval_mcalf.py \
  --env-id Pendulum-v1 \
  --base-checkpoint-path run/artifacts/ppo_checkpoints/ppo_checkpoint_500000_steps.zip \
  --alt-checkpoint-path run/artifacts/ppo_checkpoints/ppo_checkpoint_250000_steps.zip \
  --mcalf.relaxprob-init 1.0 \
  --mcalf.relaxprob-factor 0.999 \
  --mcalf.calf-change-rate 0.01
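
To get a feel for the relaxation settings in the command above, the sketch below computes how a relaxation probability would evolve if `--mcalf.relaxprob-factor` is applied multiplicatively once per step, starting from `--mcalf.relaxprob-init`. That per-step multiplicative decay is an assumption suggested by the flag names, not something this README confirms, and `relax_prob_schedule` is a hypothetical helper rather than part of the repository.

```python
def relax_prob_schedule(init: float, factor: float, steps: int) -> list[float]:
    """Return the relaxation probability at each step under an assumed geometric decay."""
    probs = []
    p = init
    for _ in range(steps):
        probs.append(p)
        p *= factor  # assumed: one multiplicative update per environment step
    return probs


# Values taken from the example command: init 1.0, factor 0.999.
schedule = relax_prob_schedule(init=1.0, factor=0.999, steps=1001)
print(f"step 0: {schedule[0]:.3f}, step 1000: {schedule[1000]:.3f}")
```

Under this assumption the probability falls to roughly 0.37 after 1000 steps, so a factor closer to 1 keeps the relaxed behaviour around for longer.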

Experiment Tracking

View experiment results using MLflow:

cd run
mlflow ui --port=5000

Then navigate to http://localhost:5000 in your browser.

Troubleshooting UV

If you encounter issues with UV:

Cache problems: Clear the UV cache
```
uv cache clean
```
Package resolution issues: Update UV and retry
```
uv self update
uv sync
```
Environment conflicts: Create a fresh environment
```
rm -rf .venv
uv venv --python=3.12
uv sync
```

License

MIT License

Citation

If you use this code in your research, please cite:

[Citation information for the paper]
