Generalizable Offline Multi-Objective Reinforcement Learning via Preference-Conditioned Diffuser

This repository contains the reference implementation for the paper Generalizable Offline Multi-Objective Reinforcement Learning via Preference-Conditioned Diffuser.

Install

Create the conda environment:

cd diffmorl
conda env create -f environment.yml
conda activate diffmorl_env

Install the diffuser for DiffMORL:

cd diffuser
python -m pip install -e .

Data Download or Generation

You can download the datasets from the PEDA repo by following its instructions. Due to storage limits, we are unable to release all data variants. We recommend downloading the pretrained behavioral policies from the PEDA repo and generating all data by following the examples in data_generation/collect_all.sh and data_generation/collect_custom.sh. Note that the custom variants include the incomplete dataset used in our paper, which is tagged custom-large. Other types of incomplete datasets can be collected by modifying the settings in data_generation/custom_pref.py.

Training and Evaluation

Add the repository path to your PYTHONPATH environment variable by running:

export PYTHONPATH=<path-to-diffmorl>

An example command for a single experiment:

python experiment.py --dir experiment_runs/example --env MO-HalfCheetah-v2 --seed 2 --dataset expert_custom --model_type mod --mod_type bc --num_steps_per_iter 400000 --max_iters 1 --use_p_bar True --K 8 --infer_N 7 --n_diffusion_steps 8 --returns_condition True --mixup True --mixup_num 6 --mixup_step 400000
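
To run the same configuration across several seeds, the command above can be assembled programmatically. The following is a minimal sketch and not part of this repository: the helper name build_command, the output directory names, and the seed values are hypothetical, while every flag and default value is copied from the example command above.

```python
import shlex

# Flags and values copied from the example command above.
BASE_ARGS = {
    "env": "MO-HalfCheetah-v2",
    "dataset": "expert_custom",
    "model_type": "mod",
    "mod_type": "bc",
    "num_steps_per_iter": 400000,
    "max_iters": 1,
    "use_p_bar": True,
    "K": 8,
    "infer_N": 7,
    "n_diffusion_steps": 8,
    "returns_condition": True,
    "mixup": True,
    "mixup_num": 6,
    "mixup_step": 400000,
}


def build_command(run_dir, seed, overrides=None):
    """Return the argument list for one experiment.py run (hypothetical helper)."""
    args = dict(BASE_ARGS)
    args.update(overrides or {})
    cmd = ["python", "experiment.py", "--dir", run_dir, "--seed", str(seed)]
    for name, value in args.items():
        cmd += [f"--{name}", str(value)]
    return cmd


if __name__ == "__main__":
    # Print one command per seed; the seed values are illustrative only.
    for seed in (1, 2, 3):
        run_dir = f"experiment_runs/example_seed{seed}"
        print(shlex.join(build_command(run_dir, seed)))
```

The returned list can be passed directly to subprocess.run to launch a run, or printed and pasted into a shell script alongside the commands in scripts/examples.sh.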

Other example commands are included in scripts/examples.sh. To reproduce our results, first collect all datasets to be used, then run:

sh scripts/diffmorl_main.sh

Double-check the CUDA device and data path in the shell scripts before running. After training, the models are evaluated automatically. The Pareto fronts and all metrics are written to the directory specified by --dir, and the trained models are saved to the same directory.
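
For readers who want to post-process evaluation returns themselves, the following is a minimal sketch of extracting a Pareto front from a set of multi-objective returns. It is a generic non-dominated filter, not the evaluation code from this repository: the function names dominates and pareto_front and the sample returns are hypothetical, and the sketch assumes every objective is to be maximized.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the non-dominated points, keeping their original order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Illustrative two-objective returns, not results from the paper.
    returns = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0)]
    print(pareto_front(returns))
```

In this sample, (1.5, 3.0) is dropped because (2.0, 4.0) is better on both objectives; the other three points each win on at least one objective and are kept. The filter compares every pair of points, so its cost grows quadratically with the number of points.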
