LQNew/ValueEstimationRL

Compute Q-value estimates for RL agents in MuJoCo environments.

Example code for computing value estimation bias in MuJoCo.

How to compute value estimation?
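This README does not spell out the computation. A common definition in the actor-critic literature is the difference between the critic's Q-value estimate and the discounted Monte Carlo return actually obtained from the same state-action pair. The sketch below assumes that definition; the function names, the `gamma` default, and the per-episode averaging are illustrative choices and are not taken from this repository.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Monte Carlo return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def estimation_bias(
    q_estimates: Sequence[float],
    rewards: Sequence[float],
    gamma: float = 0.99,
) -> float:
    """Mean of (Q estimate - Monte Carlo return) over one episode.

    A positive value means the critic overestimates, a negative value
    means it underestimates (assumed definition, see the text above).
    """
    returns = discounted_returns(rewards, gamma)
    if len(q_estimates) != len(returns):
        raise ValueError("q_estimates and rewards must have the same length")
    if not returns:
        raise ValueError("episode must contain at least one step")
    return sum(q - g for q, g in zip(q_estimates, returns)) / len(returns)


if __name__ == "__main__":
    # Toy episode: three steps with reward 1 and a critic that is 1.0 too high.
    rewards = [1.0, 1.0, 1.0]
    q_values = [2.75, 2.5, 2.0]
    print(estimation_bias(q_values, rewards, gamma=0.5))
```

In practice the Q estimates would come from the trained critic evaluated on the visited state-action pairs, and the result would typically be averaged over many evaluation episodes. Episodes cut short by a time limit give truncated returns, so that case needs separate handling.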

Instructions

Recommended: run with Docker

# python        3.6    (apt)
# pytorch       1.4.0  (pip)
# tensorflow    1.14.0 (pip)
# Atari, DMC Control Suite, and MuJoCo
# Attention: the MuJoCo license key `mjkey.txt` is required!
cd dockerfiles
# Docker image names must be lowercase, so the tag is `estimationrl`
docker build . -t estimationrl

For other Dockerfiles, see RL Dockerfiles.

Launch experiments

Run the script batch_run_value_estimation_4seed_cuda.sh:

# e.g.
bash batch_run_value_estimation_4seed_cuda.sh Humanoid-v2 DDPG_value_estimation 0  # env_name: Humanoid-v2, algorithm: DDPG, CUDA_Num: 0
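To queue several environment and algorithm combinations, the command above can be generated programmatically. The following is a minimal sketch: only the script name and the argument order (environment, algorithm, CUDA device) come from the example above. The helper `build_launch_commands` is hypothetical, and the algorithm name `TD3_value_estimation` is inferred from the data directory names used in the plotting examples.

```python
import shlex
from itertools import product
from typing import List, Sequence

# Script name taken from the README; run from the repository root.
SCRIPT = "batch_run_value_estimation_4seed_cuda.sh"


def build_launch_commands(
    envs: Sequence[str],
    algorithms: Sequence[str],
    cuda_id: int = 0,
) -> List[List[str]]:
    """Return one `bash <script> <env> <algorithm> <cuda_id>` command per pair."""
    return [
        ["bash", SCRIPT, env, algorithm, str(cuda_id)]
        for env, algorithm in product(envs, algorithms)
    ]


if __name__ == "__main__":
    commands = build_launch_commands(
        envs=["HalfCheetah-v2", "Humanoid-v2"],
        # TD3_value_estimation is an inferred name, not confirmed by the README.
        algorithms=["DDPG_value_estimation", "TD3_value_estimation"],
        cuda_id=0,
    )
    for cmd in commands:
        # Print instead of executing so the list can be reviewed first.
        print(" ".join(shlex.quote(part) for part in cmd))
```

Each printed line can be pasted into a shell, or passed to `subprocess.run(cmd, check=True)` once the list looks right.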

Plot results

Recommended: install seaborn==0.8.1 for plotting value estimation bias

  • pip install seaborn==0.8.1
  • Example 1: plot the value estimation bias of DDPG and TD3:
    python spinupUtils/plot_bias.py \
        data/DDPG_value_estimation-HalfCheetah-v2-estimation/ \
        data/TD3_value_estimation-HalfCheetah-v2-estimation \
        --env HalfCheetah-v2 \
        -l  DDPG TD3 -s 0

  • Example 2: plot the reward of DDPG and TD3:
    python spinupUtils/plot_reward.py \
        data/DDPG_value_estimation-HalfCheetah-v2-reward/ \
        data/TD3_value_estimation-HalfCheetah-v2-reward \
        --env HalfCheetah-v2 \
        -l  DDPG TD3 -s 0
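Both examples read result directories named `data/<algorithm>-<env>-estimation` or `data/<algorithm>-<env>-reward`. That pattern is inferred from the two examples above and is not documented elsewhere in this README. Assuming it holds, a small pre-flight check such as the sketch below can report which directories are still missing before a plot script is invoked. The helper names are hypothetical.

```python
from pathlib import Path
from typing import List, Sequence


def result_dir(algorithm: str, env: str, kind: str = "estimation", root: str = "data") -> Path:
    """Build the assumed result path: <root>/<algorithm>-<env>-<kind>."""
    return Path(root) / f"{algorithm}-{env}-{kind}"


def missing_result_dirs(
    algorithms: Sequence[str],
    env: str,
    kind: str = "estimation",
    root: str = "data",
) -> List[Path]:
    """Return the expected result directories that do not exist yet."""
    expected = [result_dir(algorithm, env, kind, root) for algorithm in algorithms]
    return [path for path in expected if not path.is_dir()]


if __name__ == "__main__":
    missing = missing_result_dirs(
        ["DDPG_value_estimation", "TD3_value_estimation"],
        env="HalfCheetah-v2",
        kind="estimation",
    )
    if missing:
        print("Missing result directories:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All result directories found; ready to plot.")
```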

Citation

@misc{QingLi2021ValueEstimationRL,
  author = {Qing Li},
  title = {ValueEstimationRL},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LQNew/ValueEstimationRL}}
}

Languages

Python 87.3%, Dockerfile 7.1%, Shell 5.6%

MIT License
Created June 3, 2021
Updated October 30, 2022