lionelblonde/ngt-pytorch

Official PyTorch implementation of the Noise-Guided Transport (NGT) algorithm,
along with baselines.
NGT is introduced in the paper
Noise-Guided Transport: Scalable Imitation Learning from Random Priors.
The implementation leverages PyTorch's CUDA Graphs.
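
For context, CUDA Graphs let PyTorch capture a fixed sequence of GPU operations
once and replay it later with a single launch, cutting per-step kernel-launch
overhead. Below is a minimal sketch of the standard capture/replay pattern
(illustrative only; the model and shapes are placeholders, not this
repository's actual training step):

import torch
import torch.nn as nn

# Placeholder model and input; a real training step would be captured instead.
model = nn.Linear(64, 64).cuda()
static_input = torch.randn(8, 64, device="cuda")

# PyTorch requires a few warm-up iterations on a side stream before capture.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    for _ in range(3):
        model(static_input)
torch.cuda.current_stream().wait_stream(s)

# Capture one forward pass into a graph.
graph = torch.cuda.CUDAGraph()
with torch.cuda.graph(graph):
    static_output = model(static_input)

# Replay: refill the captured input tensor in place, then relaunch the whole
# captured sequence with a single call; static_output is updated in place.
static_input.copy_(torch.randn(8, 64, device="cuda"))
graph.replay()
print(static_output.sum().item())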

Prerequisites

  • GPU: NVIDIA GPU with CUDA support
  • CUDA Version: CUDA 12.5 or higher (for GPU support)
  • NVIDIA Drivers: Compatible with your CUDA version (e.g. 555)
  • (Optional) For Docker:
    • Docker: version 27.3.1
    • NVIDIA Container Toolkit for GPU support
  • Expert Demonstrations: Download from this link

Installation

The project is set up with Poetry,
with the option to containerize with Docker
instead of a local Poetry install.

Clone the Repository

git clone https://github.com/lionelblonde/ngt-pytorch.git
cd ngt-pytorch

Set up wandb

Set your wandb API key as the environment variable WANDB_API_KEY
in the .env file (the command below creates the file if it does not already exist):

echo "WANDB_API_KEY=your_wandb_api_key" >> .env

Install Poetry

Install Poetry directly with the official installer:

curl -sSL https://install.python-poetry.org | python3 -

Also install Poethepoet.
Poethepoet enables the creation of task macros,
defined in pyproject.toml,
that support both the pure-Poetry and the Docker workflows.
The following command installs it globally, i.e. makes it available across all projects:

poetry self add poethepoet

Also install poetry-dotenv-plugin.
This plugin loads .env files seamlessly:
there is no need to export variables or to specify an env-file.
Just put every environment variable you need in the .env file
at the root of the project.
In particular, having your wandb API key in that file as described above
is enough for the login to succeed,
whether you choose to use Docker or not.
The following command installs it globally, i.e. makes it available across all projects:

poetry self add poetry-dotenv-plugin

Make sure that the poetry and poe commands are on your PATH.
Run the following directly, or add it to your .bashrc/.zshrc:

export PATH="$HOME/.local/bin:$PATH"

Double-check that they are available with:

poetry --version
poe --version

Install project dependencies

Option A: Poetry

poetry install

If that command hangs, run the verbose version: poetry install -vvv.
If it hangs at "Using keyring backend 'SecretService Keyring'",
you can configure Poetry to bypass the keyring for credential management:

export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring

Option B: Docker

(Optional) Build and push the Docker image to Docker Hub:

poe dbp

N.B.: this step is optional because the image is already on Docker Hub.

Usage

You can either train or evaluate.
What follows describes how to run either task,
with either pure Poetry or Docker.

N.B. 1: The -O flag (capital "o") sets the constant __debug__ to False,
making Python run in optimized mode.
In this mode, Python skips assert statements, so use it with caution when prototyping.
It also turns off the beartype decorator, removing its small runtime overhead.
Conversely, running the command without -O keeps beartype's runtime type-checking active.
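
A common way to get this behavior (a sketch of the pattern, not necessarily
this codebase's exact wiring) is to apply the decorator conditionally on
__debug__:

from beartype import beartype

def typechecked(func):
    # Under python -O, __debug__ is False, so the decorator becomes a no-op
    # and beartype adds no runtime overhead.
    return beartype(func) if __debug__ else func

@typechecked
def scale(x: float, factor: float) -> float:
    return x * factor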

N.B. 2: The evaluate task evaluates a model
by retrieving its best-performing checkpoint stored on the Weights & Biases servers.
The value to pass to the --load_ckpt argument
is the "Run Path" shown on the Weights & Biases overview page of the run to evaluate.
The parameters of the best model are downloaded to a temporary directory.
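
For reference, retrieving files by run path is possible through the wandb
public API; here is a minimal sketch (the run path and the .pt suffix are
placeholders, not this repository's exact retrieval logic):

import tempfile

import wandb

api = wandb.Api()
# "entity/project/run_id" is the Run Path copied from the W&B overview page.
run = api.run("entity/project/run_id")
with tempfile.TemporaryDirectory() as tmp:
    for file in run.files():
        if file.name.endswith(".pt"):  # hypothetical checkpoint suffix
            file.download(root=tmp)
            print(f"downloaded {file.name} to {tmp}")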

Option A: Poetry

Train

poetry run python -O main.py train \
    --base_cfg="tasks/defaults/base.yml" \
    --override_cfg="tasks/defaults/overrides/your_override.yml" \
    --env_id="Hopper-v4" \
    --seed=0 \
    --num_demos=4 \
    --subsampling_rate=20 \
    --expert_path="./experts" \
    --description="teeny description"

Evaluate

poetry run python -O main.py evaluate \
    --base_cfg="tasks/defaults/base.yml" \
    --override_cfg="tasks/defaults/overrides/your_override.yml" \
    --env_id="Hopper-v4" \
    --seed=0 \
    --load_ckpt="wandb_run_path" \
    --description="teeny description"

Option B: Docker

Run the macro:

poe docker

You are now inside the container.

Train

python -O main.py train \
    --base_cfg="tasks/defaults/base.yml" \
    --override_cfg="tasks/defaults/overrides/your_override.yml" \
    --env_id="Hopper-v4" \
    --seed=0 \
    --num_demos=4 \
    --subsampling_rate=20 \
    --expert_path="./experts" \
    --description="teeny description"

Evaluate

python -O main.py evaluate \
    --base_cfg="tasks/defaults/base.yml" \
    --override_cfg="tasks/defaults/overrides/your_override.yml" \
    --env_id="Hopper-v4" \
    --seed=0 \
    --load_ckpt="wandb_run_path" \
    --description="teeny description"

Extras

Expert Performance

To print the performance (mean and standard deviation)
of the experts across all environments and demonstrations stored in .h5 files, use:

poetry run python -O inspect_experts.py --expert_path="./experts"
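
For reference, computing such statistics could look like the sketch below;
the "ep_rets" dataset key is hypothetical, and the actual layout of the .h5
files may differ:

from pathlib import Path

import h5py
import numpy as np

# Walk every .h5 demonstration file and report return statistics.
for path in sorted(Path("./experts").rglob("*.h5")):
    with h5py.File(path, "r") as f:
        returns = np.asarray(f["ep_rets"])  # hypothetical dataset key
    print(f"{path.name}: mean={returns.mean():.1f}, std={returns.std():.1f}")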

Download progress files from wandb

To download the progress.json and progress.csv files that
tracked agent performance during training, use the following command,
which downloads all such files for a given wandb group:

poetry run python -O retrieve_wandb.py \
    --wandb_id="wandb_id" \
    --wandb_project="wandb_project" \
    --group_name="group_name" \
    --download_dir="download_dir"

Advanced: Spawn Arrays of Jobs

The spawner.py script enables the creation (and launch) of an array of experiments,
either on a Slurm-administered cluster,
or locally in a new tmux session, with one experiment running per window in the session.

Here is how it can be used:

poetry run python spawner.py \
    --base_cfg="tasks/defaults/base.yml" \
    --override_cfg="tasks/defaults/overrides/your_override.yml" \
    --env_bundle="huma" \
    --deployment="slurm" \
    --num_seeds=3 \
    --num_demos="[1,4]" \
    --subsampling_rate="[1,20]" \
    --expert_path="./experts" \
    --deploy_now \
    --description="teeny description"

To run the same command on a list of bundles, use this:

echo "walker cheetah ant huma" | tr ' ' '\n' | parallel 'poetry run python spawner.py --base_cfg="tasks/defaults/base.yml" --override_cfg="tasks/defaults/overrides/your_override.yml" --env_bundle={} --deployment="slurm" --num_seeds=4 --num_demos="[1, 4]" --subsampling_rate="[1, 20]" --expert_path="./experts" --deploy_now --debug --description="teeny description"'

This command uses GNU parallel, which by default
runs as many jobs concurrently as there are CPU cores (capped by the length of the input list).
To force parallel to use a certain number of cores, use the -j option, e.g. parallel -j 4 for 4 cores.
The software is usually available by default on HPC clusters.

To create the scripts without deploying them immediately, use --nodeploy_now instead of --deploy_now.
This logic applies to all the boolean options since we are using google/python-fire.
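
To illustrate the convention (a self-contained toy, not the actual spawner code):

import fire

def spawn(deploy_now: bool = False):
    # python-fire maps --deploy_now to True and --nodeploy_now to False.
    print("deploying now" if deploy_now else "writing scripts only")

if __name__ == "__main__":
    fire.Fire(spawn)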

License

This project is licensed under the MIT License. See the LICENSE file for details.