MolecularAI/REINVENT4

AI molecular design tool for de novo design, scaffold hopping, R-group replacement, linker design and molecule optimization.

REINVENT 4

Description

REINVENT is a molecular design tool for de novo design, scaffold hopping,
R-group replacement, linker design, molecule optimization, and other small
molecule design tasks. REINVENT uses a Reinforcement Learning
(RL) algorithm to generate optimized molecules that comply with a user-defined
property profile expressed as a multi-component score. Transfer Learning (TL)
can be used to create or pre-train a model that generates molecules closer
to a set of input molecules.

A paper describing the software has been published as Open Access in the
Journal of Cheminformatics:
Reinvent 4: Modern AI–driven generative molecule design.
See AUTHORS.md for references to previous papers.

Requirements

REINVENT is being developed on Linux and supports both GPU and CPU. The Linux
version is fully validated. REINVENT on Windows supports GPU and CPU while
MacOSX supports CPU only, but both platforms are only partially tested and
therefore support is limited.

The code is written in Python 3 (>= 3.10). The list of
dependencies can be found in the repository (see also Installation below).

A GPU is not strictly necessary but is strongly recommended for performance
reasons, especially for transfer learning and model training. For reinforcement
learning (RL) a GPU is less important because most scoring components run on
the CPU.

Note that if no GPU is installed in your computer the code will run on the
CPU automatically. As of this writing, REINVENT supports NVIDIA GPUs, some AMD
GPUs and Intel ARC. For most design tasks about 8 GiB each of CPU main memory
and GPU memory is sufficient.
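
If you want to know in advance which device your own scripts would see, the following minimal sketch may help. It assumes PyTorch may or may not be importable; `pick_device` is a hypothetical helper for illustration and is not REINVENT's own device selection logic, which may differ.

```python
def pick_device() -> str:
    """Return "cuda", "xpu" or "cpu" depending on what PyTorch reports.

    Hypothetical helper for illustration; not part of REINVENT.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to query, so report CPU.
        return "cpu"

    if torch.cuda.is_available():
        # Covers NVIDIA CUDA builds and AMD ROCm builds of PyTorch.
        return "cuda"

    # torch.xpu only exists in newer PyTorch builds, hence the getattr guard.
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"

    return "cpu"


print(pick_device())
```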

Installation

  1. Clone this Git repository. Add --depth 1 for only the newest version as the repository has grown quite large over time.
    git clone git@github.com:MolecularAI/REINVENT4.git  # --depth 1
  2. Create a Python environment and install a compatible version of Python, for example with Conda or other virtual environments.
    conda create --name reinvent4 python=3.10
    conda activate reinvent4
  3. Change directory to the repository to install all dependencies. You will need to set the right processor type, see PyTorch versions. Linux supports CUDA (e.g. "cu126"), AMD ROCm (e.g. "rocm6.4"), Intel XPU ("xpu") and CPU. Windows supports CUDA, XPU and CPU. MacOSX only supports CPU (use "mac" as processor type!). Optionally, you can select dependencies "openeye" (for ROCS; you need to obtain your own license), "isim" for similarity tracking in TensorBoard or "none" to skip all. The default is installation of "all" dependencies. See the help text from the install script for details.
    python install.py --help
    python install.py cu126  # or rocm6.4, xpu, cpu, mac, etc.
  4. Test the tool. The installer has added a script reinvent to your PATH.
    reinvent --help
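
A quick sanity check after installation can also be done from Python. This is a minimal sketch using only the standard library; the helper names are hypothetical, and it only checks the two facts stated above: Python >= 3.10 and a reinvent script on the PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated under Requirements


def python_is_supported(version=sys.version_info) -> bool:
    """True if the (major, minor) version is at least 3.10."""
    return tuple(version[:2]) >= MIN_PYTHON


def reinvent_on_path() -> bool:
    """True if an executable named 'reinvent' can be found on the PATH."""
    return shutil.which("reinvent") is not None


print("Python >= 3.10:", python_is_supported())
print("reinvent script found:", reinvent_on_path())
```

Run this inside the activated environment, otherwise the PATH lookup refers to a different installation.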

Prior models

All public prior models can be found on Zenodo.

Basic Usage

REINVENT is a command line tool and is typically run as follows:

reinvent -l sampling.log sampling.toml

This writes logging information to the file sampling.log. If you wish to write
this to the screen, leave out the -l sampling.log part. sampling.toml is the
configuration file. The main format is TOML as it tends to be more user friendly. JSON and YAML are supported too.
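
When REINVENT is driven from a script, the same command line can be assembled programmatically. The sketch below uses only the -l option and the positional configuration file shown above; `build_command` and `run_reinvent` are hypothetical helper names.

```python
import subprocess
from typing import List, Optional


def build_command(config: str, log_file: Optional[str] = None) -> List[str]:
    """Assemble the reinvent command line; without log_file, logging goes to the screen."""
    cmd = ["reinvent"]
    if log_file is not None:
        cmd += ["-l", log_file]
    cmd.append(config)
    return cmd


def run_reinvent(config: str, log_file: Optional[str] = None) -> int:
    """Run reinvent and return its exit code (requires reinvent on the PATH)."""
    return subprocess.run(build_command(config, log_file), check=False).returncode


print(build_command("sampling.toml", "sampling.log"))
# prints ['reinvent', '-l', 'sampling.log', 'sampling.toml']
```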

Sample TOML configuration files for all run modes are located in configs/ in
the repository. File paths in these files need to be adjusted to your local
installation. You will need to choose a model and the appropriate run mode
depending on the research problem you are trying to address. There is
additional documentation in configs/ in several *.md files with
instructions on how to configure the TOML file. Internal priors can be
referenced with a dot notation (see reinvent/prior_registry.py).
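
Adjusting file paths can also be scripted. The following is a minimal sketch that loads a TOML document, overrides path entries and prints the result as JSON. The keys in `EXAMPLE_TOML` (`run_type`, `parameters`, `model_file`, `output_file`) are illustrative assumptions and have not been checked against the REINVENT schema; the files in configs/ are the authoritative reference. It also assumes that the JSON form mirrors the TOML structure.

```python
import json

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for Python 3.10

# Illustrative only: key names are assumptions, see configs/ for real examples.
EXAMPLE_TOML = """
run_type = "sampling"

[parameters]
model_file = "/path/to/model.prior"
output_file = "sampling.csv"
"""


def with_local_paths(toml_text: str, overrides: dict) -> dict:
    """Parse TOML and replace entries in the [parameters] table."""
    config = tomllib.loads(toml_text)
    # In this sketch overrides only apply to the [parameters] table.
    config.setdefault("parameters", {}).update(overrides)
    return config


config = with_local_paths(EXAMPLE_TOML, {"model_file": "/home/me/priors/my.prior"})
print(json.dumps(config, indent=2))
```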

Tutorials / Jupyter notebooks

Basic instructions can be found in the comments in the config examples in
configs/.

Notebooks are provided in the notebooks/ directory and contributed notebooks
and tutorials in contrib/. Please note that we provide the notebooks in
jupytext "light script" format. To work with the light scripts you will need
to install jupytext. A few other packages will come in handy too.

pip install jupytext mols2grid seaborn

The Python files in notebooks/ can then be converted to a notebook, for example:

jupytext -o Reinvent_demo.ipynb Reinvent_demo.py
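
To convert every light script in a directory, the same command can be generated per file. This is a minimal sketch; `jupytext_commands` is a hypothetical helper that only builds the command lines and does not run them.

```python
from pathlib import Path
from typing import List


def jupytext_commands(notebook_dir: str) -> List[List[str]]:
    """Build one 'jupytext -o NAME.ipynb NAME.py' command per script in notebook_dir."""
    commands = []
    for script in sorted(Path(notebook_dir).glob("*.py")):
        target = script.with_suffix(".ipynb")
        commands.append(["jupytext", "-o", str(target), str(script)])
    return commands


for cmd in jupytext_commands("notebooks"):
    print(" ".join(cmd))
```

Each command can be passed to `subprocess.run(cmd, check=True)` once jupytext is installed.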

Scoring Plugins

The scoring subsystem uses a simple plugin mechanism (Python
native namespace packages). If you
wish to write your own plugin, follow the instructions below.
There is no need to touch any of the REINVENT code. The public
repository contains a contrib directory with some useful examples.

  1. Create /top/dir/somewhere/reinvent_plugins/components where /top/dir/somewhere is a convenient location for you.
  2. Do not place a __init__.py in either reinvent_plugins or components as this would break the mechanism. It is fine to create normal packages within components as long as you import those correctly.
  3. Place a file whose name starts with comp_ into reinvent_plugins/components or one of its subdirectories. Files with other names will be ignored, i.e. not imported. The directory is searched recursively, so structure your code as needed, but directory/package names must be unique.
  4. Tag the scoring component class(es) in that file with the @add_tag decorator. More than one component class can be added to the same comp_ file. See existing code.
  5. Tag at most one dataclass for parameters in the same file, see existing code. This is optional.
  6. Set or add /top/dir/somewhere to the PYTHONPATH environment variable or use any other mechanism to extend sys.path.
  7. The scoring component should now automatically be picked up by REINVENT.
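
The directory layout from steps 1 to 3 and the path extension from step 6 can be sketched as follows. This is an illustration using only the standard library: `create_plugin_skeleton` is a hypothetical helper, a temporary directory stands in for /top/dir/somewhere, and the generated file is an empty placeholder rather than a working scoring component (see the existing code for the @add_tag usage).

```python
import importlib
import sys
import tempfile
from pathlib import Path


def create_plugin_skeleton(top_dir: Path, module_name: str = "comp_myscorer") -> Path:
    """Create reinvent_plugins/components/<module_name>.py below top_dir."""
    if not module_name.startswith("comp_"):
        raise ValueError("plugin file names must start with 'comp_'")

    components = top_dir / "reinvent_plugins" / "components"
    components.mkdir(parents=True, exist_ok=True)

    # Deliberately no __init__.py in reinvent_plugins/ or components/:
    # both must remain namespace packages.
    plugin_file = components / f"{module_name}.py"
    plugin_file.write_text("# placeholder: the tagged component class goes here\n")
    return plugin_file


top = Path(tempfile.mkdtemp())  # stands in for /top/dir/somewhere
plugin = create_plugin_skeleton(top)

sys.path.append(str(top))  # same effect as adding the directory to PYTHONPATH
importlib.invalidate_caches()
module = importlib.import_module("reinvent_plugins.components.comp_myscorer")
print(module.__file__)
```

The import succeeds without any __init__.py because Python treats both directories as native namespace packages.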

Ensure that the component can be imported. If it cannot, an error is written to the log file. You can check directly whether the import works:

from reinvent_plugins.components import comp_myscorer
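
If the import fails, the full traceback usually shows whether the problem is the search path or an error inside the plugin file. A minimal sketch of such a check, with `check_import` as a hypothetical helper name:

```python
import importlib
import traceback
from typing import Tuple


def check_import(module_name: str) -> Tuple[bool, str]:
    """Try to import module_name; return (True, "") or (False, traceback text)."""
    try:
        importlib.import_module(module_name)
    except Exception:
        # Catch everything: a plugin can fail with errors other than ImportError.
        return False, traceback.format_exc()
    return True, ""


ok, details = check_import("reinvent_plugins.components.comp_myscorer")
print("import ok" if ok else details)
```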

Unit and Integration Tests

This is primarily for developers and admins/users who wish to ensure that the
installation works. The information here is not relevant to the practical use
of REINVENT. Please refer to Basic Usage for instructions on how to use the
reinvent command.

The REINVENT project uses the pytest framework for its tests. Before you run
them you first have to create a configuration file for the tests.

In the project directory, create a config.json file in the configs/ directory.
You can use the example config example.config.json as a base. Make sure that
you set MAIN_TEST_PATH to a non-existent directory. That is where temporary
files will be written during the tests. If it is set to an existing directory,
that directory will be removed once the tests have finished.
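
Because an existing MAIN_TEST_PATH directory is removed after the tests, it can be worth guarding against that when creating the configuration. The sketch below assumes that MAIN_TEST_PATH is a top-level key in example.config.json; `write_test_config` is a hypothetical helper.

```python
import json
from pathlib import Path


def write_test_config(example: Path, target: Path, main_test_path: Path) -> dict:
    """Copy the example config to target with MAIN_TEST_PATH set to a new directory."""
    if main_test_path.exists():
        # Refuse to continue: an existing directory would be deleted after the tests.
        raise FileExistsError(f"{main_test_path} exists and would be removed after the tests")

    config = json.loads(example.read_text())
    config["MAIN_TEST_PATH"] = str(main_test_path)  # assumed to be a top-level key
    target.write_text(json.dumps(config, indent=4) + "\n")
    return config
```

From the project directory this could be called as `write_test_config(Path("configs/example.config.json"), Path("configs/config.json"), Path("/tmp/reinvent_tests"))`, where the last path is only an example.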

Some tests require a proprietary OpenEye license. You have to set up a few
things to make the tests read your license. The simple way is to just set the
OE_LICENSE environment variable to the path of the file containing the
license.
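
Whether the variable points to a readable file can be verified before running the tests. A minimal sketch using only the standard library; `find_oe_license` is a hypothetical helper and does not validate the license itself.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_oe_license(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the path in OE_LICENSE if it names a readable file, else None."""
    value = env.get("OE_LICENSE")
    if not value:
        return None
    path = Path(value)
    if path.is_file() and os.access(path, os.R_OK):
        return path
    return None


print("OpenEye license file found:", find_oe_license() is not None)
```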

Once you have a configuration and your license can be read, you can run the tests.

$ pytest tests --json /path/to/config.json --device cuda

Languages

Python 99.6%, Shell 0.4%
Apache License 2.0
Created October 20, 2023
Updated March 10, 2026