TurakhiaLab/DIPPER
Distance-based Phylogenetic Placer
Introduction
DIPPER (DIstance-based Phylogenetic PlacER) is a tool for ultrafast and ultralarge phylogenetic reconstruction on GPUs, designed to maintain high accuracy with a minimal memory footprint. DIPPER introduces several innovations, including a divide-and-conquer strategy, a new placement algorithm, and an on-the-fly distance calculator that dynamically enables selective distance computation. DIPPER consistently outperforms existing distance-based methods in speed, accuracy, and memory efficiency. In addition, DIPPER minimizes branch length underestimation for non-additive distance matrices compared to earlier methods and offers a strict mode that completely eliminates the underestimation.
Installation
DIPPER runs on modern Linux and macOS systems, supporting NVIDIA (CUDA) and AMD (HIP/ROCm) GPUs as well as CPU-only execution. Users may choose the installation method suitable for their requirements.
| Platform / Setup | Conda | Script | Docker |
|---|---|---|---|
| Linux (x86_64) | ✅ | ✅ | ✅ |
| Linux (aarch64) | ✅ | ✅ | ✅ |
| macOS (Intel Chip) | ✅ | ✅ | ✅ |
| macOS (Apple Silicon) | ✅ | ✅ | ✅ |
| NVIDIA GPU | ✅ | ✅ | ✅ |
| AMD GPU | ❌ | ✅ | ❌ |
1. Using Conda (Recommended)
DIPPER is available on multiple platforms via Conda. See DIPPER Bioconda Page for details.
i. Dependencies
ii. Create and activate a Conda environment
conda create -n dipper python=3.11 -y
conda activate dipper
# Set up channels
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict
# Install DIPPER
conda install bioconda::dipper
conda install bioconda::dipper_cpu # CPU-only
iii. Run DIPPER
# Inside the conda environment
dipper --help
# dipper_cpu --help # CPU-only
2. Using Docker Image
To run DIPPER in a Docker container, create the container from the prebuilt Docker image by following these steps.
i. Dependencies
ii. Pull and run the DIPPER docker image from DockerHub
## Note: If the Docker image already exists locally, make sure to pull the latest version using
## docker pull swalia14/dipper:latest # (NVIDIA-GPUs)
## docker pull swalia14/dipper_cpu:latest # (CPU-only)
## If the Docker image does not exist locally, the following command will pull and run the latest version
docker run -it --gpus all swalia14/dipper:latest # (NVIDIA-GPUs)
docker run -it swalia14/dipper_cpu:latest # (CPU-only)
iii. Run DIPPER
# Inside the docker container (path: /home/DIPPER/bin)
dipper --help
# dipper_cpu --help # CPU-only
3. Using DockerFile
A Docker container with the preinstalled DIPPER program can also be built from a Dockerfile by following these steps.
i. Dependencies
ii. Clone the repository and build a docker image
git clone https://github.com/TurakhiaLab/DIPPER.git
cd DIPPER/docker
docker build -t dipper -f Dockerfile .
docker build -t dipper -f Dockerfile_cpu . # CPU-only
iii. Build and run the docker container
docker run -it --gpus all dipper
iv. Run DIPPER
# Inside the docker container (path: /home/DIPPER/bin)
./dipper --help
# dipper_cpu --help # CPU-only
4. Using installation script (requires sudo access)
Users without sudo access are advised to install DIPPER via Docker Image or Dockerfile.
Step 1: Clone the repository
git clone https://github.com/TurakhiaLab/DIPPER.git
cd DIPPER
Step 2: Install dependencies (requires sudo access)
DIPPER depends on the following common system libraries, which are typically pre-installed on most development environments:
- wget
- cmake
- build-essential
- libboost-all-dev
- libtbb-dev
For Ubuntu users with sudo access, if any of the required libraries are missing, you can install them with:
sudo apt install -y wget cmake build-essential libboost-all-dev libtbb-dev
Step 3: Build DIPPER
cd install
chmod +x installUbuntu.sh
./installUbuntu.sh
cd ../
Step 4: The DIPPER executable is located in the bin directory and can be run as follows:
cd bin
./dipper --help
Run DIPPER
For more information about DIPPER's options and usage instructions, see the wiki.
Note: All the files used in the examples below can be found in the DIPPER/dataset directory.
Note: For the CPU-only version, replace ./dipper with ./dipper_cpu in the following commands.
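Choosing between the two executable names can be scripted. The following is a minimal sketch, not part of DIPPER: `pick_dipper_binary` is a hypothetical helper, and it assumes that having `nvidia-smi` or `rocm-smi` on the PATH is a reasonable proxy for a usable GPU. It does not check which DIPPER build is actually installed.

```shell
# Hypothetical helper (not shipped with DIPPER): print the executable name
# to use in the commands below.
# Assumption: nvidia-smi or rocm-smi on PATH indicates a usable GPU.
pick_dipper_binary() {
    if command -v nvidia-smi >/dev/null 2>&1 || command -v rocm-smi >/dev/null 2>&1; then
        echo "./dipper"
    else
        echo "./dipper_cpu"
    fi
}
```

For example, `"$(pick_dipper_binary)" --help` would then run whichever executable was selected.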
Change into the bin directory (assuming $DIPPER_HOME points to the DIPPER repository directory). For the docker container $DIPPER_HOME is /home/DIPPER/bin.
cd $DIPPER_HOME/bin
./dipper -h
De-novo phylogeny construction
DIPPER supports de-novo construction of phylogenies from unaligned or aligned sequences in FASTA format, or from a distance matrix in PHYLIP format.
Default mode
In default mode, DIPPER constructs phylogeny using:
- Conventional NJ for sequences/tips < 30,000
- Placement technique for sequences/tips >= 30,000 and < 1,000,000
- Divide-and-conquer technique for sequences/tips >= 1,000,000
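To see which technique default mode would pick for a given input, the thresholds listed above can be sketched as a small shell helper. Both function names are hypothetical and not part of DIPPER; the tip count is estimated by counting FASTA header lines.

```shell
# Hypothetical helper: print the technique that default mode would select
# for a given number of tips, following the thresholds listed above.
dipper_default_technique() {
    local n="$1"
    if [ "$n" -lt 30000 ]; then
        echo "nj"
    elif [ "$n" -lt 1000000 ]; then
        echo "placement"
    else
        echo "divide-and-conquer"
    fi
}

# Hypothetical helper: count sequences in a FASTA file by counting
# header lines (lines starting with '>').
count_fasta_tips() {
    grep -c '^>' "$1"
}
```

For instance, `dipper_default_technique "$(count_fasta_tips ../dataset/t2.unaligned.fa)"` reports the technique expected for that example file.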
From unaligned sequences
Usage syntax
./dipper -i r -o t -I <path to unaligned sequences FASTA file> -O <path to output file>
Example
./dipper -i r -o t -I ../dataset/t2.unaligned.fa -O tree.nwk
From aligned sequences
Usage syntax (using JC model)
./dipper -i m -o t -d 2 -I <path to aligned sequences FASTA file> -O <path to output file>
Example
./dipper -i m -o t -d 2 -I ../dataset/t1.aligned.fa -O tree.nwk
From distance matrix
Usage syntax
./dipper -i d -o t -I <path to distance matrix PHYLIP file> -O <path to output file>
Example
./dipper -i d -o t -I ../dataset/t2.phy -O tree.nwk
Construct phylogeny using placement technique
DIPPER can be forced to construct the phylogeny with the placement technique by setting the -m option to 1. The syntax and example below use unaligned sequences as input, but aligned sequences and distance matrices are also supported.
Usage syntax
./dipper -i r -o t -m 1 -I <path to unaligned sequences FASTA file> -O <path to output file>
Example
./dipper -i r -o t -m 1 -I ../dataset/t2.unaligned.fa -O tree.nwk
Construct phylogeny using divide-and-conquer technique
DIPPER can be forced to construct the phylogeny with the divide-and-conquer technique by setting the -m option to 3. The syntax and example below use unaligned sequences as input, but aligned sequences and distance matrices are also supported.
Usage syntax
./dipper -i r -o t -m 3 -I <path to unaligned sequences FASTA file> -O <path to output file>
Example
./dipper -i r -o t -m 3 -I ../dataset/t2.unaligned.fa -O tree.nwk
Adding tips (sequences) to a backbone tree
DIPPER allows users to add tips to an existing backbone tree using the placement technique. It requires tip sequences from the backbone tree and input query sequences to be provided in a single file (FASTA format), along with the input tree in Newick format.
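If the backbone tip sequences and the query sequences are stored in separate files, they need to be merged into a single FASTA file first. The following is a minimal sketch of that step: `combine_fasta` and its arguments are hypothetical and not part of DIPPER. The check for duplicate sequence names is an assumption about what makes a sensible input, not a documented DIPPER requirement.

```shell
# Hypothetical helper (not part of DIPPER): merge backbone-tip and query
# FASTA files into one input file.
# Assumption: sequence names should be unique across both files, so the
# helper fails if any header line appears more than once.
combine_fasta() {
    local backbone="$1" queries="$2" out="$3"
    # 'awk 1' copies every line and adds a trailing newline if one is
    # missing, so the last sequence of the first file is not fused with
    # the first header of the second file.
    awk 1 "$backbone" "$queries" > "$out"
    local dups
    dups=$(grep '^>' "$out" | sort | uniq -d)
    if [ -n "$dups" ]; then
        echo "duplicate sequence names:" >&2
        echo "$dups" >&2
        return 1
    fi
}
```

The merged file can then be passed to the -I option together with the backbone tree, for example after `combine_fasta backbone_tips.fa queries.fa combined.fa`, where all three file names are placeholders.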
Usage syntax
./dipper -i r -o t -m 1 --add -I <path to unaligned/aligned sequences FASTA file (containing backbone tree tip sequences and query sequences)> -O <path to output file> -t <path to input tree>
Example
./dipper -i r -o t -m 1 --add -I ../dataset/t2.unaligned.fa -O tree.nwk -t ../dataset/backbone.nwk
Reproduce DIPPER results
To reproduce DIPPER results provided here: https://zenodo.org/records/17259722, follow the instructions provided in scripts/reproduce_results.sh
Contributions
We welcome contributions from the community to enhance the capabilities of DIPPER. If you encounter any issues or have suggestions for improvement, please open an issue on the DIPPER GitHub page. For general inquiries and support, reach out to our team.
Citing DIPPER
If you use DIPPER in your research or publications, we kindly request that you cite the following paper:
- Sumit Walia, Zexing Chen, Yu-Hsiang Tseng, Yatish Turakhia, "Ultrafast and Ultralarge Distance-Based Phylogenetics Using DIPPER", bioRxiv 2025.08.12.669583; doi: https://doi.org/10.1101/2025.08.12.669583
