Shall-We-Dance/NCBI_sra_download
NCBI SRA Download Script
A Bash script for downloading sequencing data from the NCBI SRA database, converting .sra files to .fastq, compressing the output, and cleaning up intermediate files. This version adds parallel processing, progress bars, error handling, command-line configuration, and enhanced logging.
Features
- Batch download of SRA data from NCBI using a list of SRR accession numbers.
- FASTQ conversion via fasterq-dump with progress monitoring.
- Parallel compression of FASTQ files with pigz.
- Automatic cleanup of intermediate SRA folders.
- Fail-safe mechanism that logs failed SRR downloads or conversions.
- Adjustable CPU and memory usage per job.
- Color-coded log output to indicate download, warning, and error statuses.
- Automatically activates a conda environment with the required tools.
- Built-in --help system and input validation.
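The per-accession workflow listed above can be sketched roughly as follows. This is an illustration only, not the script's actual code: the function name `process_srr`, the `*_BIN` override variables, and the way failures are written to the log are assumptions. The `prefetch`, `fasterq-dump`, and `pigz` options used are standard ones from sra-tools and pigz.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of one download -> convert -> compress -> clean-up job.
# Tool names can be overridden (e.g. for a dry run); defaults are the real CLIs.
PREFETCH_BIN="${PREFETCH_BIN:-prefetch}"
FASTERQ_BIN="${FASTERQ_BIN:-fasterq-dump}"
PIGZ_BIN="${PIGZ_BIN:-pigz}"
FAILED_LOG="${FAILED_LOG:-sra_failed.log}"

process_srr() {
    local srr="$1" outdir="${2:-.}" threads="${3:-4}" mem="${4:-4G}"
    local fq found=0

    # 1. Download the accession into <outdir>/<srr>/
    if ! "$PREFETCH_BIN" "$srr" --output-directory "$outdir"; then
        echo "$srr" >> "$FAILED_LOG"
        return 1
    fi

    # 2. Convert the downloaded run to FASTQ, showing a progress bar
    if ! "$FASTERQ_BIN" "$outdir/$srr" --outdir "$outdir" \
            --threads "$threads" --mem "$mem" --progress; then
        echo "$srr" >> "$FAILED_LOG"
        return 1
    fi

    # 3. Compress each FASTQ file (single-end or _1/_2 paired files)
    for fq in "$outdir/$srr".fastq "$outdir/$srr"_*.fastq; do
        [ -e "$fq" ] || continue
        found=1
        if ! "$PIGZ_BIN" -p "$threads" "$fq"; then
            echo "$srr" >> "$FAILED_LOG"
            return 1
        fi
    done
    if [ "$found" -ne 1 ]; then
        # Conversion reported success but produced no FASTQ output
        echo "$srr" >> "$FAILED_LOG"
        return 1
    fi

    # 4. Remove the intermediate .sra directory
    rm -rf "${outdir:?}/$srr"
}
```

Running several such jobs at once (the `--parallel` setting) could be layered on top, for example with `xargs -P` or background jobs; how the script itself schedules jobs is not shown here.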
Usage
NCBI SRA Download Script v2.2.0
Usage: ncbi_sra_download.sh <srr_list.txt> [--cores N] [--parallel N] [--mem SIZE] [--env NAME] [-h|--help]
Downloads and converts SRA files to compressed FASTQ in parallel.
Required:
<srr_list.txt> A file containing one SRR accession ID per line.
Options:
--cores N Number of CPU threads per conversion job (default: max/4)
--parallel N Number of parallel downloads/conversions (default: 4)
--mem SIZE Memory per conversion job, e.g., 8G, 4096M (default: 4G)
--env NAME Name of the Conda environment to use (default: ncbi_sra_download)
-h, --help Show this help message and exit
Example
# Download and convert SRR IDs listed in SRR_Acc_list.txt using default settings
./ncbi_sra_download.sh SRR_Acc_list.txt
# Download with custom settings: 8 cores, 2 parallel jobs, 8GB memory, and a specific conda environment
./ncbi_sra_download.sh SRR_Acc_list.txt --cores 8 --parallel 2 --mem 8G --env your_env_name
Requirements
Software
Conda Environment
Example conda environment (ncbi_sra_download) configuration:
conda create -n ncbi_sra_download sra-tools pigz pv -c bioconda
Output
- .fastq.gz files will be saved in the same directory as the input SRR list.
- Temporary .sra directories are automatically removed after conversion.
- A log file (sra_download.log) is generated, with detailed progress and errors.
- Failed SRRs are logged in a separate file (sra_failed.log), created only if there are failures.
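One way the two log files described above could be produced is sketched below. The helper names `log_msg` and `record_failure`, the timestamp format, the color choices, and sending colored output to stderr are all assumptions made for illustration; the script's real logging code may differ.

```shell
#!/usr/bin/env bash
# Hypothetical logging helpers; names and message format are assumptions.
LOG_FILE="${LOG_FILE:-sra_download.log}"
FAILED_LOG="${FAILED_LOG:-sra_failed.log}"

# log_msg LEVEL message...
# Prints a color-coded line to the terminal and a plain copy to the log file.
log_msg() {
    local level="$1" color line
    shift
    case "$level" in
        INFO)  color='\033[0;32m' ;;   # green
        WARN)  color='\033[0;33m' ;;   # yellow
        ERROR) color='\033[0;31m' ;;   # red
        *)     color='' ;;
    esac
    line="$(date '+%Y-%m-%d %H:%M:%S') [$level] $*"
    printf '%b%s%b\n' "$color" "$line" '\033[0m' >&2
    printf '%s\n' "$line" >> "$LOG_FILE"
}

# record_failure SRR reason
# Logs the error and appends the accession to the failure list.
# Because the file is only ever appended to here, it exists only
# if at least one accession failed.
record_failure() {
    local srr="$1" reason="$2"
    log_msg ERROR "$srr failed: $reason"
    printf '%s\n' "$srr" >> "$FAILED_LOG"
}
```

Keeping the color codes out of the log file means it stays readable in a plain text editor or with `grep`.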
Notes
- Default settings use moderate CPU/memory. Customize with --cores, --parallel, and --mem.
- The script prints its configuration at runtime.
- The script must be executable:
chmod +x ncbi_sra_download.sh
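For readers curious how the documented options and defaults might be handled, here is a minimal sketch of an option parser. It is not taken from the script: the function name `parse_args`, the variable names, the use of `getconf` to count CPUs, and the exact validation rules are assumptions. Only the flag names and default values come from the usage text.

```shell
#!/usr/bin/env bash
# Hypothetical option parser mirroring the documented flags and defaults.
parse_args() {
    local total num
    # Assumed way of finding the CPU count; falls back to 4 if unavailable.
    total="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 4)"

    SRR_LIST=""
    CORES=$(( total / 4 > 0 ? total / 4 : 1 ))   # documented default: max/4
    PARALLEL=4
    MEM="4G"
    ENV_NAME="ncbi_sra_download"

    while [ "$#" -gt 0 ]; do
        case "$1" in
            --cores|--parallel|--mem|--env)
                if [ "$#" -lt 2 ]; then
                    echo "Error: $1 requires a value" >&2
                    return 2
                fi
                case "$1" in
                    --cores)    CORES="$2" ;;
                    --parallel) PARALLEL="$2" ;;
                    --mem)      MEM="$2" ;;
                    --env)      ENV_NAME="$2" ;;
                esac
                shift 2 ;;
            -h|--help)
                echo "Usage: ncbi_sra_download.sh <srr_list.txt> [--cores N] [--parallel N] [--mem SIZE] [--env NAME]"
                exit 0 ;;
            -*)
                echo "Error: unknown option $1" >&2
                return 2 ;;
            *)
                SRR_LIST="$1"
                shift ;;
        esac
    done

    if [ -z "$SRR_LIST" ]; then
        echo "Error: missing <srr_list.txt>" >&2
        return 2
    fi
    # --cores and --parallel must be positive integers
    for num in "$CORES" "$PARALLEL"; do
        case "$num" in
            ""|0|*[!0-9]*) echo "Error: expected a positive integer, got '$num'" >&2; return 2 ;;
        esac
    done
    # --mem must be digits followed by G or M, e.g. 8G or 4096M
    num="${MEM%[GM]}"
    case "$num" in
        ""|*[!0-9]*) echo "Error: invalid --mem value '$MEM'" >&2; return 2 ;;
    esac
    if [ "$num" = "$MEM" ]; then
        echo "Error: --mem needs a G or M suffix" >&2
        return 2
    fi

    # Print the effective configuration, as the script does at runtime
    echo "Config: list=$SRR_LIST cores=$CORES parallel=$PARALLEL mem=$MEM env=$ENV_NAME"
}
```

Validating values before any download starts means a typo such as `--mem 8` is reported immediately rather than after the first conversion job fails.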
Troubleshooting
- Check that conda and all required tools are available in your PATH.
- Use --help to check valid usage.
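A quick way to perform the PATH check above by hand is sketched here. The function name `check_tools` is hypothetical and not part of the script; it relies only on the shell built-in `command -v`.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: report every listed tool missing from PATH.
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "Missing from PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Typical call (prefetch and fasterq-dump ship with sra-tools):
#   check_tools conda prefetch fasterq-dump pigz pv || exit 1
```

If a tool is reported missing even though it is installed in the conda environment, the environment is probably not active in the current shell.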
License
MIT License
Contact
For questions, issues, or contributions, please open an issue or pull request on GitHub.
Created May 14, 2025
Updated February 9, 2026