GitHunt
VO

volpato30/MSRCall

MSRCall

MSRCall: A Multi-scale Deep Neural Network for Basecalling Oxford Nanopore Sequences.

Preparation

Data folder

You can put your own test reads in fast5 format in the dataset folder and modify the path in script run_0_preprocee_testsets.sh
In our paper, we used dataset from Ryan Wick et. al. that can be downloaded from this website:
https://doi.org/10.26180/5c5a5fa08bbee

Required packages

Our code is tested on cuda 10.0, cudnn 7.6

Please install ctcdecode from:
https://github.com/parlance/ctcdecode.git
Install required python packages:

pip install -r requirement.txt

optional software:
minimap2

Run

Preprocess test data

bash ./run_0_preprocee_testsets.sh

The preprocessed .npy files are put in the preprocess_test directory.

Run basecalling

python call.py -model exp_backup/MSRCall/MSRCall.chkpt -records_dir preprocessed_test/Acinetobacter_pittii_16_377_0801/ -output MSRCall_out

You can change the test data by replacing Acinetobacter_pittii_16_377_0801 with your own filename.
Basecalled results are stored in the MSRCall_out folder.

Dataset reference:

Wick, R. R., Judd, L. M., & Holt, K. E. (2019). Performance of neural network basecalling tools for Oxford Nanopore sequencing. Genome biology, 20(1), 1-10.

Contributors

Created November 6, 2022
Updated November 5, 2022
volpato30/MSRCall | GitHunt