PyInfernal
Cython bindings and Python interface to Infernal.
Overview
Infernal is a biological sequence analysis method that uses profile stochastic
context-free grammars called covariance models (CMs) to identify RNA structure and sequence similarities.
Infernal was developed by Eric P. Nawrocki
during his PhD thesis in the Eddy Laboratory.
pyinfernal is a Python package, implemented using the Cython
language, that provides bindings to Infernal. It directly interacts with the
Infernal internals. It builds on top of pyhmmer
and follows a generally similar interface.
This library is still very experimental and has not been thoroughly tested yet; use with caution.
Installing
pyinfernal can be installed from PyPI,
which hosts pre-built CPython wheels for Linux and macOS on x86-64 and Arm64, as well as the code required to compile from source with Cython:
$ pip install pyinfernal
Compilation for UNIX PowerPC is not tested in CI, but should work out of the
box. Note that non-UNIX operating systems (such as Windows) are not
supported by Infernal.
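To confirm that the installation succeeded, a minimal sketch using only the Python standard library is shown here. The helper name `pyinfernal_status` is hypothetical and not part of pyinfernal; it only checks the platform and looks up the installed distribution metadata.

```python
import sys
from importlib import metadata


def pyinfernal_status(package="pyinfernal"):
    """Return a short, human-readable installation status for `package`."""
    # Infernal only supports UNIX-like systems, so flag Windows early.
    if sys.platform.startswith("win"):
        return f"{package}: unsupported platform ({sys.platform})"
    try:
        # Look up the version recorded in the installed distribution metadata.
        return f"{package} {metadata.version(package)} is installed"
    except metadata.PackageNotFoundError:
        return f"{package} is not installed; try `pip install {package}`"


print(pyinfernal_status())
```

This avoids importing the package itself, so it also works when the compiled extension failed to build.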
Citation
PyInfernal is scientific software, and builds on top of Infernal. Please cite
the Infernal 1.1 application note
in Bioinformatics, for instance:
PyInfernal, a Python library binding to Infernal (Nawrocki & Eddy, 2013).
Also refer to the Infernal User's Guide
which contains a section about citation and reproducibility.
Example
Use pyinfernal to run cmsearch and search the genome of
Escherichia coli str. K-12 substr. MG1655 (U00096.3)
for models from Rfam. This will produce an iterable
over TopHits that can be used for further sorting/querying in Python.
Processing happens in parallel using Python threads,
and a TopHits object is yielded for every CM in the input iterable.
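The "one result per query, computed in worker threads" pattern can be illustrated with the standard library alone. This is a sketch of the general idea, not pyinfernal's actual implementation: `search_one` and `search_all` are hypothetical stand-ins, and a "hit" here is simply a target string that contains the query.

```python
from concurrent.futures import ThreadPoolExecutor


def search_one(query, targets):
    """Hypothetical stand-in for searching one model against all targets."""
    # A "hit" is any target string that contains the query string.
    return [t for t in targets if query in t]


def search_all(queries, targets, cpus=4):
    """Yield one list of hits per query, in input order, using worker threads."""
    with ThreadPoolExecutor(max_workers=cpus) as pool:
        # Executor.map runs the calls concurrently but yields the results
        # in the same order as the input queries.
        yield from pool.map(lambda q: search_one(q, targets), queries)


queries = ["GC", "AU", "UU"]
targets = ["GGCAU", "AUGCC", "CCGGA"]
for query, hits in zip(queries, search_all(queries, targets)):
    print(f"query {query} found {len(hits)} hits")
```

With pyinfernal itself, the search described above reads: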
import pyhmmer.easel
import pyinfernal.cm
import pyinfernal.infernal

# Load every target sequence into memory, digitized with the RNA alphabet.
rna = pyhmmer.easel.Alphabet.rna()
with pyhmmer.easel.SequenceFile("U00096.3.fna", digital=True, alphabet=rna) as seq_file:
    sequences = seq_file.read_block()

# Search each covariance model against the targets; one TopHits is yielded per CM.
with pyinfernal.cm.CMFile("RFam.cm", alphabet=rna) as cm_file:
    for hits in pyinfernal.cmsearch(cm_file, sequences, cpus=4):
        print(f"CM {hits.query.name} ({hits.query.accession}) found {len(hits)} hits in the target sequences")

Feedback
Issue Tracker
Found a bug? Have an enhancement request? Head over to the GitHub issue
tracker if you need to report
or ask something. If you are filing a bug report, please include as much
information as you can about the issue, and try to recreate the same bug
in a simple, easily reproducible situation.
Contributing
Contributions are more than welcome! See CONTRIBUTING.md for more details.
License
This library is provided under the MIT License.
The Infernal code is available under the
BSD 3-clause license.
See vendor/infernal/LICENSE for more information.
This project is in no way affiliated, sponsored, or otherwise endorsed by
the original Infernal authors. It was developed by
Martin Larralde during his PhD project
at the Leiden University Medical Center in
the Zeller Lab.