ZhaohanM/ExplainBind

Explainable Physicochemical Determinants of Protein–Ligand Binding via Non-Covalent Interactions

🔥 News

  • [March 2026] ⛳ Our preprint is now available on bioRxiv.
  • [Feb 2026] 🚀 ExplainBind demo UI is now live on Hugging Face Spaces!

🧩 Overview

ExplainBind is an interaction-aware framework for protein–ligand binding (PLB) prediction.
It supervises token-level cross-attention using non-covalent interaction maps (e.g. hydrogen bonds, salt bridges, hydrophobic contacts, van der Waals, π–π, and cation–π interactions) derived from curated PDB protein–ligand complexes in InteractBind.
By aligning model attention with these physically grounded signals, ExplainBind transforms PLB prediction from black-box reasoning into a chemistry-grounded process suitable for large-scale screening.
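
As a rough illustration of what supervising cross-attention with an interaction map can look like, the sketch below turns a binary protein-token by ligand-token contact map into a target distribution and scores an attention map against it with cross-entropy. The function names, the choice of cross-entropy, and the toy values are all assumptions made for illustration; they are not taken from the ExplainBind code, whose actual loss may differ.

```python
import math

def normalise(matrix):
    """Scale a non-negative matrix so that all entries sum to 1."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("interaction map has no contacts")
    return [[value / total for value in row] for row in matrix]

def attention_alignment_loss(attention, interaction_map, eps=1e-9):
    """Cross-entropy between a target contact distribution and an attention map.

    attention:       protein-token x ligand-token weights that sum to 1
    interaction_map: same shape, 1 where a non-covalent contact exists, else 0
    (Hypothetical helper; the real ExplainBind objective is not shown in this README.)
    """
    target = normalise(interaction_map)
    loss = 0.0
    for att_row, tgt_row in zip(attention, target):
        for a, t in zip(att_row, tgt_row):
            loss -= t * math.log(a + eps)
    return loss

# Toy example: 2 protein tokens x 3 ligand tokens, two annotated contacts.
contacts = [[1, 0, 0],
            [0, 0, 1]]
aligned = [[0.45, 0.02, 0.03],
           [0.03, 0.02, 0.45]]           # attention mass sits on the contacts
diffuse = [[1 / 6] * 3, [1 / 6] * 3]     # attention spread uniformly

aligned_loss = attention_alignment_loss(aligned, contacts)
diffuse_loss = attention_alignment_loss(diffuse, contacts)
print(f"aligned: {aligned_loss:.4f}  diffuse: {diffuse_loss:.4f}")
```

In this toy setting the attention map that concentrates on annotated contacts receives the lower loss, which is the behaviour an alignment term of this kind is meant to reward.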

[Figure: ExplainBind framework]

⚙️ Installation

Tip

Clone this GitHub repo and set up a new conda environment.

# clone the source code of ExplainBind
$ git clone https://github.com/ZhaohanM/ExplainBind.git
$ cd ExplainBind

# create a new conda environment
$ conda create --name ExplainBind python=3.9
$ conda activate ExplainBind

# install required Python dependencies
$ pip install -r requirements.txt

Requires: Python ≥ 3.9 and a CUDA-compatible GPU.

⚡ Quick Start

Command-Line Inference

bash run.sh

🔬 Foundation Models

🧬 Protein Foundation Models

| Model Name | HuggingFace Link | Input Type |
|---|---|---|
| ESM2 | facebook/esm2_t33_650M_UR50D | Amino Acid Sequence |
| SaProt | westlake-repl/SaProt_650M_AF2 | Structure-aware Sequence |
| SaProt | westlake-repl/SaProt_650M_PDB | Structure-aware Sequence |

💊 Molecular Foundation Models

| Model Name | HuggingFace Link | Input Type |
|---|---|---|
| MoLFormer-XL | ibm-research/MoLFormer-XL-both-10pct | SMILES |
| SELFormer | HUBioDataLab/SELFormer | SELFIES |
| SELFIES-TED | ibm-research/materials.selfies-ted | SELFIES |

Note

All foundation models remain frozen. ExplainBind trains only the Fusion Module, which is supervised with structure-derived attention maps, and the Classifier.

🧫 Dataset

We provide nine benchmarks with ground-truth residue-level interaction maps for evaluating PLI prediction. They will be released soon.

| Dataset | Type | Example Use |
|---|---|---|
| InteractBind (affinity) | Affinity score splits | Evaluate in-domain performance |
| InteractBind-P-25%/28%/31%/33% | Protein similarity splits | Evaluate sequence-level generalisation |
| InteractBind-L-08%/35%/40%/59% | Ligand similarity splits | Evaluate sequence-level generalisation |

📚 Acknowledgments

This work was supported in part by National Institutes of Health grants HL155107 and HL166137, and by American Heart Association MERIT award AHA1185447 to JL.
K.Y. acknowledges support from Cancer Research UK (EDDPGM-Nov21/100001, DRCMDP-Nov23/100010 and core funding to the CRUK Scotland Institute (A31287)), BBSRC BB/V016067/1, Prostate Cancer UK MA-TIA22-001 and EU Horizon 2020 grant ID: 101016851.


📜 License

This project is licensed under the MIT License — see the LICENSE file for details.


🧰 Intended Use

ExplainBind is designed to assist computational biologists, AI researchers, and drug-discovery scientists in analysing and explaining molecular interactions.

Applications

  • 🔬 Drug Discovery — Identify explainable binding fingerprints between novel compounds and proteins.
  • 🧠 Model Explainability — Quantify token-level biological grounding via attention-map supervision.
  • 🧪 Cross-Domain Generalisation — Diagnose prediction drop-offs across protein similarity strata.
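
One simple way to put a number on how well attention is grounded in annotated contacts is a top-k precision: the fraction of the k highest-attention protein-token and ligand-token pairs that coincide with a known non-covalent interaction. The sketch below is an illustrative assumption, not a metric confirmed by this README; the function name `top_k_precision` and the toy values are hypothetical.

```python
def top_k_precision(attention, interaction_map, k):
    """Fraction of the k highest-attention cells that are annotated contacts.

    attention:       protein-token x ligand-token attention weights
    interaction_map: same shape, 1 where a contact is annotated, else 0
    (Hypothetical helper for illustration only.)
    """
    cells = [
        (weight, i, j)
        for i, row in enumerate(attention)
        for j, weight in enumerate(row)
    ]
    if not 1 <= k <= len(cells):
        raise ValueError("k must be between 1 and the number of cells")
    # Sort by descending weight; ties are broken by (i, j) for determinism.
    cells.sort(key=lambda cell: (-cell[0], cell[1], cell[2]))
    hits = sum(interaction_map[i][j] for _, i, j in cells[:k])
    return hits / k

# Toy example: 2 protein tokens x 3 ligand tokens, two annotated contacts.
contacts = [[1, 0, 0],
            [0, 0, 1]]
attention = [[0.40, 0.05, 0.05],
             [0.10, 0.05, 0.35]]

print(top_k_precision(attention, contacts, k=2))
print(top_k_precision(attention, contacts, k=3))
```

Here the two strongest attention cells are both annotated contacts, while widening to k=3 pulls in one cell with no annotated interaction, so the score drops.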

Important

This framework is intended solely for research purposes and should not be used for clinical decision-making.
