Silent DL Bugs
A research project for detecting silent bugs in deep learning (PyTorch) programs through mutation testing and static analysis.
Overview
Silent bugs in deep learning are subtle programming errors that don't cause immediate crashes but can lead to incorrect model behavior, reduced performance, or unexpected training dynamics. This project provides tools and datasets to systematically study and detect such bugs in PyTorch-based deep learning programs.
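As a concrete illustration, the sketch below imitates one well-known silent bug: forgetting to reset accumulated gradients between optimization steps. It deliberately avoids PyTorch so it runs anywhere; `Param`, `backward`, and `train` are hypothetical stand-ins written for this example and are not code from this repository. Both variants run without raising an error, which is exactly what makes the bug silent.

```python
class Param:
    """Scalar parameter whose gradient accumulates across backward calls,
    imitating how PyTorch accumulates into .grad (an assumption of this sketch)."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


def backward(p):
    # Gradient of the loss (w - 3)^2 is 2 * (w - 3); it is added to the
    # existing gradient instead of overwriting it.
    p.grad += 2.0 * (p.value - 3.0)


def train(steps, lr, zero_grad):
    """Minimize (w - 3)^2 by gradient descent, optionally resetting the gradient."""
    p = Param(0.0)
    for _ in range(steps):
        if zero_grad:
            p.grad = 0.0  # the call that the buggy variant forgets
        backward(p)
        p.value -= lr * p.grad
    return p.value


correct = train(steps=50, lr=0.1, zero_grad=True)
buggy = train(steps=50, lr=0.1, zero_grad=False)
print(f"with gradient reset:    {correct:.4f}")
print(f"without gradient reset: {buggy:.4f}")
```

With the reset, the parameter settles near the minimum at 3. Without it, the stale gradients act like undamped momentum and the parameter keeps oscillating around the minimum, yet nothing crashes or warns.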
Key Features
- Comprehensive Dataset: 16 diverse PyTorch programs covering various domains (CNN, RNN, GAN, VAE, RL, GNN, etc.)
- Mutation Testing Framework: Three types of mutants (absence, live, and syntax) to study different bug patterns
- PyTorch API Checker: Custom Pylint plugin for detecting PyTorch-specific patterns
- Automated Analysis: Scripts for running experiments and analyzing results
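To make the idea of an absence mutant (a program with an API call removed) more tangible, here is a minimal sketch built on Python's standard `ast` module. It is an assumption about how such a mutation could be done, not a description of `src/absence_mutate_pyfile.py`; the names `RemoveCall` and `make_absence_mutant` are hypothetical.

```python
import ast


class RemoveCall(ast.NodeTransformer):
    """Replace statements of the form `<obj>.<method_name>(...)` with `pass`."""

    def __init__(self, method_name):
        self.method_name = method_name
        self.removed = 0

    def visit_Expr(self, node):
        call = node.value
        if (
            isinstance(call, ast.Call)
            and isinstance(call.func, ast.Attribute)
            and call.func.attr == self.method_name
        ):
            self.removed += 1
            # Using `pass` keeps the enclosing block syntactically valid
            # even when the removed call was its only statement.
            return ast.Pass()
        return node


def make_absence_mutant(source, method_name):
    """Return (mutated_source, number_of_removed_calls)."""
    tree = ast.parse(source)
    transformer = RemoveCall(method_name)
    tree = transformer.visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree), transformer.removed


original = (
    "for batch in loader:\n"
    "    optimizer.zero_grad()\n"
    "    loss.backward()\n"
    "    optimizer.step()\n"
)
mutant, removed = make_absence_mutant(original, "zero_grad")
print(mutant)
```

The mutant still parses and would still run, which is why such removals tend to produce silent rather than crashing bugs. Note that `ast.unparse` requires Python 3.9 or later and does not preserve comments or original formatting.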
Dataset
The dataset contains 16 PyTorch programs (~2,700 lines total) covering different deep learning domains:
| Program | Domain | Description |
|---|---|---|
| p1_mnist.py | Computer Vision | Basic CNN for MNIST classification |
| p2_word_language_model.py | NLP | LSTM-based language model |
| p3_imagenet.py | Computer Vision | ImageNet classification with ResNet |
| p4_dcgan.py | Generative Models | Deep Convolutional GAN |
| p5_vae.py | Generative Models | Variational Autoencoder |
| p6_super_resolution.py | Computer Vision | Super-resolution CNN |
| p7_mnist_hogwild.py | Distributed Training | Multi-process MNIST training |
| p8_reinforcement_learning.py | Reinforcement Learning | Actor-Critic algorithm |
| p9_time_sequence_prediction.py | Time Series | LSTM for sequence prediction |
| p10_fast_neural_style.py | Computer Vision | Neural style transfer |
| p11_siamese_network.py | Computer Vision | Siamese network for similarity |
| p12_mnist_forward_forward.py | Novel Architectures | Forward-Forward algorithm |
| p13_regression.py | Regression | Simple neural network regression |
| p14_gat.py | Graph Neural Networks | Graph Attention Network |
| p15_gcn.py | Graph Neural Networks | Graph Convolutional Network |
| p16_mnist_rnn.py | Computer Vision + RNN | RNN-based MNIST classifier |
Project Structure
├── dataset/ # Original PyTorch programs (16 files)
├── absence_mutants/ # Mutants with removed API calls
├── live_mutants/ # Mutants that execute but may be incorrect
├── syntax_mutants/ # Syntax-level mutations
├── src/ # Analysis tools and checkers
│ ├── pytorch_api_checker.py # Custom Pylint plugin
│ ├── absence_mutate_pyfile.py # Absence mutation tool
│ ├── live_validate_pyfile.py # Live validation tool
│ └── syntax_validate_pyfile.py # Syntax validation tool
├── scripts/ # Automation scripts
│ ├── run_dataset_programs.sh # Run original programs
│ └── run_all_live_validate_programs.sh # Validate mutants
├── results/ # Output directory for results
├── data/ # Datasets and temporary files
├── pyproject.toml # Project configuration and dependencies
└── README.md # This file
The data/ and results/ directories are ignored by git but will be created when running examples.
Installation
- Prerequisites: Python 3.10+
- Clone the repository:
git clone https://github.com/Wsine/silentDLbugs.git
cd silentDLbugs
- Install dependencies:
Option A: Using UV (recommended):
# Install UV if not available
pip install uv
uv sync
Option B: Using pip:
pip install -e .
Option C: Install dependencies manually:
pip install torch torchvision torchtext pylint ruff numpy matplotlib tqdm
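After installing, a quick sanity check can confirm that the environment is complete. The following is a small sketch, not part of the repository: `missing_packages` and `missing_tools` are hypothetical helpers, and the package list is simply the one from Option C, assuming each library's import name matches its pip name.

```python
import importlib.util
import shutil


def missing_packages(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_tools(names):
    """Return the command-line tools that are not on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Libraries are checked by import name, linters by executable name.
libraries = ["torch", "torchvision", "torchtext", "numpy", "matplotlib", "tqdm"]
tools = ["pylint", "ruff"]

print("Missing libraries:", missing_packages(libraries) or "none")
print("Missing tools:", missing_tools(tools) or "none")
```

When using UV, run the check inside the project environment (for example through `uv run`) so that it inspects the packages installed by `uv sync`.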
Usage
Running the Dataset Programs
Running all dataset programs also downloads the required data:
./scripts/run_dataset_programs.sh
Run specific programs:
./scripts/run_dataset_programs.sh mnist,dcgan,vae
Analyzing Absence Mutants
Check for potential bugs using Ruff:
# If using UV
uv run ruff check absence_mutants/ --select E,F,PLE --ignore E401,F401,E402,F841 --output-format=json > absence_mutants/ruff_check.json
# If using pip
ruff check absence_mutants/ --select E,F,PLE --ignore E401,F401,E402,F841 --output-format=json > absence_mutants/ruff_check.json
Detecting PyTorch API Patterns
Use the custom PyTorch API checker:
# If using UV
PYTHONPATH=./src uvx pylint --load-plugins=pytorch_api_checker --disable=all --enable=W9001 ./live_mutants/*/*.py
# If using pip
PYTHONPATH=./src pylint --load-plugins=pytorch_api_checker --disable=all --enable=W9001 ./live_mutants/*/*.py
Count detected patterns:
PYTHONPATH=./src pylint --load-plugins=pytorch_api_checker --disable=all --enable=W9001 ./live_mutants/*/*.py | grep "live_mutants" | cut -d ":" -f 1 | sort | uniq | wc -l
Running Live Validation
Validate live mutants:
./scripts/run_all_live_validate_programs.sh
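The reports produced by the Ruff and Pylint steps earlier in this section can also be post-processed in Python instead of with shell pipelines. The sketch below is one possible way to do that and is not part of the repository: `count_flagged_files` mirrors the `grep | cut | sort | uniq | wc -l` pipeline on Pylint's default text output, and `summarize_ruff_report` tallies diagnostics per rule code, assuming each entry in Ruff's JSON output carries a `code` field. Both function names are hypothetical.

```python
import json
from collections import Counter


def count_flagged_files(pylint_output, marker="live_mutants"):
    """Count distinct files in Pylint text output.

    Equivalent to: grep marker | cut -d ":" -f 1 | sort | uniq | wc -l
    """
    files = {
        line.split(":", 1)[0]
        for line in pylint_output.splitlines()
        if marker in line
    }
    return len(files)


def summarize_ruff_report(report_json):
    """Count Ruff diagnostics per rule code from --output-format=json text."""
    diagnostics = json.loads(report_json)
    return Counter(entry.get("code") for entry in diagnostics)


if __name__ == "__main__":
    # Path taken from the Ruff command above; the file must already exist.
    with open("absence_mutants/ruff_check.json", encoding="utf-8") as handle:
        for code, count in summarize_ruff_report(handle.read()).most_common():
            print(f"{code}: {count}")
```

To use `count_flagged_files`, capture the Pylint output as text first, for example by redirecting it to a file and reading that file in Python.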