umitkacar/multimodal-deepfake-detector
EfficientNet+MTCNN deepfake detection achieving 87% accuracy with 12.96% EER - Multi-modal video forensics for misinformation prevention
🎭 DeepFake Detection using EfficientNet
Features • Performance • Installation • Usage • Models • Research • Contributing
🌟 Overview
DeepFake-EfficientNet is a cutting-edge deep learning solution for detecting manipulated facial content using the powerful EfficientNet architecture. This project leverages state-of-the-art computer vision techniques combined with MTCNN face detection to achieve high accuracy in identifying deepfake videos and images.
🎯 Mission: Combat the spread of misinformation by providing robust, accurate, and efficient deepfake detection tools accessible to researchers and developers worldwide.
🔥 Why This Project?
|
⚡ High Performance 87.04% accuracy with optimized EfficientNet architecture |
🎓 Research-Backed Built on latest 2024-2025 SOTA methods and best practices |
🚀 Production-Ready Easy integration with pre-trained models and comprehensive notebooks |
✨ Features
| Feature | Description |
|---|---|
| 🧠 EfficientNet Architecture | Utilizes EfficientNet for optimal accuracy-efficiency trade-off |
| 👤 MTCNN Face Detection | Advanced Multi-task Cascaded Convolutional Networks for precise face extraction |
| 📊 High Accuracy | Achieves up to 87.04% accuracy with low EER of 12.96% |
| 📹 Video Processing | Efficient frame extraction and batch processing pipeline |
| 🎯 Transfer Learning | Pre-trained models ready for fine-tuning on custom datasets |
| 💻 Production Scripts | Professional Python scripts for training, testing, and inference |
| ⚙️ Configurable Pipeline | Modular design for easy customization and experimentation |
| 🔬 Research-Grade | Implements cutting-edge techniques from 2024-2025 research |
📈 Performance
| Model | Accuracy | Equal Error Rate (EER) | Dataset | Download |
|---|---|---|---|---|
| Model-1 (Default) | 84.2% | 15.8% | Standard Training Set | 📥 Download |
| Model-2 (More Data) | 87.04% | 12.96% | Extended Training Set | 📥 Download |
📊 Performance Metrics Visualization
Model-1 Performance Model-2 Performance (Enhanced)
━━━━━━━━━━━━━━━━━━ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Accuracy: ████████░░ 84.2% Accuracy: █████████░ 87.04%
Precision: ████████░░ 82.5% Precision: █████████░ 85.8%
Recall: ████████░░ 83.1% Recall: █████████░ 86.2%
F1-Score: ████████░░ 82.8% F1-Score: █████████░ 86.0%
🛠️ Installation
Prerequisites
🐍 Python 3.8+
🔢 CUDA 11.0+ (for GPU acceleration)
💾 8GB+ RAM recommendedQuick Start
# Clone the repository
git clone https://github.com/umitkacar/DeepFake-EfficientNet.git
cd DeepFake-EfficientNet
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt📦 Required Packages
Click to expand package list
tensorflow>=2.8.0
torch>=1.12.0
torchvision>=0.13.0
opencv-python>=4.6.0
mtcnn>=0.1.1
numpy>=1.21.0
pandas>=1.3.0
matplotlib>=3.5.0
seaborn>=0.11.0
scikit-learn>=1.0.0
Pillow>=9.0.0
tqdm>=4.62.0
jupyter>=1.0.0
efficientnet-pytorch>=0.7.1🚀 Usage
1️⃣ Face Extraction from Videos/Images
Extract faces using MTCNN for dataset preparation:
# Extract faces from videos
python scripts/extract_faces.py \
--input-dir /path/to/videos \
--output-dir /path/to/extracted_faces \
--mode video \
--batch-size 60 \
--frame-skip 30
# Extract faces from images
python scripts/extract_faces.py \
--input-dir /path/to/images \
--output-dir /path/to/extracted_faces \
--mode image2️⃣ Training the Model
Train EfficientNet on your dataset:
python scripts/train.py \
--train-real /path/to/train/real \
--train-fake /path/to/train/fake \
--val-real /path/to/val/real \
--val-fake /path/to/val/fake \
--output-dir outputs \
--batch-size 32 \
--epochs 20 \
--lr 8e-4 \
--model efficientnet-b1Features:
- ✅ Automatic checkpointing every epoch
- ✅ Training history visualization
- ✅ Best model selection based on validation accuracy
- ✅ Comprehensive logging
- ✅ Resume training from checkpoint
3️⃣ Testing & Evaluation
Evaluate model on test set with comprehensive metrics:
python scripts/test.py \
--test-real /path/to/test/real \
--test-fake /path/to/test/fake \
--checkpoint outputs/checkpoints/best_model.pth \
--output-dir test_results \
--batch-size 100 \
--save-predictionsOutputs:
- 📊 Comprehensive metrics (EER, ACER, APCER, NPCER, accuracy, AUC-ROC)
- 📈 Confusion matrix visualization
- 📉 ROC curve (FAR vs FRR)
- 💾 Predictions CSV file
4️⃣ Inference on New Data
Run inference on single images or directories:
# Single image inference
python scripts/inference.py \
--input /path/to/image.jpg \
--checkpoint outputs/checkpoints/best_model.pth \
--model efficientnet-b1
# Batch inference on directory
python scripts/inference.py \
--input /path/to/images/directory \
--checkpoint outputs/checkpoints/best_model.pth \
--model efficientnet-b1 \
--output predictions.csv🎯 Quick Python API Example
import torch
from deepfake_detector.models import DeepFakeDetector
from deepfake_detector.data import get_val_transforms
import cv2
# Load model
model = DeepFakeDetector(model_name='efficientnet-b1')
model.load_checkpoint('outputs/checkpoints/best_model.pth')
model.eval()
# Prepare image
transform = get_val_transforms(image_size=240)
image = cv2.imread('face.jpg')
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image_tensor = transform(image=image_rgb)['image'].unsqueeze(0)
# Predict
with torch.no_grad():
output = model(image_tensor)
prediction = torch.softmax(output, dim=1)
is_fake = prediction[0][1].item() > 0.5
print(f"🎭 Prediction: {'FAKE' if is_fake else 'REAL'}")
print(f"📊 Confidence: {max(prediction[0]).item():.2%}")🧬 Project Structure
DeepFake-EfficientNet/
│
├── deepfake_detector/ # Main package
│ ├── __init__.py
│ ├── models/ # Model definitions
│ │ ├── __init__.py
│ │ └── efficientnet.py # EfficientNet-based detector
│ ├── data/ # Data loading and preprocessing
│ │ ├── __init__.py
│ │ ├── dataset.py # Dataset classes
│ │ ├── loader.py # DataLoader utilities
│ │ └── transforms.py # Augmentation pipelines
│ ├── utils/ # Utility functions
│ │ ├── __init__.py
│ │ ├── metrics.py # Evaluation metrics (EER, ACER, etc.)
│ │ ├── logger.py # Logging utilities
│ │ └── visualization.py # Plotting functions
│ └── config/ # Configuration management
│ ├── __init__.py
│ └── config.py # Config dataclass
│
├── scripts/ # Executable scripts
│ ├── extract_faces.py # MTCNN face extraction
│ ├── train.py # Training script
│ ├── test.py # Testing/evaluation script
│ └── inference.py # Inference script
│
├── checkpoints/ # Model checkpoints (created during training)
├── logs/ # Training logs (created during training)
├── results/ # Results and visualizations
│
├── requirements.txt # Python dependencies
├── README.md # This file
├── LICENSE # MIT License
└── .gitignore # Git ignore rules
📦 Package Organization
The codebase follows industry best practices:
- ✅ Modular Design: Separation of concerns with dedicated modules
- ✅ Type Hints: Full type annotation for better IDE support
- ✅ Docstrings: Comprehensive documentation for all functions/classes
- ✅ Logging: Structured logging throughout the pipeline
- ✅ Error Handling: Robust error handling and validation
- ✅ Configuration: YAML/JSON config file support
- ✅ Testing Ready: Easy to add unit tests
- ✅ Production Ready: Clean, maintainable, scalable code
🎓 SOTA Research (2024-2025)
Stay up-to-date with cutting-edge deepfake detection research and implementations:
📚 Trending Research Papers (2024-2025)
| Paper | Venue | Key Innovation | Link |
|---|---|---|---|
| Frequency-Aware Deepfake Detection | AAAI 2024 | Frequency space domain learning for better generalization | arXiv |
| LAA-Net | 2024 | Localized Artifact Attention Network for quality-agnostic detection | Paper |
| DeepfakeBench | ICML 2025 | Comprehensive benchmark with 36+ detection methods | GitHub |
| Deepfake-Eval-2024 | 2024 | In-the-wild benchmark with 45h video, 56.5h audio, 1,975 images | arXiv |
| MultiFF Dataset | 2024 | 80+ atomic generation algorithms for robust testing | Challenge |
🌐 Top GitHub Repositories & Resources
🏅 Comprehensive Benchmarks
|
🎯 Specialized Resources
|
🔮 Current Research Challenges (2024-2025)
⚠️ Generalization Gap
└─ Academic benchmarks vs real-world deepfakes
⚠️ Adversarial Robustness
└─ Detection methods vs advancing generation techniques
⚠️ Multimodal Detection
└─ Unified detection across video, audio, and images
⚠️ Real-time Processing
└─ Balancing accuracy with computational efficiency
⚠️ Cross-dataset Performance
└─ Models trained on controlled datasets struggle with wild data
🎯 State-of-the-Art Techniques
| Technique | Description | Advantage |
|---|---|---|
| 🌊 Frequency Domain Analysis | Analyze frequency patterns to detect manipulation artifacts | Better generalization across different deepfake methods |
| 🎨 Artifact-based Detection | Focus on local inconsistencies and generation artifacts | High precision on modern GANs and diffusion models |
| 🧩 Multimodal Fusion | Combine video, audio, and metadata signals | Robust against single-modality attacks |
| 🔄 Contrastive Learning | Self-supervised learning for better feature representation | Improved zero-shot detection capabilities |
| 🌐 Transformer Architectures | Vision transformers for spatial-temporal analysis | State-of-the-art performance on recent benchmarks |
📊 2024-2025 Performance Trends
Key Findings from Latest Research:
- ✅ LAA-Net achieves quality-agnostic detection across multiple datasets
- ✅ XCeption maintains balanced performance with low false positive rates
⚠️ Real-world challenge: SOTA models show 45-50% AUC drop on in-the-wild data- 🔄 Diffusion models spark renewed research in detection methods
- 🎯 Audio deepfake detection remains challenging with ITW datasets
💡 Key Innovations in This Project
graph LR
A[Input Video] --> B[MTCNN Face Detection]
B --> C[Frame Extraction]
C --> D[Data Augmentation]
D --> E[EfficientNet Encoder]
E --> F[Classification Head]
F --> G{Real or Fake?}
G -->|Real| H[✅ Authentic]
G -->|Fake| I[❌ Deepfake]
🎨 Technical Highlights
- 🔍 MTCNN Integration: Robust face detection even in challenging conditions
- ⚡ EfficientNet Backbone: Optimal balance between accuracy and computational efficiency
- 🎲 Advanced Augmentation: Comprehensive data augmentation for better generalization
- 📊 Comprehensive Metrics: EER, accuracy, precision, recall, and F1-score tracking
- 🔄 Transfer Learning: Leverage pre-trained ImageNet weights for faster convergence
🤝 Contributing
We welcome contributions from the community! Here's how you can help:
- 🍴 Fork the repository
- 🔨 Create a new branch (
git checkout -b feature/AmazingFeature) - 💾 Commit your changes (
git commit -m 'Add some AmazingFeature') - 📤 Push to the branch (
git push origin feature/AmazingFeature) - 🎉 Open a Pull Request
💪 Areas for Contribution
- 🎯 Implement new SOTA detection methods (Frequency-Aware, LAA-Net, etc.)
- 📊 Add more comprehensive evaluation metrics
- 🎨 Improve data augmentation strategies
- 📚 Expand documentation and tutorials
- 🐛 Report bugs and suggest features
- 🌐 Add support for more deepfake datasets
- ⚡ Optimize inference speed
📖 Citation
If you use this project in your research, please cite:
@software{deepfake_efficientnet_2024,
author = {Umit Kacar},
title = {DeepFake-EfficientNet: AI-Powered DeepFake Detection},
year = {2024},
publisher = {GitHub},
url = {https://github.com/umitkacar/DeepFake-EfficientNet}
}📜 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
Special thanks to the amazing open-source community and researchers:
| 🔬 | 💻 | 📚 | 🎓 |
|---|---|---|---|
| MTCNN | EfficientNet | DeepfakeBench | Papers with Code |
📞 Contact & Support
🌟 If you find this project useful, please consider giving it a star! 🌟
Made with ❤️ by the AI Research Community
Fighting misinformation one detection at a time 🛡️



