nateattoh/scratch-or-finetune-on-small-datasets
A comparative study of the benefits of transfer learning over building a custom CNN architecture for a very small dataset.
Scratch or Finetune on Small Datasets
Welcome to the Scratch or Finetune on Small Datasets repository! This project compares transfer learning with building a custom CNN architecture from scratch when only a very small dataset is available. We use the Oxford Flower Dataset to measure how well each approach performs on an image classification task.
Table of Contents
- Introduction
- Project Overview
- Dataset
- Methodology
- Results
- Installation
- Usage
- Contributing
- License
- Contact
Introduction
Choosing a training strategy matters most in computer vision when the dataset is small. This project explores two main strategies: building a custom Convolutional Neural Network (CNN) from scratch and applying transfer learning with pre-trained models. We aim to find out which method classifies images of flowers more accurately.
Project Overview
The repository contains code, models, and results from our experiments. The implementations use PyTorch, which keeps the code accessible and easy to modify. The focus is on the Oxford Flower Dataset, which consists of 102 flower categories, each with 40 to 258 images.
Topics Covered
- Classification
- CNN
- CNN Classification
- Computer Vision
- Transfer Learning
- Custom Models
- Oxford Flower Dataset
Dataset
The Oxford Flower Dataset is a popular benchmark for image classification tasks. It includes:
- 102 flower categories
- 8,189 images in total
- Varied image sizes and backgrounds
This dataset allows us to evaluate the performance of different models effectively. For more details, visit the official Oxford Flower Dataset page.
Methodology
Data Preprocessing
We perform several preprocessing steps to prepare the data for training:
- Resizing: Images are resized to a consistent size (e.g., 224x224 pixels).
- Normalization: Pixel values are normalized to a range suitable for model training.
- Augmentation: We apply techniques like rotation, flipping, and color jitter to increase dataset diversity.
Model Architectures
We compare two main approaches:
- Custom CNN: A model built from scratch, specifically designed for the dataset.
- Transfer Learning: Utilizing pre-trained models such as ResNet, VGG, and MobileNet, fine-tuned on our dataset.
Training and Evaluation
We split the dataset into training, validation, and test sets. We then train both models and evaluate their performance based on accuracy and loss metrics.
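A minimal sketch of the split and the evaluation step might look like the following. The 70/15/15 proportions, the fixed seed, and the helper names `split_dataset` and `evaluate` are assumptions made for illustration; the repository's own scripts may differ.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split


def split_dataset(dataset, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Randomly split a dataset into train, validation, and test subsets."""
    n_val = int(len(dataset) * val_fraction)
    n_test = int(len(dataset) * test_fraction)
    n_train = len(dataset) - n_val - n_test
    # A seeded generator makes the split reproducible across runs.
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [n_train, n_val, n_test], generator=generator)


@torch.no_grad()
def evaluate(model, loader, device="cpu"):
    """Return (mean cross-entropy loss, accuracy) over all samples in loader."""
    model.eval()
    criterion = nn.CrossEntropyLoss(reduction="sum")
    total_loss, correct, seen = 0.0, 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        logits = model(images)
        total_loss += criterion(logits, labels).item()
        correct += (logits.argmax(dim=1) == labels).sum().item()
        seen += labels.size(0)
    return total_loss / seen, correct / seen


if __name__ == "__main__":
    # Random tensors stand in for real images so the sketch runs anywhere.
    data = TensorDataset(torch.randn(200, 8), torch.randint(0, 102, (200,)))
    train_set, val_set, test_set = split_dataset(data)
    model = nn.Linear(8, 102)  # placeholder for either model
    loss, accuracy = evaluate(model, DataLoader(test_set, batch_size=16))
    print(f"loss={loss:.3f} accuracy={accuracy:.3f}")
```

Summing the loss per batch and dividing by the sample count at the end gives a correct mean even when the last batch is smaller than the others.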
Results
The results of our experiments reveal insights into the effectiveness of each approach.
Performance Metrics
- Custom CNN:
  - Accuracy: 75%
  - Loss: 0.45
- Transfer Learning (ResNet):
  - Accuracy: 85%
  - Loss: 0.30
The transfer learning approach outperformed the custom CNN by 10 percentage points in accuracy (85% vs. 75%) and reached a lower loss (0.30 vs. 0.45), demonstrating its advantage when training data is limited.
Installation
To get started, clone the repository:
git clone https://github.com/nateattoh/scratch-or-finetune-on-small-datasets.git
cd scratch-or-finetune-on-small-datasets
Dependencies
Ensure you have the following installed:
- Python 3.6 or higher
- PyTorch
- torchvision
- matplotlib
- numpy
You can install the required packages using pip:
pip install -r requirements.txt
Usage
After installation, you can run the training scripts.
Training the Custom CNN
To train the custom CNN model, use:
python train_custom_cnn.py
Training with Transfer Learning
To train using transfer learning, execute:
python train_transfer_learning.py
Evaluation
To evaluate the models, run:
python evaluate.py
The evaluation script will provide metrics such as accuracy and loss for both models.
Contributing
We welcome contributions! If you have suggestions or improvements, feel free to open an issue or submit a pull request.
Steps to Contribute
- Fork the repository.
- Create a new branch for your feature or fix.
- Make your changes and commit them.
- Push your branch and create a pull request.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Contact
For questions or feedback, please reach out to:
- Nate Attoh: nateattoh@example.com
Explore the project further and check out the latest updates in the Releases section.
Thank you for visiting the Scratch or Finetune on Small Datasets repository! We hope you find this project insightful and useful for your own research and projects.