shruti-sivakumar/Credit-Card-Anomaly-GMM

Unsupervised anomaly detection for credit card fraud using Gaussian Mixture Models (GMMs). Models legitimate behavior via probability density estimation and flags low-likelihood transactions as potential frauds.

Credit Card Fraud Detection via Anomaly Detection (GMM)

This project implements unsupervised anomaly detection on credit card transactions using Gaussian Mixture Models (GMMs). It focuses on identifying fraudulent transactions in a highly imbalanced dataset by modeling the likelihood of legitimate behavior and flagging low-probability anomalies.


Problem Statement

Fraudulent transactions are rare but costly. Only about 0.17% of the transactions in the dataset are fraudulent, so a supervised classifier sees very few positive examples and can reach high accuracy while still missing most fraud. This project instead uses probability density estimation to model legitimate behavior and flags outliers as potential fraud, without needing labeled fraud examples for training.
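
One way to see why the imbalance matters is to work out the accuracy of a trivial model that labels every transaction as legitimate. The sketch below uses only the counts listed in the Dataset section; the variable names are illustrative.

```python
# Counts taken from the Dataset section of this README.
n_total = 284_807
n_fraud = 492

fraud_rate = n_fraud / n_total

# A trivial model that predicts "legit" for every row is correct on all
# legitimate rows and wrong on every fraud row.
baseline_accuracy = (n_total - n_fraud) / n_total
baseline_fraud_recall = 0 / n_fraud  # it never flags a single fraud case

print(f"fraud rate:            {fraud_rate:.4%}")
print(f"baseline accuracy:     {baseline_accuracy:.4%}")
print(f"baseline fraud recall: {baseline_fraud_recall:.0%}")
```

Accuracy above 99.8% with zero fraud recall is why the evaluation below reports precision, recall, and AUCPR for the fraud class rather than accuracy.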


Dataset

  • Source: Kaggle – Credit Card Fraud Detection (download the CSV manually and place it at data/creditcard.csv)
  • Size: 284,807 transactions
  • Fraud Cases: 492 (~0.172%)
  • Features: PCA-transformed V1–V28 + Time, Amount, and Class (0 = legit, 1 = fraud)
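
A minimal loading sketch, assuming the CSV sits at data/creditcard.csv and uses the column names listed above. The helper names (`split_features_labels`, `class_summary`, `load_transactions`) are hypothetical and are not taken from the notebook.

```python
import pandas as pd


def split_features_labels(df, label_col="Class"):
    """Separate the feature columns from the 0/1 label column."""
    X = df.drop(columns=[label_col])
    y = df[label_col].astype(int)
    return X, y


def class_summary(y):
    """Count rows and fraud cases so the imbalance can be checked."""
    n_total = int(len(y))
    n_fraud = int((y == 1).sum())
    return {"total": n_total, "fraud": n_fraud, "fraud_rate": n_fraud / n_total}


def load_transactions(path="data/creditcard.csv"):
    """Read the Kaggle CSV (path is an assumption) and split it."""
    return split_features_labels(pd.read_csv(path))
```

Calling `class_summary(y)` on the labels returned by `load_transactions()` is a quick sanity check that the downloaded file matches the size and fraud count stated above.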

⚙️ Methodology

Feature Selection

Selected the features most correlated with the fraud label (Class): V14, V17, V11, V4, V15, V13
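
The README does not say how correlation was measured. The sketch below assumes absolute Pearson correlation with the Class column and runs on synthetic stand-in data, not the Kaggle file; `rank_features_by_correlation` and the column names `V_signal` and `V_noise` are hypothetical.

```python
import numpy as np
import pandas as pd


def rank_features_by_correlation(df, target="Class", top_k=6):
    """Rank columns by |Pearson correlation| with the target column."""
    corr = df.corr()[target].drop(target)
    return corr.abs().sort_values(ascending=False).head(top_k)


# Synthetic stand-in data (NOT the Kaggle dataset): one informative
# column and one pure-noise column.
rng = np.random.default_rng(0)
y = (rng.random(5000) < 0.05).astype(int)
demo = pd.DataFrame({
    "V_signal": -4.0 * y + rng.normal(size=5000),
    "V_noise": rng.normal(size=5000),
    "Class": y,
})

ranking = rank_features_by_correlation(demo, top_k=2)
print(ranking)
```

On the real data, the same function applied to the full DataFrame would produce a ranking to compare against the feature list above.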

Distribution Analysis

Used seaborn and matplotlib to visualize class-wise distributions and assess Gaussian fit.
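
A rough sketch of such a class-wise comparison follows. It uses matplotlib only (seaborn's histogram or KDE plots could be swapped in) and synthetic values standing in for a single feature, so the shapes are illustrative rather than the notebook's actual plots.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for one feature, split by class (NOT real data).
rng = np.random.default_rng(0)
legit = rng.normal(0.0, 1.0, size=5000)
fraud = rng.normal(-6.0, 3.0, size=50)

fig, ax = plt.subplots(figsize=(6, 3))
ax.hist(legit, bins=60, density=True, alpha=0.6, label="legit (0)")
ax.hist(fraud, bins=30, density=True, alpha=0.6, label="fraud (1)")

# Overlay a normal density fitted to the legit values to judge Gaussian fit.
mu, sigma = legit.mean(), legit.std()
xs = np.linspace(legit.min(), legit.max(), 200)
pdf = np.exp(-0.5 * ((xs - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
ax.plot(xs, pdf, label="normal fit to legit")

ax.set_xlabel("feature value (synthetic stand-in)")
ax.set_ylabel("density")
ax.legend()
```

If the fitted curve tracks the legit histogram closely, a Gaussian (or a small mixture of Gaussians) is a plausible model for that feature.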

GMM Modeling

  • Trained a Gaussian Mixture Model on legitimate transactions, using features V14 and V17
  • Used Expectation-Maximization (EM) algorithm to estimate parameters
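
The two steps above can be sketched with scikit-learn, whose `GaussianMixture` estimates its parameters with EM. The number of components, covariance type, and random seed are assumptions (the README does not state them), and the two-column array is synthetic data standing in for V14 and V17 of legitimate transactions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the (V14, V17) columns of legitimate rows.
rng = np.random.default_rng(42)
X_legit = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=1.0, size=(1500, 2)),
    rng.normal(loc=[3.0, 3.0], scale=0.5, size=(500, 2)),
])

# n_components=2 and covariance_type="full" are illustrative choices.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=42)
gmm.fit(X_legit)  # runs EM until convergence or max_iter

print("converged:", gmm.converged_, "after", gmm.n_iter_, "iterations")
print("weights:", gmm.weights_)
print("means:\n", gmm.means_)
```

In practice the component count could be chosen by comparing `gmm.bic(X_legit)` across several candidate values.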

Likelihood Scoring & Thresholding

  • Computed log-likelihood scores for each transaction
  • Set a threshold T and classified transactions with log-likelihood below T as fraud
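
A minimal sketch of scoring and thresholding, again on synthetic data. How T was chosen in the notebook is not stated here; taking the 1st percentile of the legitimate training scores is a hypothetical rule used only for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic data: legit points near the origin, fraud points far away.
rng = np.random.default_rng(0)
X_legit = rng.normal(0.0, 1.0, size=(2000, 2))
X_fraud = rng.normal(-6.0, 1.0, size=(20, 2))
X_all = np.vstack([X_legit, X_fraud])

# Fit on legitimate rows only, then score every transaction.
gmm = GaussianMixture(n_components=1, random_state=0).fit(X_legit)
log_lik = gmm.score_samples(X_all)  # per-sample log-likelihood

# Hypothetical rule for T: 1st percentile of the legit training scores.
T = np.percentile(gmm.score_samples(X_legit), 1)

# Transactions scoring below T are flagged as fraud (1), others as legit (0).
y_pred = (log_lik < T).astype(int)
print("threshold T:", T)
print("flagged:", int(y_pred.sum()), "of", len(y_pred))
```

Lowering T flags fewer transactions (higher precision, lower recall); raising it does the opposite.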

Evaluation

  • Precision, Recall, F1-Score for class 1 (fraud)
  • Plotted Precision-Recall Curve
  • Computed the area under the Precision-Recall curve: AUCPR = 0.679
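
These metrics can be computed with scikit-learn as sketched below. The scores are synthetic, the threshold is hypothetical, and AUCPR is taken here as the trapezoidal area under the curve (`average_precision_score` is a common alternative); the README does not say which the notebook used.

```python
import numpy as np
from sklearn.metrics import (
    auc,
    precision_recall_curve,
    precision_recall_fscore_support,
)

# Synthetic log-likelihoods: fraud rows (label 1) score lower on average.
rng = np.random.default_rng(1)
y_true = np.concatenate([np.zeros(1000, dtype=int), np.ones(20, dtype=int)])
log_lik = np.concatenate([
    rng.normal(-3.0, 1.0, size=1000),
    rng.normal(-9.0, 1.5, size=20),
])

# precision_recall_curve expects higher scores for the positive class,
# so the log-likelihood is negated to get an anomaly score.
anomaly_score = -log_lik
precision, recall, _ = precision_recall_curve(y_true, anomaly_score)
aucpr = auc(recall, precision)

# Point metrics for the fraud class at one hypothetical threshold T.
T = -6.0
y_pred = (log_lik < T).astype(int)
p, r, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary", pos_label=1
)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f} AUCPR={aucpr:.3f}")
```

The `precision` and `recall` arrays can be passed directly to a plotting call to draw the Precision-Recall curve.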

Results Summary

| Metric    | Legit Class (0) | Fraud Class (1) |
|-----------|-----------------|-----------------|
| Precision | 1.00            | 0.95            |
| Recall    | 1.00            | 0.72            |
| F1-Score  | 1.00            | 0.82            |
| AUCPR     | -               | 0.679           |
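
As a quick consistency check, the fraud-class F1-Score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The variable names are illustrative.

```python
# Fraud-class values as reported in the results summary.
precision_fraud = 0.95
recall_fraud = 0.72

# F1 is the harmonic mean of precision and recall.
f1_fraud = 2 * precision_fraud * recall_fraud / (precision_fraud + recall_fraud)
print(round(f1_fraud, 2))  # 0.82
```

The recomputed value agrees with the reported F1-Score of 0.82.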

Repository Structure

Credit-Card-Anomaly-GMM/
├── data/
│   └── creditcard.csv              # Dataset (or link in README)
├── anomaly_detection_gmm.ipynb    # Full GMM pipeline notebook
├── README.md
└── LICENSE

Output plots are included in the Jupyter notebook (anomaly_detection_gmm.ipynb).


Author

Built by Shruti Sivakumar as a focused showcase of probabilistic anomaly detection applied to real-world financial fraud.


License

MIT License – see LICENSE for details.
