mohammadsiam2002/Fraud_detection
Detect financial fraud with ML-powered predictions via an interactive Streamlit app.
Fraud Detection System
**Overview**
This project implements a machine learning-based system for detecting fraudulent financial transactions.
The system analyzes transaction details such as transaction type, amount, and account balances to automatically classify transactions as fraudulent or legitimate.
The project covers the complete machine learning pipeline, including:
Data analysis and preprocessing
Feature engineering
Model training and evaluation
Model persistence (saving/loading)
Deployment as a web application using Streamlit
**Objectives**
Automatically detect fraudulent transactions in financial systems.
Reduce financial losses caused by fraud.
Apply machine learning techniques to real-world transaction data.
Provide a simple and interactive web interface for predictions.
**Dataset**
Due to GitHub file size limits, the dataset is not stored in this repository.
You can download it from:
Dataset Link:
https://drive.google.com/file/d/1RcJRtQYILDjapaUG2okVZvq7i7x47EZA/view?usp=sharing
**Project Workflow**
Dataset (CSV)
↓
Data Cleaning & Feature Engineering
↓
Preprocessing (Scaling & Encoding)
↓
Model Training (Logistic Regression)
↓
Model Evaluation
↓
Model Saved to Disk (.pkl)
↓
Streamlit Web App Loads Model
↓
User Inputs Transaction → Prediction (Fraud / Not Fraud)
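The "Model Saved to Disk (.pkl)" and "Streamlit Web App Loads Model" steps could look roughly like the sketch below. It assumes the standard-library `pickle` module; the README does not say whether `pickle` or `joblib` is used, and `save_model` / `load_model` are hypothetical helper names. A plain dictionary stands in for the fitted pipeline so the example runs on its own.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path) -> None:
    """Serialize a fitted model (or any picklable object) to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a previously saved model. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Round trip in a temporary directory, using a stand-in object
# instead of the real fitted pipeline (assumption for illustration).
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "fraud_detection_pipeline.pkl"
    save_model({"stand_in": "fitted pipeline"}, model_path)
    restored = load_model(model_path)
```

In the training script the object passed to `save_model` would be the fitted pipeline, and the web app would call `load_model` once at startup.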
**Dataset Description**
The dataset contains financial transaction records, where each row represents one transaction.
Main Features:
type -> Transaction type (TRANSFER, CASH_OUT, PAYMENT, etc.)
amount -> Transaction amount
oldbalanceOrg -> Sender balance before transaction
newbalanceOrig -> Sender balance after transaction
oldbalanceDest -> Receiver balance before transaction
newbalanceDest -> Receiver balance after transaction
isFraud -> Target label (1 = Fraud, 0 = Normal)
Engineered Features:
balanceDiffOrig = oldbalanceOrg - newbalanceOrig
balanceDiffDest = newbalanceDest - oldbalanceDest
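The two engineered features above can be computed with pandas as in this minimal sketch. The function name `add_balance_features` is hypothetical, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import pandas as pd


def add_balance_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the two engineered balance columns added."""
    out = df.copy()
    # Amount that left the sender's account
    out["balanceDiffOrig"] = out["oldbalanceOrg"] - out["newbalanceOrig"]
    # Amount that arrived in the receiver's account
    out["balanceDiffDest"] = out["newbalanceDest"] - out["oldbalanceDest"]
    return out


# Two made-up transactions (illustrative values only)
sample = pd.DataFrame({
    "type": ["TRANSFER", "PAYMENT"],
    "amount": [1000.0, 50.0],
    "oldbalanceOrg": [1000.0, 200.0],
    "newbalanceOrig": [0.0, 150.0],
    "oldbalanceDest": [0.0, 0.0],
    "newbalanceDest": [1000.0, 0.0],
})
features = add_balance_features(sample)
print(features[["balanceDiffOrig", "balanceDiffDest"]])
```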
**Machine Learning Model**
Algorithm: Logistic Regression
Type: Supervised binary classification
Preprocessing:
Numerical features scaled using StandardScaler.
Categorical features encoded using OneHotEncoder.
Full pipeline built using ColumnTransformer and Pipeline.
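A pipeline of this shape might be assembled as sketched below. The column lists, the step names, `handle_unknown="ignore"`, and `max_iter=1000` are assumptions made for the example; the actual settings live in `analysis_model.py` and may differ.

```python
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Assumed feature split, based on the columns listed in this README
NUMERIC = [
    "amount",
    "oldbalanceOrg", "newbalanceOrig",
    "oldbalanceDest", "newbalanceDest",
    "balanceDiffOrig", "balanceDiffDest",
]
CATEGORICAL = ["type"]


def build_pipeline() -> Pipeline:
    """Scale numeric columns, one-hot encode 'type', then fit a logistic regression."""
    preprocessor = ColumnTransformer(transformers=[
        ("num", StandardScaler(), NUMERIC),
        # Ignore transaction types unseen during training instead of raising
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])
    return Pipeline(steps=[
        ("prep", preprocessor),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


pipeline = build_pipeline()
print(list(pipeline.named_steps))
```

Keeping preprocessing and the classifier in one `Pipeline` means a single object can be fitted, saved, and later called with raw transaction rows.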
**Model Evaluation**
The model is evaluated using:
Accuracy
Precision
Recall
F1-score
Confusion Matrix
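The metrics listed above are all available in `sklearn.metrics`. The sketch below computes them for a small set of made-up labels; `evaluate` is a hypothetical helper, and the numbers are not results from this project.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)


def evaluate(y_true, y_pred) -> dict:
    """Compute the README's metrics, treating label 1 (fraud) as the positive class."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
        "f1": f1_score(y_true, y_pred, zero_division=0),
        # Rows are true labels, columns are predicted labels, in order [0, 1]
        "confusion_matrix": confusion_matrix(y_true, y_pred, labels=[0, 1]),
    }


# Made-up labels for illustration only
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
report = evaluate(y_true, y_pred)
for name, value in report.items():
    print(name, value, sep=": ")
```

Because fraud labels are typically rare, precision and recall on the fraud class usually say more about model quality than accuracy alone.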
**Web Application**
The trained model is deployed using Streamlit.
The web app allows the user to:
1. Enter transaction details
2. Click Predict
3. Instantly see whether the transaction is:
✅ Legitimate
❌ Fraudulent
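The enter, predict, and display flow might be wired up in Streamlit roughly as follows. This is a sketch, not the contents of `fraud_detection.py`: the helper names, widget labels, and the choice to recompute the engineered columns in the app are all assumptions, and only the transaction types named in this README are offered.

```python
import pandas as pd


def build_input_row(tx_type, amount, old_orig, new_orig, old_dest, new_dest) -> pd.DataFrame:
    """Pack the form fields into a one-row DataFrame with the dataset's column names."""
    return pd.DataFrame([{
        "type": tx_type,
        "amount": amount,
        "oldbalanceOrg": old_orig,
        "newbalanceOrig": new_orig,
        "oldbalanceDest": old_dest,
        "newbalanceDest": new_dest,
        # Assumption: the saved pipeline expects the engineered columns too
        "balanceDiffOrig": old_orig - new_orig,
        "balanceDiffDest": new_dest - old_dest,
    }])


def label_prediction(pred: int) -> str:
    """Map the model output (1 = fraud, 0 = normal) to a display label."""
    return "Fraudulent" if pred == 1 else "Legitimate"


def render_app(model) -> None:
    """Draw the input form and show the prediction for the entered transaction."""
    import streamlit as st  # imported here so the helpers above work without Streamlit

    st.title("Fraud Detection System")
    tx_type = st.selectbox("Transaction type", ["TRANSFER", "CASH_OUT", "PAYMENT"])
    amount = st.number_input("Amount", min_value=0.0)
    old_orig = st.number_input("Sender balance before", min_value=0.0)
    new_orig = st.number_input("Sender balance after", min_value=0.0)
    old_dest = st.number_input("Receiver balance before", min_value=0.0)
    new_dest = st.number_input("Receiver balance after", min_value=0.0)

    if st.button("Predict"):
        row = build_input_row(tx_type, amount, old_orig, new_orig, old_dest, new_dest)
        pred = int(model.predict(row)[0])
        if pred == 1:
            st.error(label_prediction(pred))
        else:
            st.success(label_prediction(pred))
```

In the app script, the saved pipeline would be loaded once and passed to `render_app`, and the script started with `streamlit run fraud_detection.py`.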
**Technologies Used**
Python
Pandas, NumPy
Scikit-learn
Matplotlib, Seaborn
Streamlit
**Project Structure**
fraud-detection/
│
├── fraud_detection.py            # Streamlit web application
├── analysis_model.py             # Model training and saving script
├── fraud_detection_pipeline.pkl  # Saved model pipeline
├── AIML Dataset (2).csv          # Dataset (download separately, see Dataset)
└── README.md