
chipkarsaish/5th_Sem_Mini_Project

Obsidian Watch: Real-Time Fraud Detection Dashboard

Obsidian Watch is a full-stack application that provides a real-time dashboard for monitoring financial transactions and detecting fraud using a machine learning model.

This system simulates a stream of financial data, scores each transaction for fraud risk using an XGBoost model, and visualizes the results on a live dashboard.

🚀 Key Features

  • Real-Time Transaction Monitoring: A live-updating table displays new transactions as they happen.
  • ML-Powered Fraud Scoring: An XGBoost model, trained on historical data, assigns a risk score to every incoming transaction.
  • Live KPI Dashboard: Key metrics like Total Transactions, Threats Detected, and Secure Transactions are updated in real-time.
  • Data-Driven Visualizations: Live charts show the distribution of risk and a weekly analysis of safe vs. suspicious transactions.
  • Streaming Data Architecture: Uses Kafka as a message broker to handle the high-throughput data stream, with a Flask backend and React frontend.

⚙️ System Architecture

The project is divided into three main components that work together:

  1. Frontend (React): A Vite + React application that serves as the user's dashboard. It connects to the backend via Socket.IO to receive live data and uses Recharts to render the charts.
  2. Backend (Flask): A Flask-SocketIO server written in Python. It acts as a consumer for the Kafka data stream. When it receives a new transaction from Kafka, it immediately broadcasts that data to all connected frontend clients via WebSockets.
  3. Data Pipeline (Kafka & Producer):
    • Docker Compose is used to launch Zookeeper and Kafka services.
    • A separate Python script (e.g., producer.py, which is not part of the backend server) is responsible for simulating, scoring, and producing data. It loads the trained ML model (xgb_model.pkl), generates a transaction, calculates its fraud score, and sends it to the transactions Kafka topic.

Data Flow:
Data Producer (Python) → ML Model (xgb_model.pkl) → Kafka Topic ('transactions') → Backend (app.py) → Socket.IO → Frontend (React)
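Every stage of this pipeline passes along the same JSON payload. A minimal sketch of what such a message might look like (the field names here are illustrative assumptions, not the repo's actual schema):

```python
import json

# Hypothetical transaction payload; the actual field names may differ in the repo.
transaction = {
    "transaction_id": "txn-0001",
    "amount": 1250.00,
    "merchant_category": "electronics",
    "day_of_week": 4,
    "account_age": 312,     # days
    "fraud_score": 0.87,    # assigned by the producer before publishing
}

# The producer serializes the dict to bytes before sending it to Kafka ...
encoded = json.dumps(transaction).encode("utf-8")

# ... and the backend deserializes it on consumption, then re-emits it
# unchanged to the dashboard clients over Socket.IO.
decoded = json.loads(encoded.decode("utf-8"))
```

Because the backend only relays messages, the fraud score must already be attached by the time a transaction enters Kafka.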

🤖 Machine Learning Model

The fraud detection model is an XGBoost Classifier.

  • Training: The model was trained in the backend/XGBoost_Training.ipynb notebook.
  • Dataset: It was trained on the sample_cleaned2.csv dataset, which includes features such as amount, day_of_week, merchant_category, and account_age.
  • Performance: The current model's performance (Accuracy: ~49%, ROC-AUC: ~0.50) indicates it is not performing better than random chance. This suggests a need for better data, feature engineering, or model tuning.
  • Artifact: The trained model is saved as backend/xgb_model.pkl and is used by the data producer for inference.
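To see why a ROC-AUC of ~0.50 means "no better than chance": AUC is the probability that a randomly chosen fraudulent transaction receives a higher risk score than a randomly chosen legitimate one. A quick stdlib-only illustration (synthetic scores, not the repo's data):

```python
def roc_auc(labels, scores):
    """Probability that a positive example outscores a negative one
    (ties count half) -- equivalent to the area under the ROC curve."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 1]
print(roc_auc(labels, [0.1, 0.4, 0.35, 0.8]))  # informative scores -> 0.75
print(roc_auc(labels, [0.5, 0.5, 0.5, 0.5]))   # constant scores    -> 0.5 (chance)
```

A model scoring ~0.50 is effectively guessing on every transaction, which is why the README flags the need for better data, feature engineering, or tuning.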

🛠️ How to Run

To run the full application, you will need Node.js/npm, Python 3, and Docker installed.

1. Start the Infrastructure (Kafka)

First, start the Zookeeper and Kafka brokers using Docker.

# Navigate to the backend directory
cd backend/

# Start the services in detached mode
docker-compose up -d

2. Run the Backend Server

In a new terminal, start the Flask-SocketIO server. This server will connect to Kafka and wait for messages.

# Make sure you are in the backend/ directory
cd backend/

# (Recommended) Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

# Install Python dependencies
pip install flask flask-socketio kafka-python eventlet

# Run the server
python app.py

You should see a message indicating the server is running and the Kafka consumer has started.
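For reference, the Kafka-to-WebSocket bridge that app.py implements can be sketched as follows. This is a minimal, hedged sketch, not the repo's actual code: the topic name 'transactions' and broker address come from the sections above, while the event name new_transaction and port 5000 are assumptions.

```python
import json

def decode_transaction(raw: bytes) -> dict:
    """Deserialize one Kafka message payload into a transaction dict."""
    return json.loads(raw.decode("utf-8"))

def main():
    # Third-party deps (flask, flask-socketio, kafka-python) assumed installed.
    from flask import Flask
    from flask_socketio import SocketIO
    from kafka import KafkaConsumer

    app = Flask(__name__)
    socketio = SocketIO(app, cors_allowed_origins="*")

    def consume_forever():
        consumer = KafkaConsumer(
            "transactions",
            bootstrap_servers="localhost:9092",
            value_deserializer=decode_transaction,
        )
        for message in consumer:
            # Relay each scored transaction to every connected dashboard client.
            socketio.emit("new_transaction", message.value)

    socketio.start_background_task(consume_forever)
    socketio.run(app, host="0.0.0.0", port=5000)

if __name__ == "__main__":
    main()
```

The backend does no scoring of its own; it only forwards what the producer already scored.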

3. Run the Frontend Dashboard

In a new terminal, go to the project's root directory to install and run the React app.

# Go back to the project's root directory
cd ..

# Install frontend dependencies
npm i

# Start the development server
npm run dev

Your browser should open to http://localhost:3000 (or a similar port). You will see the dashboard, but it will be empty.

4. Start the Data Producer

To see the dashboard in action, you must start the data producer script (e.g., producer.py) to simulate and send transactions to Kafka.

Note: make sure the producer script's dependencies (kafka-python, pandas, and joblib) are installed in its environment.
# In a new terminal, from the backend/ directory
cd backend/

# (Activate your virtual environment if not already)
# source venv/bin/activate

# Run the producer script
python producer.py

Once the producer is running, you will see transactions, charts, and KPIs populate on the dashboard in real-time.
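If you need to write producer.py yourself, the following sketch shows the simulate-score-publish loop described above. It is a hedged illustration: the feature names come from the dataset section, but the exact feature encoding, topic name, and send rate are assumptions, and the categorical encoding must mirror whatever the training notebook actually did.

```python
import json
import random
import time

# Hypothetical category list; the training data may use different values.
MERCHANT_CATEGORIES = ["grocery", "electronics", "travel", "fuel", "dining"]

def simulate_transaction() -> dict:
    """Generate one synthetic transaction with the features the model expects."""
    return {
        "amount": round(random.uniform(1.0, 5000.0), 2),
        "day_of_week": random.randint(0, 6),
        "merchant_category": random.choice(MERCHANT_CATEGORIES),
        "account_age": random.randint(1, 3650),  # days
    }

def main():
    # Third-party deps (joblib, pandas, kafka-python) assumed installed.
    import joblib
    import pandas as pd
    from kafka import KafkaProducer

    model = joblib.load("xgb_model.pkl")
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda d: json.dumps(d).encode("utf-8"),
    )
    while True:
        txn = simulate_transaction()
        # Feature encoding must mirror the training notebook; simplified here.
        features = pd.DataFrame([txn])
        txn["fraud_score"] = float(model.predict_proba(features)[0][1])
        producer.send("transactions", value=txn)
        time.sleep(1)  # one simulated transaction per second

if __name__ == "__main__":
    main()
```

Note that scoring happens here, in the producer, so each message arrives in Kafka with its fraud_score already attached.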
