
Subrat1920/Hotel-Reservation-Cancellation-Detection-MLOps

Hotel Reservation Prediction

Project Goal:
Predict which customers are likely to cancel their reservation or check out early, using historical hotel booking data. This lets hotels act proactively to reduce revenue loss.



Project Overview

The Hotel Reservation Prediction project is an end-to-end machine learning solution for predicting early checkouts or cancellations. The pipeline automates data ingestion, preprocessing, model training, and deployment using modern CI/CD practices. A Flask-based web app, deployed as a Docker container, serves interactive prediction requests.


Features

  • Predict customer checkout behavior before check-in.
  • ML model trained with LightGBM.
  • Automated preprocessing and feature engineering.
  • Experiment tracking with MLflow.
  • Fully automated CI/CD pipeline using Jenkins, Docker, and Google Cloud Platform (GCP).
  • Integration with GCP Buckets and Container Registry.
  • Role-based access with IAM roles and service accounts.
  • Flask web app for serving predictions in real-time.

Tech Stack

  • Programming: Python
  • Data Processing & ML: Pandas, NumPy, LightGBM, Scikit-learn
  • Web App: Flask
  • CI/CD: Jenkins, Docker, GitHub
  • Cloud: Google Cloud Platform (GCP) - Storage Buckets, Container Registry
  • Experiment Tracking: MLflow
  • Version Control: Git & GitHub

Architecture & Pipeline

  1. Data Ingestion: Raw hotel booking data stored in GCP buckets is ingested via Python scripts.
  2. Data Processing: Cleaning, preprocessing, and feature engineering applied to prepare training datasets.
  3. Model Training: LightGBM model is trained, evaluated, and saved to artifacts/models.
  4. MLflow Tracking: All experiments, metrics, and models are tracked for reproducibility.
  5. Deployment:
    • Flask app serves predictions.
    • Docker image built and pushed to GCP Container Registry.
    • Deployed with full CI/CD pipeline using Jenkins.
  6. Automation & Security:
    • Jenkins pipelines handle automated builds, tests, and deployments.
    • IAM roles and service accounts secure GCP resources.

Installation

  1. Clone the repository:
git clone https://github.com/Subrat1920/Hotel-Reservation-Cancellation-Detection-MLOps.git
cd Hotel-Reservation-Cancellation-Detection-MLOps
  2. Install dependencies:
pip install -r requirements.txt
  3. Set environment variables for GCP authentication (if using service accounts):
export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account.json"
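
Before running any step that talks to GCP, it can help to confirm that the variable is set and points at a real file. The helper below is a minimal sketch; `check_gcp_credentials` is a hypothetical name and is not part of the repository.

```python
import os


def check_gcp_credentials(env=None):
    """Return the key-file path if the variable is set and the file exists."""
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Service-account key not found: {path}")
    return path
```

Calling `check_gcp_credentials()` at the start of a script fails fast with a clear message instead of a later authentication error from the client library.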

Usage

A. Local Execution (Training & Prediction)

  1. Start the Flask app:
python app.py
  2. Access the app at http://localhost:5000 to make predictions.
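
For readers who want to query the trained model outside the web form, the sketch below loads the saved model file and scores a single booking. It assumes the file was written with `pickle` (if it was saved with joblib, `joblib.load` would be needed instead) and that the model exposes a scikit-learn style `predict` method. The helper names and the feature order are hypothetical; the real feature list depends on the project's preprocessing.

```python
import pickle

MODEL_PATH = "artifacts/models/lgbm_model.pkl"  # path from the project structure


def load_model(path=MODEL_PATH):
    """Load a pickled model. Only unpickle files from a trusted source."""
    with open(path, "rb") as f:
        return pickle.load(f)


def predict_one(model, features):
    """Predict for a single booking given its features as a flat list."""
    # scikit-learn style estimators expect 2-D input: one row per sample.
    return model.predict([features])[0]
```

The feature values passed to `predict_one` must be in the same order and encoding as the columns used during training.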

B. Dockerized Deployment

  1. Build the Docker image:
docker build -t hotel-reservation-prediction .
  2. Run the container:
docker run -p 8080:8080 hotel-reservation-prediction
  3. Access at http://localhost:8080.
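
A container can take a few seconds to start answering requests. The following sketch polls the URL until it returns HTTP 200; `wait_for_app` is a hypothetical helper built only on the standard library, and the retry counts are arbitrary defaults.

```python
import time
import urllib.request


def wait_for_app(url, attempts=10, delay=1.0, timeout=2.0):
    """Poll url until it answers with HTTP 200, or give up after `attempts`."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # URLError and HTTPError are OSError subclasses, so this covers
            # refused connections, timeouts, and HTTP error responses.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For the container started above, the call would be `wait_for_app("http://localhost:8080")`.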

C. CI/CD Pipeline

The Jenkinsfile handles automated cloning, testing, building, pushing the Docker image to GCP, and deployment.

Project Structure

├── 📁 artifacts/
│   ├── 📁 models/
│   │   └── 📄 lgbm_model.pkl
│   ├── 📁 processed/
│   │   ├── 📄 process_test.csv
│   │   └── 📄 process_train.csv
│   └── 📁 raw/
│       ├── 📄 raw.csv
│       ├── 📄 test.csv
│       └── 📄 train.csv
├── 📁 config/
│   ├── 🐍 __init__.py
│   ├── ⚙️ config.yaml
│   ├── 🐍 model_params.py
│   └── 🐍 path_config.py
├── 📁 custom_jenkins/
│   └── 🐳 Dockerfile
├── 📁 pipeline/
│   ├── 🐍 __init__.py
│   └── 🐍 training_pipeline.py
├── 📁 src/
│   ├── 🐍 __init__.py
│   ├── 🐍 data_ingestion.py
│   ├── 🐍 data_processing.py
│   ├── 🐍 exception.py
│   ├── 🐍 logger.py
│   └── 🐍 model_training.py
├── 📁 static/
│   └── 🎨 style.css
├── 📁 templates/
│   └── 🌐 index.html
├── 📁 utils/
│   ├── 🐍 __init__.py
│   └── 🐍 common_functions.py
├── 🚫 .gitignore
├── 🐳 Dockerfile
├── 📄 Jenkinsfile
├── 📖 README.md
├── 🐍 app.py
├── 📄 requirements.txt
└── 🐍 setup.py
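
A module such as `config/path_config.py` typically centralizes these locations so that the other scripts do not hard-code them. The sketch below shows one way this could look. The directory and file names come from the tree above, but the constant names and the `ensure_artifact_dirs` helper are illustrative assumptions, not the repository's actual contents.

```python
import os

# Layout taken from the project tree; constant names are illustrative.
ARTIFACTS_DIR = "artifacts"

RAW_DIR = os.path.join(ARTIFACTS_DIR, "raw")
RAW_FILE_PATH = os.path.join(RAW_DIR, "raw.csv")
TRAIN_FILE_PATH = os.path.join(RAW_DIR, "train.csv")
TEST_FILE_PATH = os.path.join(RAW_DIR, "test.csv")

PROCESSED_DIR = os.path.join(ARTIFACTS_DIR, "processed")
PROCESSED_TRAIN_PATH = os.path.join(PROCESSED_DIR, "process_train.csv")
PROCESSED_TEST_PATH = os.path.join(PROCESSED_DIR, "process_test.csv")

MODEL_DIR = os.path.join(ARTIFACTS_DIR, "models")
MODEL_OUTPUT_PATH = os.path.join(MODEL_DIR, "lgbm_model.pkl")

CONFIG_PATH = os.path.join("config", "config.yaml")


def ensure_artifact_dirs(base="."):
    """Create the artifact directories under base if they are missing."""
    for directory in (RAW_DIR, PROCESSED_DIR, MODEL_DIR):
        os.makedirs(os.path.join(base, directory), exist_ok=True)
```

Keeping every path in one place means a renamed directory requires a change in a single file.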

Contributing

  1. Fork the repository.
  2. Create a feature branch: git checkout -b feature-name
  3. Commit your changes: git commit -m "Add feature"
  4. Push to the branch: git push origin feature-name
  5. Create a Pull Request.
