Acquarts/ml-medical-insurance-costs-predictor-app

Web application to predict medical insurance costs using Machine Learning, deployed on Google Cloud Run.

🏥 Medical Insurance Cost Predictor

Python
Streamlit
scikit-learn
Google Cloud
Docker

🔗 Live Demo: insurance-predictor-562289298058.us-central1.run.app

✨ Features

  • ML-based medical insurance cost prediction
  • Interactive web interface with Streamlit
  • Gradient Boosting regression model with an R² score of 0.90 (R² measures explained variance, not classification accuracy)
  • Deployed on Google Cloud Run

🛠️ Tech Stack

| Category   | Technologies                           |
|------------|----------------------------------------|
| ML         | scikit-learn, XGBoost, pandas, numpy   |
| Web        | Streamlit                              |
| Cloud      | Google Cloud Run, Cloud Build          |
| Containers | Docker                                 |

📁 Project Structure

├── app.py                 # Streamlit application
├── train.py               # Training script
├── requirements.txt       # Python dependencies
├── Dockerfile             # Container for Cloud Run
├── Dockerfile.training    # Container for training
├── .env                   # Environment variables (don't push to git)
├── data/
│   └── insurance.csv      # Dataset
└── model/
    ├── model.joblib       # Trained model
    └── feature_names.joblib

🚀 Local Installation

# 1. Clone repository
git clone https://github.com/Acquarts/ml-medical-insurance-costs-predictor-app.git
cd ml-medical-insurance-costs-predictor-app

# 2. Create virtual environment
python -m venv venv
venv\Scripts\activate  # Windows
# source venv/bin/activate  # Linux/Mac

# 3. Install dependencies
pip install -r requirements.txt

# 4. Download dataset from Kaggle
# https://www.kaggle.com/datasets/mirichoi0218/insurance
# Save as data/insurance.csv

# 5. Train model (optional; a trained model is already included in model/)
python train.py --data-path=data/insurance.csv --model-dir=model

# 6. Run application
streamlit run app.py

The app will be available at http://localhost:8501
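Step 5 passes two flags to train.py. The repository's own argument handling is not shown here, so the following is only a minimal sketch of how a script could accept those flags with `argparse`. The function names `parse_args` and `artifact_paths` and the default values are assumptions made for illustration.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the two flags used in step 5 (defaults are assumptions)."""
    parser = argparse.ArgumentParser(description="Train the insurance cost model")
    parser.add_argument(
        "--data-path",
        type=Path,
        default=Path("data/insurance.csv"),
        help="CSV file containing the insurance dataset",
    )
    parser.add_argument(
        "--model-dir",
        type=Path,
        default=Path("model"),
        help="Directory where the trained artifacts are written",
    )
    return parser.parse_args(argv)


def artifact_paths(model_dir):
    """Return the two artifact paths listed under Project Structure."""
    return model_dir / "model.joblib", model_dir / "feature_names.joblib"


# Same flags as the command in step 5, passed explicitly for illustration
args = parse_args(["--data-path=data/insurance.csv", "--model-dir=model"])
model_path, features_path = artifact_paths(args.model_dir)
```

Passing `argv` explicitly keeps the parser easy to exercise in tests; calling `parse_args()` with no argument reads the real command line instead.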

☁️ Deploy to Google Cloud Run

Requirements

  • Google Cloud account with billing enabled
  • gcloud CLI installed and configured

Steps

# 1. Set project
gcloud config set project YOUR-PROJECT-ID

# 2. Enable APIs
gcloud services enable cloudbuild.googleapis.com run.googleapis.com storage.googleapis.com containerregistry.googleapis.com

# 3. Build image in the cloud
gcloud builds submit --tag gcr.io/YOUR-PROJECT-ID/insurance-app .

# 4. Deploy to Cloud Run
gcloud run deploy insurance-predictor --image gcr.io/YOUR-PROJECT-ID/insurance-app --platform managed --region us-central1 --allow-unauthenticated --memory 1Gi --port 8080
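Cloud Run tells the container which port to listen on through the `PORT` environment variable, while Streamlit listens on 8501 by default, so the container has to start Streamlit on the Cloud Run port. The project's Dockerfile is not shown here; the sketch below is one hypothetical way to build the start command in Python. The helper name `streamlit_command` is an assumption, and the fallback of 8080 simply mirrors the `--port 8080` flag in step 4.

```python
import os


def streamlit_command(env=None, script="app.py"):
    """Build a `streamlit run` command that listens on the Cloud Run port."""
    env = os.environ if env is None else env
    port = int(env.get("PORT", "8080"))  # 8080 mirrors the --port flag in step 4
    return [
        "streamlit", "run", script,
        f"--server.port={port}",
        "--server.address=0.0.0.0",  # accept traffic from outside the container
        "--server.headless=true",    # do not try to open a browser
    ]
```

A launcher script could then hand the result to `subprocess.run(streamlit_command(), check=True)`.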

📊 ML Model

Performance

| Metric   | Value  |
|----------|--------|
| R² Score | 0.90   |
| MAE      | $2,530 |
| RMSE     | $4,269 |
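For readers unfamiliar with these metrics, here is a small dependency-free sketch of how R², MAE, and RMSE are defined. It is not the project's evaluation code, and the numbers in the usage lines are toy values, not results from this model. In practice `r2_score`, `mean_absolute_error`, and `mean_squared_error` from `sklearn.metrics` cover the same ground.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R², MAE, and RMSE for paired lists of actual and predicted costs."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,   # share of variance explained
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Toy values for illustration only
metrics = regression_metrics([1000, 2000, 3000, 4000], [1100, 1900, 3200, 3800])
```

Because RMSE squares each error before averaging, it is always at least as large as MAE, which is consistent with the $4,269 and $2,530 reported above.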

Feature Importance

  1. 🚬 Smoker (~70%)
  2. ⚖️ BMI (~15%)
  3. 📅 Age (~10%)
  4. 📝 Other (~5%)
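When categorical inputs are one-hot encoded, a tree model reports one importance per encoded column, so a per-variable ranking like the one above requires summing the columns back together. The sketch below shows that aggregation. The `<variable>_<category>` column naming and the split of the remaining ~5% are assumptions for illustration; only the Smoker, BMI, and Age shares come from the list above.

```python
from collections import defaultdict


def group_importances(importances, sep="_"):
    """Sum per-column importances back onto the original input variable.

    Assumes one-hot columns are named "<variable><sep><category>",
    for example "smoker_yes"; that naming is hypothetical.
    """
    grouped = defaultdict(float)
    for column, value in importances.items():
        variable = column.split(sep, 1)[0]
        grouped[variable] += value
    # Highest share first
    return dict(sorted(grouped.items(), key=lambda item: item[1], reverse=True))


# Illustrative values: the last four entries are an invented split of "Other"
example = {
    "smoker_yes": 0.70,
    "bmi": 0.15,
    "age": 0.10,
    "children": 0.02,
    "sex_male": 0.01,
    "region_southeast": 0.01,
    "region_southwest": 0.01,
}
ranked = group_importances(example)
```

With a fitted scikit-learn model, the input dictionary would typically be built by pairing the saved feature names with the model's `feature_importances_` attribute.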

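📋 Input Variables

| Variable | Type  | Description                                       |
|----------|-------|---------------------------------------------------|
| age      | int   | Age (18-100)                                      |
| sex      | str   | Sex (Male/Female)                                 |
| bmi      | float | Body Mass Index                                   |
| children | int   | Number of children (0-5)                          |
| smoker   | str   | Smoker (Yes/No)                                   |
| region   | str   | Region (Northeast/Northwest/Southeast/Southwest)  |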
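Before a prediction, the six inputs have to be validated and turned into a numeric row whose columns are in the order the model was trained on, which is presumably what model/feature_names.joblib stores. The sketch below illustrates one way to do this under stated assumptions: the class `InsuranceInput`, the helper `to_feature_row`, the `<variable>_<category>` column names, and the `bmi > 0` check are all hypothetical, while the other ranges follow the table above.

```python
from dataclasses import dataclass

REGIONS = ("Northeast", "Northwest", "Southeast", "Southwest")


@dataclass(frozen=True)
class InsuranceInput:
    age: int
    sex: str
    bmi: float
    children: int
    smoker: str
    region: str

    def validate(self):
        """Raise ValueError when a field falls outside the documented values."""
        if not 18 <= self.age <= 100:
            raise ValueError("age must be between 18 and 100")
        if self.sex not in ("Male", "Female"):
            raise ValueError("sex must be 'Male' or 'Female'")
        if self.bmi <= 0:  # assumption: the table gives no BMI range
            raise ValueError("bmi must be positive")
        if not 0 <= self.children <= 5:
            raise ValueError("children must be between 0 and 5")
        if self.smoker not in ("Yes", "No"):
            raise ValueError("smoker must be 'Yes' or 'No'")
        if self.region not in REGIONS:
            raise ValueError(f"region must be one of {REGIONS}")


def to_feature_row(item, feature_names):
    """Return values ordered like feature_names; absent columns become 0.0."""
    item.validate()
    values = {
        "age": float(item.age),
        "bmi": float(item.bmi),
        "children": float(item.children),
        f"sex_{item.sex.lower()}": 1.0,
        f"smoker_{item.smoker.lower()}": 1.0,
        f"region_{item.region.lower()}": 1.0,
    }
    return [values.get(name, 0.0) for name in feature_names]


# Hypothetical column order; the real one would come from feature_names.joblib
names = ["age", "bmi", "children", "sex_male", "smoker_yes",
         "region_northwest", "region_southeast", "region_southwest"]
row = to_feature_row(InsuranceInput(35, "Female", 27.5, 2, "Yes", "Southeast"), names)
```

Aligning the row to a saved list of feature names guards against silently feeding columns to the model in the wrong order.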

💰 Estimated GCP Costs

| Service            | Approximate Cost |
|--------------------|------------------|
| Cloud Run          | ~$0-5/month      |
| Cloud Build        | ~$0.003/build    |
| Container Registry | ~$0.10/GB        |
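The rates above can be combined into a rough monthly figure. The following is only a back-of-the-envelope sketch: the function name and its parameters are hypothetical, the storage rate is treated as a monthly charge (an assumption, since the table does not say), and actual billing depends on current GCP pricing and free tiers.

```python
def estimate_monthly_cost(builds, image_gb, cloud_run_usd=5.0):
    """Estimate monthly GCP spend in USD from the approximate rates above."""
    build_cost = builds * 0.003     # Cloud Build: ~$0.003 per build
    storage_cost = image_gb * 0.10  # Container Registry: ~$0.10 per GB (assumed monthly)
    return round(cloud_run_usd + build_cost + storage_cost, 2)


# Upper end of the Cloud Run range, 20 builds, and a 0.5 GB image
high = estimate_monthly_cost(builds=20, image_gb=0.5)
# Lower end of the Cloud Run range with the same builds and image
low = estimate_monthly_cost(builds=20, image_gb=0.5, cloud_run_usd=0.0)
```

At this scale the Cloud Run line dominates; build and storage charges add only a few cents.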

📂 Dataset

Medical Cost Personal Dataset from Kaggle:
https://www.kaggle.com/datasets/mirichoi0218/insurance

👤 Author

Adrian Zambrana

📄 License

MIT License