Adhil-Payingalil/ML_loan_approval_prediction

Comparative analysis of 5 ML models for Loan Approval Prediction, assessing the impact of preprocessing and RFE feature selection.

Loan Approval Prediction - A Comparative ML Study

Overview

This project applies machine learning techniques to predict loan approval status using the Kaggle "Loan Status Prediction" dataset. It demonstrates:

  • Comprehensive data preprocessing (imputation, encoding, scaling).
  • Handling class imbalance using SMOTE.
  • Feature selection using Recursive Feature Elimination (RFE).
  • Comparative analysis of five classification models (Logistic Regression, SVM, Naive Bayes, Random Forest, Decision Tree) on both full and RFE-selected feature sets.

Key Techniques & Technologies

  • Python: Pandas, NumPy, Scikit-learn, imbalanced-learn (imblearn)
  • ML Techniques: Data Cleaning, EDA, SMOTE, RFE, Model Evaluation (Accuracy, Precision, Recall, F1, ROC AUC), Classification Algorithms.
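As an illustration of the evaluation metrics listed above, the sketch below computes all five with scikit-learn on a held-out split. The synthetic dataset from `make_classification` and the Logistic Regression classifier are stand-ins chosen for the example, not the project's data or final model.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic, mildly imbalanced data standing in for the loan dataset.
X, y = make_classification(n_samples=500, n_features=10,
                           weights=[0.7, 0.3], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)               # hard labels for threshold metrics
y_score = clf.predict_proba(X_test)[:, 1]  # positive-class probability for ROC AUC

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_score),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

ROC AUC is computed from predicted probabilities rather than hard labels, which is why the sketch keeps `y_pred` and `y_score` separate.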

Key Finding Highlight

The study found that using all features generally yielded slightly higher accuracy, with SVM and Naive Bayes performing best at ~74%. RFE, however, significantly improved Logistic Regression's performance (73% accuracy), showing that the effect of feature selection depends on the algorithm.
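A comparison of this kind could be set up roughly as follows. This is a sketch on synthetic data, so its numbers will not match the figures quoted above. It also assumes one particular design: a single RFE selector driven by Logistic Regression picks the feature subset, and every model is then trained on that subset, since RBF-kernel SVM and Naive Bayes expose neither coefficients nor feature importances for RFE to rank. The project itself may have done this differently.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan dataset.
X, y = make_classification(n_samples=600, n_features=12,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Fit the selector on training data only so the test set stays unseen.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=6)
selector.fit(X_train, y_train)
X_train_rfe = selector.transform(X_train)
X_test_rfe = selector.transform(X_test)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    # clone() gives a fresh, unfitted copy for each feature set.
    full = clone(model).fit(X_train, y_train)
    reduced = clone(model).fit(X_train_rfe, y_train)
    results[name] = (
        accuracy_score(y_test, full.predict(X_test)),
        accuracy_score(y_test, reduced.predict(X_test_rfe)),
    )

for name, (acc_full, acc_rfe) in results.items():
    print(f"{name}: all features={acc_full:.3f}, RFE={acc_rfe:.3f}")
```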


Link to Project Documentation: https://lapis-school-f5e.notion.site/Loan-Approval-Prediction-A-Comparative-Study-1ecca101e469809d8b9feab397c40a47?pvs=4

Languages

Jupyter Notebook 100.0%

Created May 7, 2025
Updated May 7, 2025