Amirreza81/Graduate-Admission-Prediction
A Comparison of Regression Models for Predicting Graduate Admissions
Graduate Admission Prediction
📌 Overview
This repository presents a machine learning–based system for predicting graduate admission chances. The work is based on the research paper “A Comparison of Regression Models for Prediction of Graduate Admissions” (Acharya, Armaan, Antony, 2019) and supporting materials.
The project evaluates multiple regression models to estimate the probability of admission to graduate school, which helps students classify universities as ambitious, moderate, or safe choices. Unlike existing predictors that rely on outdated or limited data, this approach uses an originally curated dataset and systematically compares model performance using error metrics.
👉 Note: This project is developed as part of the Applied Data Science course at Sharif University of Technology.
👉 Repository Link: GitHub Repo
🎯 Problem Statement
Students applying for Master’s programs often face difficulty in selecting universities:
- Unreliable guidance from consultancies or seniors.
- Limited predictors that rely only on past admissions.
- Subjectivity in admissions criteria that vary yearly.
Objective
- Develop a machine learning solution that predicts the likelihood of admission.
- Provide data-driven insights for students to make better application decisions.
- Understand how each parameter (GRE, TOEFL, GPA, SOP, LOR, Research Experience) influences the admission probability.
📂 Dataset
- Sourced from Kaggle: Graduate Admission Dataset (Acharya, 2018).
- Parameters included:
  - GRE Score
  - TOEFL Score
  - University Rating
  - Statement of Purpose (SOP)
  - Letter of Recommendation (LOR)
  - Undergraduate GPA (CGPA)
  - Research Experience (binary: 0/1)
Preprocessing
- Features are normalized to a common scale.
- The attributes are a balanced mix of categorical and numerical values.
- The dataset primarily reflects Indian student profiles but can be adapted to other grading systems.
- A second version of the dataset is planned with 200+ additional profiles.
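The normalization step can be illustrated with a minimal sketch. The source does not say which scheme was used, so min-max scaling to the [0, 1] range is an assumption here, and `min_max_scale` and the sample values are hypothetical names and data chosen only for illustration.

```python
def min_max_scale(column):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread to rescale; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical applicant values, not rows from the actual dataset.
gre_scores = [300, 320, 340]
cgpa = [7.0, 8.5, 10.0]

print(min_max_scale(gre_scores))  # [0.0, 0.5, 1.0]
print(min_max_scale(cgpa))        # [0.0, 0.5, 1.0]
```

Scaling matters most for kernel-based models such as SVR, where features on very different ranges (GRE near 300, CGPA near 10) would otherwise dominate distance computations unevenly.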
⚙️ Methodology
The study compares four regression models:
- Linear Regression
  - Multiple linear regression is applied.
  - Performs strongly because of linear dependencies in the dataset.
- Support Vector Regression (SVR)
  - RBF (Gaussian) and polynomial kernels were tested.
  - A degree-3 polynomial kernel was chosen to balance bias and variance.
- Decision Tree Regression
  - Explored different max-features settings for splits, including sqrt and log2.
  - Performed better with reduced features, but remained prone to overfitting.
- Random Forest Regression
  - An ensemble of decision trees.
  - Tuned via the number of estimators (best at 225).
  - Captured some non-linear interactions effectively.
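A comparison along these lines could be set up roughly as follows with scikit-learn. This is a sketch under stated assumptions, not the repository's or the paper's actual code: the data is synthetic, and the train/test split, random seeds, and every hyperparameter other than the degree-3 polynomial kernel and the 225 estimators mentioned above are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): 7 scaled features and a mostly
# linear target in [0, 1]. Replace with the real dataset to reproduce results.
rng = np.random.default_rng(0)
X = rng.random((400, 7))
weights = np.array([0.2, 0.15, 0.05, 0.05, 0.1, 0.4, 0.05])
y = np.clip(X @ weights + rng.normal(0, 0.05, size=400), 0, 1)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Linear Regression": LinearRegression(),
    "SVR (poly, degree 3)": SVR(kernel="poly", degree=3),
    "Decision Tree": DecisionTreeRegressor(max_features="sqrt", random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=225, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = (mean_squared_error(y_test, pred), r2_score(y_test, pred))
    print(f"{name}: MSE={results[name][0]:.4f}, R2={results[name][1]:.3f}")
```

Scores printed by this sketch describe the synthetic data only and say nothing about performance on the actual admissions dataset.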
Evaluation Metrics
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- Mean Squared Logarithmic Error (MSLE)
- R² Score
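For reference, the four metrics can be written out in plain Python. This is a minimal sketch using their standard definitions; the function names and the sample values are illustrative and not taken from the repository.

```python
import math


def mse(y_true, y_pred):
    """Mean of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of MSE, expressed in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))


def msle(y_true, y_pred):
    """MSE computed on log(1 + value); inputs must be greater than -1."""
    return sum(
        (math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)
    ) / len(y_true)


def r2_score(y_true, y_pred):
    """1 minus the ratio of residual to total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical admission chances and predictions, for illustration only.
y_true = [0.9, 0.7, 0.5]
y_pred = [0.8, 0.7, 0.6]
print(mse(y_true, y_pred), rmse(y_true, y_pred))
print(msle(y_true, y_pred), r2_score(y_true, y_pred))
```

Lower is better for MSE, RMSE, and MSLE, while an R² closer to 1 means the model explains more of the variance in the target.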
📊 Results
| Model | MSE | R² Score |
|---|---|---|
| Linear Regression | 0.0048 | 0.725 |
| Support Vector Regression | 0.0072 | 0.644 |
| Decision Tree Regression | 0.0087 | 0.501 |
| Random Forest Regression | 0.0058 | 0.660 |
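The table reports MSE but not RMSE. Since RMSE is the square root of MSE, it can be derived from the reported values, as in the sketch below. The dictionary names are illustrative, and the derived figures are computed here rather than taken from the source.

```python
import math

# MSE values copied from the results table.
reported_mse = {
    "Linear Regression": 0.0048,
    "Support Vector Regression": 0.0072,
    "Decision Tree Regression": 0.0087,
    "Random Forest Regression": 0.0058,
}

# RMSE is the square root of MSE, in the same units as the target.
rmse_by_model = {name: math.sqrt(value) for name, value in reported_mse.items()}

# Print models from lowest to highest error.
for name, value in sorted(rmse_by_model.items(), key=lambda kv: kv[1]):
    print(f"{name}: RMSE ~ {value:.3f}")
```

Assuming the target is an admission chance on a 0 to 1 scale, an RMSE of about 0.069 for Linear Regression corresponds to a typical error of roughly 7 percentage points.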
Key Findings
- Linear Regression performed best overall, with the lowest error (MSE 0.0048) and the highest R² (0.725).
- Random Forest ranked second on both metrics (MSE 0.0058, R² 0.660) and was useful for capturing non-linearities.
- Decision Trees underperformed due to overfitting.
- SVR was sensitive to kernel choice and polynomial degree.
✅ Conclusion
- Linear Regression is most suitable for this dataset, owing to linear relationships between parameters and admission chances.
- Random Forest is the strongest alternative among the remaining models.
- Results indicate that higher test scores, a higher GPA, and strong recommendations are closely associated with higher admission chances.
- Outliers (low profiles admitted to top schools) slightly affect model performance but reflect real-world unpredictability.
🔮 Future Work
- Expand Dataset: Add diverse and international student profiles.
- Handle Outliers: Introduce more atypical cases to reduce bias.
- Advanced Models: Explore Deep Neural Networks to capture non-linear and subjective admission factors.
- Deployment: Build a user-friendly web application where students can input their profiles and instantly receive admission predictions.
📖 References
- Mohan S. Acharya, Asfia Armaan, Aneeta S. Antony, “A Comparison of Regression Models for Prediction of Graduate Admissions”, IEEE ICCIDS, 2019. DOI:10.1109/ICCIDS.2019.8862140
- Dataset: Graduate Admissions - Kaggle