UniPredict
A Machine Learning-Based University Degree Recommendation System
Note
This repository contains the complete source code for UniPredict, a tool designed to forecast university admission Z-score cutoffs and provide personalized degree recommendations.
Project Structure
- backend/: Houses the backend API and related services.
- frontend/: Contains the user interface and all frontend logic.
- ml/: Includes the Machine Learning module for model training and predictions.
- data/: Stores data files used across the project.
- infra/: Holds infrastructure configurations.
Machine Learning Module
For more information, refer to the ML-specific README.
Overview
The ML module is the core of UniPredict, leveraging a LightGBM regression model to predict university admission Z-score cutoffs. It provides students with personalized degree recommendations based on historical data from Sri Lankan universities.
Key Features
- Z-Score Prediction: Forecasts admission cutoffs for specific degree programs.
- Smart Recommendations: Identifies the top 5 most accessible degrees based on a user's Z-score.
- Multi-Stream Support: Accommodates various academic streams, including Biological Science, Physical Science, Commerce, Arts, and Technology.
- District-Level Analysis: Accounts for geographical differences in admission patterns.
- Robust Input Validation: Ensures data integrity through comprehensive validation, including fuzzy matching for user inputs.
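The fuzzy matching mentioned above can be illustrated with Python's standard `difflib` module. This is a minimal sketch, not the project's actual validation code: the helper name `normalize_choice` and the `cutoff` value of 0.6 are assumptions made for illustration, and only the stream names come from the feature list above.

```python
from difflib import get_close_matches

# Stream names taken from the "Multi-Stream Support" feature above.
STREAMS = [
    "Biological Science",
    "Physical Science",
    "Commerce",
    "Arts",
    "Technology",
]


def normalize_choice(raw, valid_options, cutoff=0.6):
    """Return the valid option closest to `raw`, or raise ValueError.

    Hypothetical helper: the real predict.py may validate differently.
    """
    # Compare case-insensitively, then map back to the canonical spelling.
    lowered = {option.lower(): option for option in valid_options}
    matches = get_close_matches(
        raw.strip().lower(), list(lowered), n=1, cutoff=cutoff
    )
    if not matches:
        raise ValueError(f"Unrecognized value: {raw!r}")
    return lowered[matches[0]]
```

A slightly misspelled input such as `"physical sciense"` would resolve to the canonical `"Physical Science"`, while an input with no close match raises an error instead of silently guessing.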
Folder Structure
ml/
├── cops/ # Z-score cutoff data, organized by year
├── model/ # Trained LightGBM model (model.joblib)
├── perfs/ # Student performance data, categorized by stream and year
├── processed_datasets/ # Cleaned and feature-engineered datasets
├── streams/ # Data mapping degrees to academic streams
├── predict.py # Core prediction and recommendation logic
└── explore.ipynb # Jupyter Notebook for data analysis and exploration
How to Run
- Install Dependencies:
  uv sync
  Alternatively, you can install packages individually:
  pip install pandas lightgbm scikit-learn joblib
- Process Data & Train Model:
  Execute the data processing pipeline to prepare the datasets and train the model.
- Make Predictions:
  python predict.py
- Explore Data:
  Launch Jupyter and open explore.ipynb to delve into the data analysis process.
Core Functions
- predict(degree, stream, district): Predicts the Z-score cutoff for a specified degree.
- get_top_5_accessible_degrees(user_z_score, stream, district): Returns a ranked list of the most accessible degrees for a user.
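The ranking step can be sketched roughly as follows. This is an illustration under stated assumptions rather than the code in predict.py: the function name `top_accessible_degrees` is hypothetical, the cutoffs are passed in as a plain dictionary (in the real module they would presumably come from `predict`), and it is assumed that only degrees the user's Z-score meets are listed and that a larger margin means a more accessible degree. The `margin` field follows the API response example in this README, where it equals the user's Z-score minus the predicted cutoff.

```python
def top_accessible_degrees(user_z_score, predicted_cutoffs, limit=5):
    """Rank degrees whose predicted cutoff the user meets, by margin.

    `predicted_cutoffs` maps degree name -> predicted Z-score cutoff.
    """
    results = []
    for degree, cutoff in predicted_cutoffs.items():
        # Margin as in the API response: user Z-score minus predicted cutoff.
        margin = round(user_z_score - cutoff, 4)
        if margin >= 0:  # assumption: only degrees within reach are listed
            results.append(
                {"degree": degree, "predicted_cutoff": cutoff, "margin": margin}
            )
    # Assumption: a larger margin means a more accessible degree.
    results.sort(key=lambda item: item["margin"], reverse=True)
    return results[:limit]
```

Keeping the ranking separate from the model call makes it easy to test with hand-written cutoffs before wiring it to the trained model.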
Tech Stack
- Algorithm: LightGBM Regressor
- Libraries: pandas, lightgbm, scikit-learn, joblib
- Key Features: District, Stream, Degree, Time Index, Pass Rate
- Validation: Time-based splitting with MAE and R² metrics for model evaluation.
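To make the validation approach concrete, here is a small pure-Python sketch of a time-based split together with the two metrics. It is not the project's training code: the row layout (a `year` key on each record) and the helper names are assumptions, and in practice the equivalent scikit-learn functions `sklearn.metrics.mean_absolute_error` and `sklearn.metrics.r2_score` would likely be used instead.

```python
def time_based_split(rows, first_test_year):
    """Train on years before `first_test_year`, test on that year onward.

    Splitting by time (rather than randomly) keeps future data out of training.
    """
    train = [row for row in rows if row["year"] < first_test_year]
    test = [row for row in rows if row["year"] >= first_test_year]
    return train, test


def mae(actual, predicted):
    """Mean absolute error: average size of the prediction error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot
```

An R² of 1.0 means the predictions match the held-out cutoffs exactly, while 0.0 means they do no better than always predicting the mean.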
Backend Module
For more information, refer to the Backend-specific README.
API Endpoint: /recommend/
This POST endpoint returns up to 5 of the most accessible university degree programs for a student's Z-score, academic stream, and district.
Tech Stack
- Framework: FastAPI
API Documentation
The interactive API documentation is available at:
GET /docs/
Endpoint URL
POST /recommend/
Warning
The request body must adhere to the AccessibleDegrees model format.
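As a rough picture of what that model enforces, the sketch below uses a standard-library dataclass as a stand-in. The actual `AccessibleDegrees` model lives in the backend and is most likely defined differently (FastAPI applications typically use Pydantic models); the class name `AccessibleDegreesRequest`, the `from_json` helper, and the exact checks are assumptions, with only the three field names taken from the request body shown in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessibleDegreesRequest:
    """Stand-in for the backend's AccessibleDegrees model (assumed fields)."""

    user_z_score: float
    stream: str
    district: str

    @classmethod
    def from_json(cls, payload):
        """Validate a decoded JSON object and build a request from it."""
        expected = {"user_z_score", "stream", "district"}
        # Reject missing or unexpected keys before checking types.
        if set(payload) != expected:
            raise ValueError(
                f"Expected keys {sorted(expected)}, got {sorted(payload)}"
            )
        z = payload["user_z_score"]
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(z, bool) or not isinstance(z, (int, float)):
            raise ValueError("user_z_score must be a number")
        for key in ("stream", "district"):
            if not isinstance(payload[key], str):
                raise ValueError(f"{key} must be a string")
        return cls(float(z), payload["stream"], payload["district"])
```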
Request Body (JSON)
{
"user_z_score": 1.5,
"stream": "Physical Science",
"district": "Colombo"
}
Example Response (JSON)
A list of up to 5 degree programs with their predicted cutoff scores:
{
"recommend": [
{
"degree": "ENGINEERING UNIVERSITY OF PERADENIYA",
"predicted_cutoff": 1.2125,
"margin": 0.2875
},
{
"degree": "ENGINEERING UNIVERSITY OF SRI JAYEWARDENEPURA",
"predicted_cutoff": 1.2125,
"margin": 0.2875
},
// ... other recommendations
]
}
Folder Structure
backend/
├── api/
│ ├── models/
│ ├── routers/
│ └── services/
├── main.py
├── pyproject.toml
└── README.md
Backend Setup & Installation (using uv)
- Navigate to the Backend Directory:
  cd backend
- Create and Activate a Virtual Environment (Recommended):
  python3 -m venv .venv
  source .venv/bin/activate
- Install Dependencies:
  uv pip install -r requirements.txt
  Or, to sync all dependencies from uv.lock:
  uv sync
- Start the FastAPI Server:
  uvicorn main:app --reload
- Access API Documentation:
  Open http://localhost:8000/docs in your browser.
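Once the server is running, the /recommend/ endpoint can also be called from a script. The following is a minimal client sketch using only the Python standard library; the helper names `build_recommend_request` and `fetch_recommendations` are hypothetical, and it assumes the server listens on http://localhost:8000 as in the step above and responds with a JSON object containing a `recommend` list, as in the example response.

```python
import json
from urllib import request


def build_recommend_request(
    user_z_score, stream, district, base_url="http://localhost:8000"
):
    """Build (but do not send) a POST request for /recommend/."""
    body = json.dumps(
        {"user_z_score": user_z_score, "stream": stream, "district": district}
    ).encode("utf-8")
    return request.Request(
        f"{base_url}/recommend/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_recommendations(req):
    """Send the request and return the list under the "recommend" key."""
    with request.urlopen(req) as response:
        return json.load(response)["recommend"]
```

For example, `fetch_recommendations(build_recommend_request(1.5, "Physical Science", "Colombo"))` would send the request body shown earlier, provided the backend is up.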
Frontend Module
For more information, refer to the Frontend-specific README.
Overview
The UniPredict frontend is a React-based web application that provides a user-friendly interface for students to interact with the prediction model. It connects to the backend API and displays the results in a clear and intuitive dashboard.
Tech Stack
- Framework: React
- Styling: Bootstrap
- API Communication: Connects to the FastAPI backend.
Folder Structure
frontend/
├── public/
├── src/
│ ├── components/
│ ├── App.js
│ ├── index.js
│ └── index.css
├── package.json
└── README.md
Setup & Installation
- Navigate to the Frontend Directory:
  cd frontend
- Install Dependencies:
  npm install
- Start the Development Server:
  npm start
- Access the Application:
  Open http://localhost:3000 in your browser.
Contributors
| Name | Contribution |
|---|---|
| 22ug1-0480 H.A.L.Ruwanya | Team Lead, Frontend development |
| 22ug1-0238 R.K.N.R. Ranasinghe | Backend development, Backend model integration |
| 22ug1-0093 M C R Mallawaarchchi | Jupyter notebook (model training), assisted with data gathering |
| 22ug1-0499 Dunal Senitha De Mel | Jupyter notebook (data preprocessing), assisted with data gathering |
| 22ug1-0587 N.M.R.D.Narasingha | Data gathering, Documentation, Frontend development |
| 22ug1-0134 W.A.D.R. Weerasinghe | Data gathering, Model development (Sort function), Presentation, Documentation |
| 22ug1-0849 S.M.A.Nisansala | Presentation, Documentation |
| 22ug1-0559 K.K.R.Shehara | Presentation, Documentation |
| 22ug1-0530 S.G.T.A.Anusarani | Presentation, Documentation |
| 22ug1-0487 P.M.V.M.Didulani | Presentation, Documentation |