Framingham_CVD_Risk

Healthcare Classification Problem

Index:

Background
Problem Statement
Data Preprocessing

Preliminary Analysis

Data Distribution and Outliers

Categorical Variables

Numerical Variables

Missing Values and Imputation
Correlation Analysis
Normality Check
Undersampling Data
Transformation Pipeline
Modeling

KNN Classifier

Logistic Regression

Decision Tree Classifier

Random Forest Classifier

Bernoulli Naive Bayes Classifier

Bagging Classifiers

Performance Evaluation
Results
Challenges and Limitations
Future Scope

About Project:

Identifying people at risk of heart disease and making sure they receive proper treatment can prevent deaths from cardiovascular disease (CVD).
Risk stratification with machine learning methods can identify people at risk of CVD and serve as a preventive, prognostic, and management tool for the population.

Framingham Heart Study (FHS)

The Framingham Heart Study is a long-term prospective study of the etiology of cardiovascular disease among a population of free-living subjects in the community of Framingham, Massachusetts, in the US. The data collected can be studied to identify risk factors and their joint effects.
The given dataset is a subset of the longitudinal data collected as part of the FHS. It includes laboratory, clinic, questionnaire, and adjudicated event data on 4,434 participants, for whom 10-year coronary heart disease risk was recorded over years of surveillance.
Original data source: available on request at https://biolincc.nhlbi.nih.gov/teaching/

Objective of the study:

The goal of the analysis is to predict whether a participant has a 10-year risk of developing coronary heart disease (CHD), based on current data on that participant's risk factors.

Questions to ask:

Which risk factors does the dataset contain?
How strongly is each risk factor correlated with our target variable?
How is our data distributed based on demographic data (sex, age, education level)?
How is the behavioural data represented in our data?
Does our target variable have balanced representation in our dataset?
How applicable is the data in view of population demographics?
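
Two of the questions above (target balance and correlation with the target) can be checked with a few lines of code. The following is a minimal sketch using only the Python standard library. The records and the column names (`age`, `smoker`, `sys_bp`, `chd_10yr`) are made up for illustration and are not the dataset's actual fields or values.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy records: names and values are illustrative only,
# not taken from the Framingham dataset.
records = [
    {"age": 39, "smoker": 0, "sys_bp": 106.0, "chd_10yr": 0},
    {"age": 46, "smoker": 0, "sys_bp": 121.0, "chd_10yr": 0},
    {"age": 48, "smoker": 1, "sys_bp": 127.5, "chd_10yr": 0},
    {"age": 61, "smoker": 1, "sys_bp": 150.0, "chd_10yr": 1},
    {"age": 43, "smoker": 0, "sys_bp": 180.0, "chd_10yr": 0},
    {"age": 63, "smoker": 0, "sys_bp": 138.0, "chd_10yr": 1},
]
TARGET = "chd_10yr"  # assumed name for the binary 10-year CHD label


def class_balance(rows, target):
    """Return the share of rows falling in each target class."""
    counts = Counter(row[target] for row in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlations_with_target(rows, target):
    """Correlate every non-target column with the target column."""
    ys = [row[target] for row in rows]
    features = [name for name in rows[0] if name != target]
    return {name: pearson([row[name] for row in rows], ys) for name in features}


balance = class_balance(records, TARGET)
corr = correlations_with_target(records, TARGET)

print("class shares:", balance)
# List features from strongest to weakest absolute correlation.
for name, r in sorted(corr.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {r:+.2f}")
```

A strongly skewed class share is what would motivate the undersampling step listed in the index. With pandas, the same checks are typically done with `Series.value_counts(normalize=True)` and `DataFrame.corr()`.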

Acknowledgement: Biologic Specimen and Data Repository Information Coordinating Center (BioLINCC), the National Heart, Lung, and Blood Institute (NHLBI), NIH, for providing the data on request.

sayaliba01/Framingham_CVD_Risk
