28 results for “topic:knn-imputer”
A collection of heterogeneous distance functions handling missing values.
A repository for various Data Science projects I've worked on, both university-related and in my spare time.
Data fetched from wafers is passed through a machine learning pipeline to determine whether the wafer at hand is faulty, eliminating the need for, and thus the cost of, manual inspection.
This project focuses on predicting customer churn in an e-commerce setting using machine learning techniques.
Feature Engineering with Python
This repository is a collection of basic code templates for Data Preparation. All codes I am sharing are from the practical exercises I did from the Data Science Infinity Program.
📘 This repository predicts OLA driver churn using ensemble methods—Bagging (Random Forest) and Boosting (XGBoost)—with KNN imputation and SMOTE. It reveals city-wise churn trends and key performance drivers, powering smarter, data-backed retention strategies for the ride-hailing industry.
This project focuses on predicting whether a customer will default on their credit card payment in the upcoming month. Utilizing historical transaction data and customer demographics, the project employs various machine learning algorithms to distinguish between risky and non-risky customers for better credit risk management.
This repository focuses on feature engineering concepts in detail; I hope you'll find it helpful.
Data imputation is used when there are missing values in a dataset. It helps fill in these gaps with estimated values, enabling analysis and modeling. Imputation is crucial for maintaining dataset integrity and ensuring accurate insights from incomplete data.
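The imputation idea described above can be sketched with scikit-learn's `KNNImputer`, which fills each missing entry using the values of that feature in the nearest neighbouring rows. The small array below is made-up illustration data, not from any of the listed projects:

```python
import numpy as np
from sklearn.impute import KNNImputer

# Toy matrix with two missing entries (np.nan).
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, 4.0, 3.0],
    [np.nan, 6.0, 5.0],
    [8.0, 8.0, 7.0],
])

# Each missing value is replaced by the mean of that feature across the
# 2 nearest neighbours, with distances computed on the observed features.
imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)

assert not np.isnan(X_filled).any()
```

After `fit_transform`, the array has the same shape but no gaps, so downstream models that cannot handle NaNs can be trained on it directly.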
What Are the Challenges and Solutions of Missing Data in Electronic Health Records?
Modelling the relationship between a player’s first-time eligible arbitration salary and multiple variables.
Predicting employee burnout using machine learning algorithms: Random Forest and k-Nearest Neighbors.
My Capstone for the HarvardX Course "Introduction to Data Science with Python"
Machine learning models for enhanced fraud detection in e-commerce transactions, exploring feature engineering, distance prediction, and clustering analysis.
Kaggle UK Used Car challenge
We propose a method to fill NaN values using clustering.
Streamlit app developed for bank customer deposit prediction, using a fine-tuned XGBClassifier model.
The company develops efficiency solutions for heavy industry. The model should predict the amount of pure gold extracted from gold ore. You have the data on extraction and purification. The model will help optimize production and eliminate unprofitable parameters.
Built a model to assess the risk of extending credit to a borrower. Performed univariate and bivariate exploration using methods such as pair plots and heatmaps to detect outliers and to examine feature behaviour and correlations. Imputed missing values with KNN Imputer and applied SMOTE to address the class imbalance. Trained KNN, Decision Tree, Logistic Regression, and Random Forest models, achieving a best accuracy of 93%.
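A hedged sketch of the kind of workflow this blurb describes: KNN imputation feeding a Random Forest classifier inside a scikit-learn `Pipeline`. SMOTE lives in the separate imbalanced-learn package and is omitted here; the synthetic data from `make_classification` stands in for the real credit dataset, which is not available:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import KNNImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in data; knock out ~5% of entries to simulate missingness.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Impute inside the pipeline so the imputer is fit on training data only,
# avoiding leakage from the test split.
model = Pipeline([
    ("impute", KNNImputer(n_neighbors=5)),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
model.fit(X_tr, y_tr)
acc = model.score(X_te, y_te)
```

Putting the imputer in the pipeline (rather than imputing the whole dataset up front) keeps the train/test boundary honest during cross-validation.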
This Flask web app detects whether a wafer (sensor chip) is defective or not based on sensor readings.
[Kaggle Submission] Using XGBRegressor with SHAP, grid search, and Hyperopt to predict house prices.
Analysis of aviation accidents from 1962 to 2023.
Filling missing data points with the most common value among the nearest neighbors.
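For a categorical feature, "most common value among nearest neighbors" amounts to a mode vote over the k closest rows. A minimal library-free sketch, with made-up data and a hypothetical helper `impute_mode_knn`:

```python
import numpy as np
from collections import Counter

# Fully observed numeric columns, used to measure distance between rows.
numeric = np.array([
    [0.0, 1.0],
    [0.1, 1.1],
    [5.0, 5.0],
    [0.2, 0.9],
])
# Categorical column with one missing entry, marked None.
labels = ["red", "red", "blue", None]

def impute_mode_knn(numeric, labels, i, k=2):
    # Rank rows with an observed label by Euclidean distance to row i,
    # then return the most common label among the k nearest.
    observed = [j for j, v in enumerate(labels) if v is not None]
    dists = sorted((np.linalg.norm(numeric[i] - numeric[j]), j) for j in observed)
    neighbours = [labels[j] for _, j in dists[:k]]
    return Counter(neighbours).most_common(1)[0][0]

labels[3] = impute_mode_knn(numeric, labels, 3)
# Row 3 sits next to rows 0 and 1, which both vote "red".
```

The same vote generalises to numeric targets by replacing the mode with a mean, which is what `KNNImputer`-style tools do.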