28 results for “topic:tree-based-models”
No description provided.
CFXplorer generates optimal distance counterfactual explanations for a given machine learning model.
[NeurIPS 2022] (De-)Randomized Smoothing for Decision Stump Ensembles
Comprehensive benchmark study of feature selection techniques for predictive machine learning models on tabular data. Various feature selection methods are evaluated across different data characteristics and predictive scenarios.
This project uses EEG data to detect schizophrenia with an LGBM classifier that reaches a ROC AUC of 95.96% and an accuracy of 90%.
👨‍💻 This repository shows how machine learning and SHAP can be leveraged to understand the reasons for production downtime ⌛
German Credit Data - 1994
Detects anomalies using the Isolation Forest algorithm, with clear visual comparison between original data and anomaly-marked data in an unsupervised learning setup.
For this project, we will analyze publicly available data from LendingClub.com, which connects borrowers needing money with investors. The goal is to create a model that predicts the likelihood of borrowers repaying their loans. We will focus on Lending Club's data from 2007-2010 to classify and determine the repayment behavior pre-2016.
My team and I participated in the Amazon ML Challenge, a national-level machine learning competition where we tackled real-world data problems and built predictive models using advanced ML techniques.
Tabular classification project with Machine Learning models
Machine learning pipeline for multi-class treatment prediction in lung adenocarcinoma (LUAD) using patient-level molecular profiles, featuring ensemble-based model aggregation, benchmarking across diverse classifier architectures, and systematic performance evaluation.
Project started as a submission for the MSE course at Praxis Business School, then extended to implement machine learning models with hyperparameter tuning.
Machine Learning Project at Kampus Merdeka Program
Data Science portfolio
Notebooks that document my process of cleaning NHL data, engineering features, and training models to predict NHL playoff teams in the 2025-2026 season
Automated reasoning 🤖 for CoT prompting 💬 using explainability attributes from tree-based 🌳 models for binary classification on tabular datasets
Orbit Boost is a research-oriented gradient boosting library built from scratch in Python, designed as an experimental alternative to LightGBM, XGBoost, and CatBoost. It introduces oblique projections, BOSS sampling, Newton-style updates, and a ridge-based warm start for improved performance.
This project leverages ML to classify mental health risk signals (potential signs of depression) by analyzing structured profile metadata and unstructured textual data from social platforms, with a focus on user behavior, interactions, and content.
Tree-based models are appealing for price modeling due to their high performance, but they can be unstable. Due to competition between insurers, unstable models increase the risk that the overall premium is too small to cover the losses. The thesis proposes various strategies for improving the stability of tree-based models.
Tree-based and neural network regressors usually work better for regression tasks than linear regression models, because they can capture complex or subtle non-linear patterns in data.
Data Science Projects.
Project page for "Physics-informed graph neural networks accelerating microneedle simulations towards novelty of micro-nano scale materials discovery" as a part of Romrawin Chumpu's master's thesis and publication.
This course offers advanced knowledge of data mining, beginning with a thorough introduction to the field. It is a complete package for anyone who wants to pursue a career in data mining.
Streamlined toolkit for predicting human phenotypes from UK Biobank using tree-based ensembles and linear models. Load high-dimensional SNP and covariate data, select variants via Random Feature Selection to balance accuracy and runtime, and interpret genetic plus socio-demographic feature contributions with SHAP.
Built a churn prediction model using tree-based methods (Decision Tree, Random Forest). Conducted EDA, encoded categorical variables, and visualized feature relationships. Tuned models with GridSearchCV and evaluated them using classification metrics and feature importance.
No description provided.
🛡️ Detect and report fraudulent activities using advanced modeling techniques to enhance security and protect valuable assets.