PY-MLXL: Generalized Machine Learning Pipeline with Dash UI

Overview

PY-MLXL is a Machine Learning pipeline tool that wraps a Python backend in a Dash Plotly user interface. It covers the end-to-end ML workflow, from data ingestion and exploration through transformation, feature selection, and tuning to training and saving the final model.

Key Features

Exploratory Data Analysis (EDA): Interactive visualization of numerical and categorical features.
Data Transformation: Automated cleaning, imputation, scaling, and transformation.
Feature Selection: Multiple methods including ANOVA, Mutual Information, and Recursive Feature Elimination.
Baseline Modeling: Quick comparison of multiple algorithms (Random Forest, XGBoost, LightGBM, SVM, etc.).
Hyperparameter Tuning: Automated tuning using Optuna.
Final Model Training: Train and save production-ready models.
Result Visualization: ROC Curves, Confusion Matrices, and Feature Importance plots.
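
To make the transformation, feature selection, and baseline steps concrete, here is a minimal scikit-learn sketch of the same idea. It is an illustration only, not PY-MLXL's actual code: the column names, the synthetic data, the choice of `k=3`, and the two models compared are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; the column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "viscosity": rng.normal(size=n),
    "temperature": rng.normal(size=n),
    "batch": rng.choice(["A", "B", "C"], size=n),
})
df.loc[::25, "viscosity"] = np.nan  # inject a few missing values
y = (df["temperature"] + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Data transformation: impute and scale numeric columns, one-hot encode categoricals.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), ["viscosity", "temperature"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["batch"]),
])

# Baseline modeling: compare candidate algorithms with cross-validated ROC AUC.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
scores = {}
for name, model in models.items():
    pipe = Pipeline([
        ("preprocess", preprocess),
        ("select", SelectKBest(f_classif, k=3)),  # ANOVA F-test feature selection
        ("model", model),
    ])
    scores[name] = cross_val_score(pipe, df, y, cv=5, scoring="roc_auc").mean()

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean ROC AUC = {score:.3f}")
```

Keeping the preprocessing and selection steps inside the `Pipeline` means they are refit on each cross-validation fold, so the baseline scores are not inflated by information leaking from the held-out data.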

Installation

Clone the repository.
Install dependencies:
```
pip install -r requirements-dev.txt
```
Run the application:
```
python app.py
```

Usage Flow

Upload Data: Navigate to the EDA section and upload your raw CSV data (e.g., data/raw/paint_quality_assurance_data.csv).
Transform Data: Go to Classification -> Data Transform. Select the target variable and the transformation steps to apply.
Select Features: Run Feature Selection to identify top predictors.
Baseline: Run Baseline Modeling to compare algorithms and identify the best-performing ones.
Tune: Optimize the best model using the Hyperparameter Tuning module. You can either automatically tune the best performers from the Baseline step or manually select specific models and feature sets.
Train: Train the final model on the full dataset.
Visualize: Analyze the final model's performance in the Visualization tab.

Technologies

Frontend: Dash, Plotly, Dash Bootstrap Components
Backend: Scikit-Learn, Imbalanced-Learn, Optuna, XGBoost, LightGBM
Data Handling: Pandas, NumPy
