Distributed_ML_Sagemaker_Pipelines

An end-to-end machine learning pipeline built on AWS SageMaker Pipelines, designed to support parallel model development and batch scoring on distributed, containerized infrastructure.

Overview
Architecture
Pipeline Stages
How to Run
Key Takeaways

Overview

This project demonstrates the use of SageMaker Pipelines to operationalize a machine learning workflow that includes:

Feature engineering
Model training with XGBoost
Model evaluation against a mean squared error (MSE) threshold
Conditional model registration
Offline batch scoring using SageMaker Batch Transform

This setup suits MLOps teams that want to streamline experimentation, keep deployment workflows consistent, and scale processing across compute instances.
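The evaluation and conditional-registration steps listed above come down to computing a mean squared error and comparing it with a threshold. The following is a minimal sketch in plain Python, independent of the SageMaker SDK. The function names and the threshold value of 6.0 are illustrative assumptions, not values taken from this repository.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared differences between labels and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute MSE of empty inputs")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def should_register(mse: float, threshold: float) -> bool:
    """Gate for registration: True only when MSE is strictly below the threshold."""
    return mse < threshold


# Hypothetical validation labels and model predictions.
labels = [3.0, 5.0, 7.0]
predictions = [2.0, 5.0, 9.0]

mse = mean_squared_error(labels, predictions)  # (1 + 0 + 4) / 3
print(f"mse={mse:.3f} register={should_register(mse, threshold=6.0)}")
```

In a SageMaker pipeline, a gate like this is commonly expressed as a `ConditionStep` that reads the metric from the evaluation report. Whether this repository does it that way is not stated here.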

Architecture

Parameters:

⚙️ Pipeline Stages

| Stage | Description |
| --- | --- |
| `Processing` | Executes `preprocessing.py` to clean and split data |
| `Training` | Trains XGBoost model on training set |
| `Evaluation` | Evaluates model against validation set using MSE |
| `Register Model` | Saves model if MSE < threshold |
| `Batch Transform` | Scores batch data using newly trained model |
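The Processing stage is described as cleaning and splitting the data. The sketch below shows one way such a step could work, using only the standard library. The contents of `preprocessing.py` are not shown in this README, so the cleaning rule (drop rows with missing values), the 80/20 split, and the function names are all assumptions.

```python
import random
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[float]]


def drop_incomplete(rows: List[Row]) -> List[Row]:
    """Keep only rows in which every field has a value."""
    return [row for row in rows if all(v is not None for v in row.values())]


def train_validation_split(
    rows: List[Row], validation_fraction: float = 0.2, seed: int = 42
) -> Tuple[List[Row], List[Row]]:
    """Shuffle deterministically, then hold out the last rows for validation."""
    shuffled = rows[:]  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    cut = len(shuffled) - n_validation
    return shuffled[:cut], shuffled[cut:]


# Hypothetical raw data: ten complete rows plus one with a missing feature.
raw: List[Row] = [{"feature": float(i), "target": float(2 * i)} for i in range(10)]
raw.append({"feature": None, "target": 1.0})

clean = drop_incomplete(raw)
train, validation = train_validation_split(clean)
print(len(clean), len(train), len(validation))
```

Fixing the seed keeps the split reproducible across pipeline runs, which makes MSE values from different executions comparable.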

Key Takeaways

Building on what we learned from this experiment, we implemented parallel model development and scoring pipelines for four models, covering both the Purchase and Refinance scenarios in production.
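One way to picture running pipelines in parallel for several models is a small launcher that submits one run per scenario and model combination. The sketch below uses a thread pool and a stub in place of the real submission call. The model names, the two-by-two split across scenarios, and the `launch_pipeline` function are placeholders; in practice the stub would start a SageMaker pipeline execution.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import List, Tuple

SCENARIOS = ["Purchase", "Refinance"]
# Placeholder names: the README does not say which four models were built.
MODEL_VARIANTS = ["model_a", "model_b"]


def launch_pipeline(job: Tuple[str, str]) -> str:
    """Stub for one pipeline run; a real version would start an execution."""
    scenario, variant = job
    return f"{scenario}/{variant}: submitted"


def launch_all(jobs: List[Tuple[str, str]], max_workers: int = 4) -> List[str]:
    """Submit every job concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(launch_pipeline, jobs))


jobs = list(product(SCENARIOS, MODEL_VARIANTS))  # 2 scenarios x 2 variants = 4 jobs
for status in launch_all(jobs):
    print(status)
```

Threads are enough here because each job would only wait on a remote service; the heavy computation runs on the SageMaker instances, not in the launcher.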

▶️ How to Run

1. Clone the repo: `git clone https://github.com/krishnamami/Distributed_ML_Sagemaker_Pipelines.git`
2. Install the dependencies: `pip install -r requirements.txt`
3. Run the pipeline script: `python sage_maker_pipeline.py`

Related Projects

Fine_Tuning_LLM

Markov_Chain_Attribution

Multi Agent Anamoly Detection

Author
Krishna Goud

Head of Data Engineering & MLOps | Rocket LA LinkedIn
